* [Intel-xe] [PATCH v5 0/3] drm/xe/pmu: Enable PMU interface
@ 2023-08-30  5:15 Aravind Iddamsetty
From: Aravind Iddamsetty @ 2023-08-30  5:15 UTC
  To: intel-xe

The HW provides a set of engine group busyness counters which are a
perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729, 52071, 71028

Events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
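
The interval output above can be turned into a busy percentage per
interval. The following is a minimal sketch, not part of the patch
series: the function name busy_percent and its start parameter are
hypothetical, and it assumes the counts are busy time in ns as the unit
column indicates.

```python
import re

# One interval line of `perf stat -I` output as shown above:
#   <timestamp in s> <count> ns <event name>
LINE_RE = re.compile(r"^\s*(\d+\.\d+)\s+(\d+)\s+ns\s+(\S+)\s*$")


def busy_percent(lines, start=0.0):
    """Return (timestamp, busy_ns, percent) for each interval line.

    `start` is the timestamp that precedes the first line; it is 0.0
    when the lines begin at the start of the perf run.
    """
    rows = []
    prev_ts = start
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip the header and anything unrecognised
        ts = float(m.group(1))
        busy_ns = int(m.group(2))
        interval_ns = (ts - prev_ts) * 1e9
        pct = 100.0 * busy_ns / interval_ns if interval_ns > 0 else 0.0
        rows.append((ts, busy_ns, pct))
        prev_ts = ts
    return rows


# Three lines copied from the output above; the preceding timestamp
# (4.007076497) is passed as `start`.
sample = """\
           time             counts unit events
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
""".splitlines()

rows = busy_percent(sample, start=4.007076497)
for ts, busy_ns, pct in rows:
    print(f"{ts:12.6f} s  {busy_ns:>8d} ns  {pct:.4f}% busy")
```

In the sample, 43520 ns of busy time in an interval of roughly one second
works out to about 0.004% render group busyness.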

The PMU base implementation is taken from i915.

v5:
1. Use spinlock in forcewake instead of mutex
2. take forcewake when accessing the OAG registers

v4: minor nits.

v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.

v2:
Store the last known value while the device is awake, return it while the
GT is suspended, and update the driver copy when read while awake.

Aravind Iddamsetty (3):
  drm/xe: Get GT clock to nanosecs
  drm/xe: Use spinlock in forcewake instead of mutex
  drm/xe/pmu: Enable PMU interface

 drivers/gpu/drm/xe/Makefile              |   2 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h     |   5 +
 drivers/gpu/drm/xe/xe_device.c           |   2 +
 drivers/gpu/drm/xe/xe_device_types.h     |   4 +
 drivers/gpu/drm/xe/xe_force_wake.c       |  14 +-
 drivers/gpu/drm/xe/xe_force_wake_types.h |   2 +-
 drivers/gpu/drm/xe/xe_gt.c               |   2 +
 drivers/gpu/drm/xe/xe_gt_clock.c         |   5 +
 drivers/gpu/drm/xe/xe_gt_clock.h         |   4 +-
 drivers/gpu/drm/xe/xe_irq.c              |  18 +
 drivers/gpu/drm/xe/xe_module.c           |   5 +
 drivers/gpu/drm/xe/xe_pmu.c              | 679 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pmu.h              |  25 +
 drivers/gpu/drm/xe/xe_pmu_types.h        |  76 +++
 include/uapi/drm/xe_drm.h                |  16 +
 15 files changed, 850 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
 create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h

-- 
2.25.1


* [Intel-xe] [PATCH v6 0/3] drm/xe/pmu: Enable PMU interface
@ 2023-09-01  7:06 Aravind Iddamsetty
From: Aravind Iddamsetty @ 2023-09-01  7:06 UTC
  To: intel-xe; +Cc: rodrigo.vivi

The HW provides a set of engine group busyness counters which are a
perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729, 52071, 71028

Events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
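
The PMU names in the listings above look like the PCI address of the
device with ':' replaced by '_' and an "xe_" prefix, and the per-GT
events carry a "-gt<N>" suffix. That is inferred from the sample output
only, not stated in this letter. Under that assumption, a small
hypothetical helper (xe_event is not part of the series) can build the
string to pass to perf stat -e:

```python
def xe_event(pci_addr, event, gt=None):
    """Build a perf event string such as
    xe_0000_8c_00.0/render-group-busy-gt0/ from a PCI address.

    Assumes the naming pattern seen in the `perf list` output:
    "xe_" + PCI address with ':' replaced by '_', and an optional
    "-gt<N>" suffix on per-GT events.
    """
    pmu = "xe_" + pci_addr.replace(":", "_")
    name = event if gt is None else f"{event}-gt{gt}"
    return f"{pmu}/{name}/"


render = xe_event("0000:8c:00.0", "render-group-busy", gt=0)
irqs = xe_event("0000:03:00.0", "interrupts")
print(render)
print(irqs)
```

The resulting strings match the event names used in the perf stat and
perf list examples above.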

The PMU base implementation is taken from i915.

v6:
1. drop engine_busyness_sample_type
2. update UAPI documentation

v5:
1. Use spinlock in forcewake instead of mutex
2. take forcewake when accessing the OAG registers

v4: minor nits.

v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.

v2:
Store the last known value while the device is awake, return it while the
GT is suspended, and update the driver copy when read while awake.

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>

Aravind Iddamsetty (3):
  drm/xe: Get GT clock to nanosecs
  drm/xe: Use spinlock in forcewake instead of mutex
  drm/xe/pmu: Enable PMU interface

 drivers/gpu/drm/xe/Makefile              |   2 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h     |   5 +
 drivers/gpu/drm/xe/xe_device.c           |   2 +
 drivers/gpu/drm/xe/xe_device_types.h     |   4 +
 drivers/gpu/drm/xe/xe_force_wake.c       |  14 +-
 drivers/gpu/drm/xe/xe_force_wake_types.h |   2 +-
 drivers/gpu/drm/xe/xe_gt.c               |   2 +
 drivers/gpu/drm/xe/xe_gt_clock.c         |   5 +
 drivers/gpu/drm/xe/xe_gt_clock.h         |   4 +-
 drivers/gpu/drm/xe/xe_irq.c              |  18 +
 drivers/gpu/drm/xe/xe_module.c           |   5 +
 drivers/gpu/drm/xe/xe_pmu.c              | 661 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pmu.h              |  25 +
 drivers/gpu/drm/xe/xe_pmu_types.h        |  76 +++
 include/uapi/drm/xe_drm.h                |  39 ++
 15 files changed, 855 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
 create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h

-- 
2.25.1


* [Intel-xe] [PATCH v7 0/3] drm/xe/pmu: Enable PMU interface
@ 2023-09-14  6:13 Aravind Iddamsetty
From: Aravind Iddamsetty @ 2023-09-14  6:13 UTC
  To: intel-xe; +Cc: rodrigo.vivi

The HW provides a set of engine group busyness counters which are a
perfect fit to be exposed via PMU perf events.

BSPEC: 46559, 46560, 46722, 46729, 52071, 71028

Events can be listed using:
perf list
  xe_0000_03_00.0/any-engine-group-busy-gt0/         [Kernel PMU event]
  xe_0000_03_00.0/copy-group-busy-gt0/               [Kernel PMU event]
  xe_0000_03_00.0/interrupts/                        [Kernel PMU event]
  xe_0000_03_00.0/media-group-busy-gt0/              [Kernel PMU event]
  xe_0000_03_00.0/render-group-busy-gt0/             [Kernel PMU event]

and can be read using:

perf stat -e "xe_0000_8c_00.0/render-group-busy-gt0/" -I 1000
           time             counts unit events
     1.001139062                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     2.003294678                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     3.005199582                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     4.007076497                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     5.008553068                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     6.010531563              43520 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     7.012468029              44800 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     8.013463515                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
     9.015300183                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.017233010                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/
    10.971934120                  0 ns  xe_0000_8c_00.0/render-group-busy-gt0/

The PMU base implementation is taken from i915.

v7:
1. update UAPI documentation
2. drop MEDIA_GT specific change for media busyness counter.

v6:
1. drop engine_busyness_sample_type
2. update UAPI documentation

v5:
1. Use spinlock in forcewake instead of mutex
2. take forcewake when accessing the OAG registers

v4: minor nits.

v3:
1. drop init_samples, as storing counters before going to suspend should
be sufficient.
2. ported the "drm/i915/pmu: Make PMU sample array two-dimensional" and
dropped helpers to store and read samples.
3. use xe_device_mem_access_get_if_ongoing to check if device is active
before reading the OA registers.
4. dropped format attr as no longer needed
5. introduce xe_pmu_suspend to call engine_group_busyness_store
6. few other nits.

v2:
Store the last known value while the device is awake, return it while the
GT is suspended, and update the driver copy when read while awake.
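
The store/return behaviour described in the v2 and v3 notes can be
modelled in a few lines. This is only an illustrative Python model, not
the driver's C code: the class BusynessCounter, its methods, and the
read_hw callback are hypothetical stand-ins for the register read and
the suspend hook.

```python
class BusynessCounter:
    """Toy model: keep a driver copy of a HW counter across suspend."""

    def __init__(self, read_hw):
        self._read_hw = read_hw  # callable returning the HW counter value
        self._stored = 0         # driver copy of the last known value

    def store(self):
        # Called just before suspend: remember the last known value.
        self._stored = self._read_hw()

    def read(self, awake):
        # Awake: read the HW and refresh the driver copy.
        # Suspended: do not touch the HW, return the stored value.
        if awake:
            self._stored = self._read_hw()
        return self._stored


hw = {"value": 100}
counter = BusynessCounter(lambda: hw["value"])

first = counter.read(awake=True)          # device awake, HW is read
hw["value"] = 250
counter.store()                           # about to suspend
hw["value"] = 0                           # HW state not readable/meaningful
while_suspended = counter.read(awake=False)
hw["value"] = 300
after_resume = counter.read(awake=True)   # awake again, copy refreshed
print(first, while_suspended, after_resume)
```

The point of the model is that a read during suspend never wakes the
device; it reports the value captured at suspend time.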

Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>

Aravind Iddamsetty (3):
  drm/xe: Get GT clock to nanosecs
  drm/xe: Use spinlock in forcewake instead of mutex
  drm/xe/pmu: Enable PMU interface

 drivers/gpu/drm/xe/Makefile              |   2 +
 drivers/gpu/drm/xe/regs/xe_gt_regs.h     |   5 +
 drivers/gpu/drm/xe/xe_device.c           |   2 +
 drivers/gpu/drm/xe/xe_device_types.h     |   4 +
 drivers/gpu/drm/xe/xe_force_wake.c       |  14 +-
 drivers/gpu/drm/xe/xe_force_wake_types.h |   2 +-
 drivers/gpu/drm/xe/xe_gt.c               |   2 +
 drivers/gpu/drm/xe/xe_gt_clock.c         |   5 +
 drivers/gpu/drm/xe/xe_gt_clock.h         |   4 +-
 drivers/gpu/drm/xe/xe_irq.c              |  18 +
 drivers/gpu/drm/xe/xe_module.c           |   5 +
 drivers/gpu/drm/xe/xe_pmu.c              | 654 +++++++++++++++++++++++
 drivers/gpu/drm/xe/xe_pmu.h              |  25 +
 drivers/gpu/drm/xe/xe_pmu_types.h        |  76 +++
 include/uapi/drm/xe_drm.h                |  38 ++
 15 files changed, 847 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.c
 create mode 100644 drivers/gpu/drm/xe/xe_pmu.h
 create mode 100644 drivers/gpu/drm/xe/xe_pmu_types.h

-- 
2.25.1



Thread overview: 29+ messages
2023-08-30  5:15 [Intel-xe] [PATCH v5 0/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-08-30  5:10 ` [Intel-xe] ✓ CI.Patch_applied: success for drm/xe/pmu: Enable PMU interface (rev5) Patchwork
2023-08-30  5:10 ` [Intel-xe] ✗ CI.checkpatch: warning " Patchwork
2023-08-30  5:12 ` [Intel-xe] ✓ CI.KUnit: success " Patchwork
2023-08-30  5:15 ` [Intel-xe] [PATCH v3 1/3] drm/xe: Get GT clock to nanosecs Aravind Iddamsetty
2023-08-30  5:15 ` [Intel-xe] [PATCH 2/3] drm/xe: Use spinlock in forcewake instead of mutex Aravind Iddamsetty
2023-08-30  5:33   ` Dixit, Ashutosh
2023-08-30 20:56     ` Rodrigo Vivi
2023-08-30 22:19       ` Dixit, Ashutosh
2023-08-31  4:13         ` Aravind Iddamsetty
2023-08-30  5:15 ` [Intel-xe] [PATCH v5 3/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-08-30 20:58   ` Rodrigo Vivi
2023-08-31 20:45     ` Dixit, Ashutosh
2023-08-31 22:14       ` Aravind Iddamsetty
2023-08-31  4:48   ` Dixit, Ashutosh
2023-08-31 10:29     ` Aravind Iddamsetty
2023-08-31 16:58       ` Dixit, Ashutosh
2023-08-31 22:11         ` Aravind Iddamsetty
2023-08-31 22:21           ` Belgaumkar, Vinay
2023-08-31 23:11             ` Aravind Iddamsetty
2023-08-31 23:22               ` Belgaumkar, Vinay
2023-08-31 23:16                 ` Dixit, Ashutosh
2023-08-31 23:57                   ` Belgaumkar, Vinay
2023-08-31 23:58                     ` Dixit, Ashutosh
2023-09-01  3:34                       ` Aravind Iddamsetty
2023-08-30  5:19 ` [Intel-xe] ✓ CI.Build: success for drm/xe/pmu: Enable PMU interface (rev5) Patchwork
2023-08-30  5:19 ` [Intel-xe] ✗ CI.Hooks: failure " Patchwork
  -- strict thread matches above, loose matches on Subject: below --
2023-09-01  7:06 [Intel-xe] [PATCH v6 0/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-01  7:06 ` [Intel-xe] [PATCH 2/3] drm/xe: Use spinlock in forcewake instead of mutex Aravind Iddamsetty
2023-09-14  6:13 [Intel-xe] [PATCH v7 0/3] drm/xe/pmu: Enable PMU interface Aravind Iddamsetty
2023-09-14  6:13 ` [Intel-xe] [PATCH 2/3] drm/xe: Use spinlock in forcewake instead of mutex Aravind Iddamsetty
