Linux ARM-MSM sub-architecture
 help / color / mirror / Atom feed
* [PATCH 0/4] drm/msm: Clear perf counters across context switch
@ 2022-03-03 19:46 Rob Clark
  2022-03-03 19:46 ` [PATCH 2/4] drm/msm: Add SET_PARAM ioctl Rob Clark
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Rob Clark @ 2022-03-03 19:46 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Abhinav Kumar, Bjorn Andersson, Dan Carpenter,
	Dmitry Baryshkov, Emma Anholt, Jonathan Marek, open list,
	Stephen Boyd, Viresh Kumar, Vladimir Lypak, Yangtao Li

From: Rob Clark <robdclark@chromium.org>

Some clever folks figured out a way to use performance counters as a
side-channel[1].  But, other than the special case of using the perf
counters for system profiling, we can reset the counters across context
switches to protect against this.

This series introduces a SYSPROF param which a sufficiently privilaged
userspace (like Mesa's pps-producer, which already must run as root) to
opt-out, and makes the default behavior to reset counters on context
switches.

[1] https://dl.acm.org/doi/pdf/10.1145/3503222.3507757

Rob Clark (4):
  drm/msm: Update generated headers
  drm/msm: Add SET_PARAM ioctl
  drm/msm: Add SYSPROF param
  drm/msm/a6xx: Zap counters across context switch

 drivers/gpu/drm/msm/adreno/a2xx.xml.h         |  26 +-
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c         |   1 +
 drivers/gpu/drm/msm/adreno/a3xx.xml.h         |  30 +-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c         |   1 +
 drivers/gpu/drm/msm/adreno/a4xx.xml.h         | 112 ++-
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c         |   1 +
 drivers/gpu/drm/msm/adreno/a5xx.xml.h         |  63 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c         |   1 +
 drivers/gpu/drm/msm/adreno/a6xx.xml.h         | 674 +++++++++++-------
 drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h     |  26 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c         |  30 +
 .../gpu/drm/msm/adreno/adreno_common.xml.h    |  31 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       |  14 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.h       |   2 +
 drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h   |  46 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4.xml.h      |  37 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h      |  37 +-
 drivers/gpu/drm/msm/disp/mdp_common.xml.h     |  37 +-
 drivers/gpu/drm/msm/dsi/dsi.xml.h             |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_10nm.xml.h    |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_14nm.xml.h    |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_20nm.xml.h    |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_28nm.xml.h    |  37 +-
 .../gpu/drm/msm/dsi/dsi_phy_28nm_8960.xml.h   |  37 +-
 drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h     | 480 -------------
 drivers/gpu/drm/msm/dsi/dsi_phy_7nm.xml.h     |  43 +-
 drivers/gpu/drm/msm/dsi/mmss_cc.xml.h         |  37 +-
 drivers/gpu/drm/msm/dsi/sfpb.xml.h            |  37 +-
 drivers/gpu/drm/msm/hdmi/hdmi.xml.h           |  37 +-
 drivers/gpu/drm/msm/hdmi/qfprom.xml.h         |  37 +-
 drivers/gpu/drm/msm/msm_drv.c                 |  28 +
 drivers/gpu/drm/msm/msm_gpu.h                 |  29 +
 drivers/gpu/drm/msm/msm_submitqueue.c         |  34 +
 include/uapi/drm/msm_drm.h                    |  28 +-
 34 files changed, 1051 insertions(+), 1130 deletions(-)
 delete mode 100644 drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h

-- 
2.35.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/4] drm/msm: Add SET_PARAM ioctl
  2022-03-03 19:46 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
@ 2022-03-03 19:46 ` Rob Clark
  2022-03-03 19:46 ` [PATCH 3/4] drm/msm: Add SYSPROF param Rob Clark
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 11+ messages in thread
From: Rob Clark @ 2022-03-03 19:46 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Dan Carpenter, Vladimir Lypak, Dmitry Baryshkov,
	Jonathan Marek, Yangtao Li, Emma Anholt, Stephen Boyd,
	Bjorn Andersson, open list

From: Rob Clark <robdclark@chromium.org>

It was always expected to have a use for this some day, so we left a
placeholder.  Now we do.  (And I expect another use in the not too
distant future when we start allowing userspace to allocate GPU iova.)

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c   |  1 +
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 10 +++++++++
 drivers/gpu/drm/msm/adreno/adreno_gpu.h |  2 ++
 drivers/gpu/drm/msm/msm_drv.c           | 20 ++++++++++++++++++
 drivers/gpu/drm/msm/msm_gpu.h           |  2 ++
 include/uapi/drm/msm_drm.h              | 27 ++++++++++++++-----------
 10 files changed, 54 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
index 22e8295a5e2b..6c9a747eb4ad 100644
--- a/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a2xx_gpu.c
@@ -471,6 +471,7 @@ static u32 a2xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
 		.hw_init = a2xx_hw_init,
 		.pm_suspend = msm_gpu_pm_suspend,
 		.pm_resume = msm_gpu_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 2e481e2692ba..0ab0e1dd8bbb 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -486,6 +486,7 @@ static u32 a3xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
 		.hw_init = a3xx_hw_init,
 		.pm_suspend = msm_gpu_pm_suspend,
 		.pm_resume = msm_gpu_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index c5524d6e8705..0c6b2a6d0b4c 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -621,6 +621,7 @@ static u32 a4xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
 		.hw_init = a4xx_hw_init,
 		.pm_suspend = a4xx_pm_suspend,
 		.pm_resume = a4xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index 3d28fcf841a6..407f50a15faa 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -1700,6 +1700,7 @@ static uint32_t a5xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
 		.hw_init = a5xx_hw_init,
 		.pm_suspend = a5xx_pm_suspend,
 		.pm_resume = a5xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 7d23c741db4a..237c2e7a7baa 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1800,6 +1800,7 @@ static int a6xx_set_supported_hw(struct device *dev, struct adreno_rev rev)
 static const struct adreno_gpu_funcs funcs = {
 	.base = {
 		.get_param = adreno_get_param,
+		.set_param = adreno_set_param,
 		.hw_init = a6xx_hw_init,
 		.pm_suspend = a6xx_pm_suspend,
 		.pm_resume = a6xx_pm_resume,
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 15c8997b7251..6a37d409653b 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -283,6 +283,16 @@ int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 	}
 }
 
+int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+		     uint32_t param, uint64_t value)
+{
+	switch (param) {
+	default:
+		DBG("%s: invalid param: %u", gpu->name, param);
+		return -EINVAL;
+	}
+}
+
 const struct firmware *
 adreno_request_fw(struct adreno_gpu *adreno_gpu, const char *fwname)
 {
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
index b1ee453d627d..0490c5fbb780 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
@@ -282,6 +282,8 @@ static inline int adreno_is_a650_family(struct adreno_gpu *gpu)
 
 int adreno_get_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 		     uint32_t param, uint64_t *value);
+int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
+		     uint32_t param, uint64_t value);
 const struct firmware *adreno_request_fw(struct adreno_gpu *adreno_gpu,
 		const char *fwname);
 struct drm_gem_object *adreno_fw_create_bo(struct msm_gpu *gpu,
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index fdf401e6f09e..ca9a8a866292 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -614,6 +614,25 @@ static int msm_ioctl_get_param(struct drm_device *dev, void *data,
 				     args->param, &args->value);
 }
 
+static int msm_ioctl_set_param(struct drm_device *dev, void *data,
+		struct drm_file *file)
+{
+	struct msm_drm_private *priv = dev->dev_private;
+	struct drm_msm_param *args = data;
+	struct msm_gpu *gpu;
+
+	if (args->pipe != MSM_PIPE_3D0)
+		return -EINVAL;
+
+	gpu = priv->gpu;
+
+	if (!gpu)
+		return -ENXIO;
+
+	return gpu->funcs->set_param(gpu, file->driver_priv,
+				     args->param, args->value);
+}
+
 static int msm_ioctl_gem_new(struct drm_device *dev, void *data,
 		struct drm_file *file)
 {
@@ -899,6 +918,7 @@ static int msm_ioctl_submitqueue_close(struct drm_device *dev, void *data,
 
 static const struct drm_ioctl_desc msm_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(MSM_GET_PARAM,    msm_ioctl_get_param,    DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(MSM_SET_PARAM,    msm_ioctl_set_param,    DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(MSM_GEM_NEW,      msm_ioctl_gem_new,      DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(MSM_GEM_INFO,     msm_ioctl_gem_info,     DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(MSM_GEM_CPU_PREP, msm_ioctl_gem_cpu_prep, DRM_RENDER_ALLOW),
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index ad4fe05dee03..fde9a29f884e 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -45,6 +45,8 @@ struct msm_gpu_config {
 struct msm_gpu_funcs {
 	int (*get_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
 			 uint32_t param, uint64_t *value);
+	int (*set_param)(struct msm_gpu *gpu, struct msm_file_private *ctx,
+			 uint32_t param, uint64_t value);
 	int (*hw_init)(struct msm_gpu *gpu);
 	int (*pm_suspend)(struct msm_gpu *gpu);
 	int (*pm_resume)(struct msm_gpu *gpu);
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index 6cd45a7f6947..a1bb2a17a8b9 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -67,16 +67,20 @@ struct drm_msm_timespec {
 	__s64 tv_nsec;         /* nanoseconds */
 };
 
-#define MSM_PARAM_GPU_ID     0x01
-#define MSM_PARAM_GMEM_SIZE  0x02
-#define MSM_PARAM_CHIP_ID    0x03
-#define MSM_PARAM_MAX_FREQ   0x04
-#define MSM_PARAM_TIMESTAMP  0x05
-#define MSM_PARAM_GMEM_BASE  0x06
-#define MSM_PARAM_PRIORITIES 0x07  /* The # of priority levels */
-#define MSM_PARAM_PP_PGTABLE 0x08  /* => 1 for per-process pagetables, else 0 */
-#define MSM_PARAM_FAULTS     0x09
-#define MSM_PARAM_SUSPENDS   0x0a
+/* Below "RO" indicates a read-only param, "WO" indicates write-only, and
+ * "RW" indicates a param that can be both read (GET_PARAM) and written
+ * (SET_PARAM)
+ */
+#define MSM_PARAM_GPU_ID     0x01  /* RO */
+#define MSM_PARAM_GMEM_SIZE  0x02  /* RO */
+#define MSM_PARAM_CHIP_ID    0x03  /* RO */
+#define MSM_PARAM_MAX_FREQ   0x04  /* RO */
+#define MSM_PARAM_TIMESTAMP  0x05  /* RO */
+#define MSM_PARAM_GMEM_BASE  0x06  /* RO */
+#define MSM_PARAM_PRIORITIES 0x07  /* RO: The # of priority levels */
+#define MSM_PARAM_PP_PGTABLE 0x08  /* RO: Deprecated, always returns zero */
+#define MSM_PARAM_FAULTS     0x09  /* RO */
+#define MSM_PARAM_SUSPENDS   0x0a  /* RO */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #
@@ -335,9 +339,7 @@ struct drm_msm_submitqueue_query {
 };
 
 #define DRM_MSM_GET_PARAM              0x00
-/* placeholder:
 #define DRM_MSM_SET_PARAM              0x01
- */
 #define DRM_MSM_GEM_NEW                0x02
 #define DRM_MSM_GEM_INFO               0x03
 #define DRM_MSM_GEM_CPU_PREP           0x04
@@ -353,6 +355,7 @@ struct drm_msm_submitqueue_query {
 #define DRM_MSM_SUBMITQUEUE_QUERY      0x0C
 
 #define DRM_IOCTL_MSM_GET_PARAM        DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GET_PARAM, struct drm_msm_param)
+#define DRM_IOCTL_MSM_SET_PARAM        DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_SET_PARAM, struct drm_msm_param)
 #define DRM_IOCTL_MSM_GEM_NEW          DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_NEW, struct drm_msm_gem_new)
 #define DRM_IOCTL_MSM_GEM_INFO         DRM_IOWR(DRM_COMMAND_BASE + DRM_MSM_GEM_INFO, struct drm_msm_gem_info)
 #define DRM_IOCTL_MSM_GEM_CPU_PREP     DRM_IOW (DRM_COMMAND_BASE + DRM_MSM_GEM_CPU_PREP, struct drm_msm_gem_cpu_prep)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 3/4] drm/msm: Add SYSPROF param
  2022-03-03 19:46 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
  2022-03-03 19:46 ` [PATCH 2/4] drm/msm: Add SET_PARAM ioctl Rob Clark
@ 2022-03-03 19:46 ` Rob Clark
  2022-03-03 20:47   ` Stephen Boyd
  2022-03-03 19:46 ` [PATCH 4/4] drm/msm/a6xx: Zap counters across context switch Rob Clark
       [not found] ` <20220303194758.710358-2-robdclark@gmail.com>
  3 siblings, 1 reply; 11+ messages in thread
From: Rob Clark @ 2022-03-03 19:46 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Jonathan Marek, Emma Anholt, Stephen Boyd,
	Yangtao Li, open list

From: Rob Clark <robdclark@chromium.org>

Add a SYSPROF param for system profiling tools like Mesa's pps-producer
(perfetto) to control behavior related to system-wide performance
counter collection.  In particular, for profiling, one wants to ensure
that GPU context switches do not effect perfcounter state, and might
want to suppress suspend (which would cause counters to lose state).

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/adreno_gpu.c |  4 +++
 drivers/gpu/drm/msm/msm_drv.c           |  8 ++++++
 drivers/gpu/drm/msm/msm_gpu.h           | 27 ++++++++++++++++++++
 drivers/gpu/drm/msm/msm_submitqueue.c   | 34 +++++++++++++++++++++++++
 include/uapi/drm/msm_drm.h              |  1 +
 5 files changed, 74 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 6a37d409653b..c91ea363c373 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -287,6 +287,10 @@ int adreno_set_param(struct msm_gpu *gpu, struct msm_file_private *ctx,
 		     uint32_t param, uint64_t value)
 {
 	switch (param) {
+	case MSM_PARAM_SYSPROF:
+		if (!capable(CAP_SYS_ADMIN))
+			return -EPERM;
+		return msm_file_private_set_sysprof(ctx, gpu, value);
 	default:
 		DBG("%s: invalid param: %u", gpu->name, param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index ca9a8a866292..780f9748aaaf 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -559,8 +559,16 @@ static void context_close(struct msm_file_private *ctx)
 
 static void msm_postclose(struct drm_device *dev, struct drm_file *file)
 {
+	struct msm_drm_private *priv = dev->dev_private;
 	struct msm_file_private *ctx = file->driver_priv;
 
+	/*
+	 * It is not possible to set sysprof param to non-zero if gpu
+	 * is not initialized:
+	 */
+	if (priv->gpu)
+		msm_file_private_set_sysprof(ctx, priv->gpu, 0);
+
 	context_close(ctx);
 }
 
diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
index fde9a29f884e..0ba1dbd4e50f 100644
--- a/drivers/gpu/drm/msm/msm_gpu.h
+++ b/drivers/gpu/drm/msm/msm_gpu.h
@@ -160,6 +160,13 @@ struct msm_gpu {
 	struct msm_ringbuffer *rb[MSM_GPU_MAX_RINGS];
 	int nr_rings;
 
+	/**
+	 * sysprof_active:
+	 *
+	 * The count of contexts that have enabled system profiling.
+	 */
+	refcount_t sysprof_active;
+
 	/**
 	 * cur_ctx_seqno:
 	 *
@@ -330,6 +337,24 @@ struct msm_file_private {
 	struct kref ref;
 	int seqno;
 
+	/**
+	 * sysprof:
+	 *
+	 * The value of MSM_PARAM_SYSPROF set by userspace.  This is
+	 * intended to be used by system profiling tools like Mesa's
+	 * pps-producer (perfetto), and restricted to CAP_SYS_ADMIN.
+	 *
+	 * Setting a value of 1 will preserve performance counters across
+	 * context switches.  Setting a value of 2 will in addition
+	 * suppress suspend.  (Performance counters loose  state across
+	 * power collapse, which is undesirable for profiling in some
+	 * cases.)
+	 *
+	 * The value automatically reverts to zero when the drm device
+	 * file is closed.
+	 */
+	int sysprof;
+
 	/**
 	 * elapsed:
 	 *
@@ -545,6 +570,8 @@ void msm_submitqueue_close(struct msm_file_private *ctx);
 
 void msm_submitqueue_destroy(struct kref *kref);
 
+int msm_file_private_set_sysprof(struct msm_file_private *ctx,
+				 struct msm_gpu *gpu, int sysprof);
 void __msm_file_private_destroy(struct kref *kref);
 
 static inline void msm_file_private_put(struct msm_file_private *ctx)
diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
index 7cb158bcbcf6..4179db54ac93 100644
--- a/drivers/gpu/drm/msm/msm_submitqueue.c
+++ b/drivers/gpu/drm/msm/msm_submitqueue.c
@@ -7,6 +7,40 @@
 
 #include "msm_gpu.h"
 
+int msm_file_private_set_sysprof(struct msm_file_private *ctx,
+				 struct msm_gpu *gpu, int sysprof)
+{
+	/* unwind old value first: */
+	switch (ctx->sysprof) {
+	case 2:
+		pm_runtime_put_autosuspend(&gpu->pdev->dev);
+		fallthrough;
+	case 1:
+		refcount_dec(&gpu->sysprof_active);
+		fallthrough;
+	case 0:
+		break;
+	}
+
+	/* then apply new value: */
+	switch (sysprof) {
+	default:
+		return -EINVAL;
+	case 2:
+		pm_runtime_get_sync(&gpu->pdev->dev);
+		fallthrough;
+	case 1:
+		refcount_inc(&gpu->sysprof_active);
+		fallthrough;
+	case 0:
+		break;
+	}
+
+	ctx->sysprof = sysprof;
+
+	return 0;
+}
+
 void __msm_file_private_destroy(struct kref *kref)
 {
 	struct msm_file_private *ctx = container_of(kref,
diff --git a/include/uapi/drm/msm_drm.h b/include/uapi/drm/msm_drm.h
index a1bb2a17a8b9..07efc8033492 100644
--- a/include/uapi/drm/msm_drm.h
+++ b/include/uapi/drm/msm_drm.h
@@ -81,6 +81,7 @@ struct drm_msm_timespec {
 #define MSM_PARAM_PP_PGTABLE 0x08  /* RO: Deprecated, always returns zero */
 #define MSM_PARAM_FAULTS     0x09  /* RO */
 #define MSM_PARAM_SUSPENDS   0x0a  /* RO */
+#define MSM_PARAM_SYSPROF    0x0b  /* WO: 1 preserves perfcntrs, 2 also disables suspend */
 
 /* For backwards compat.  The original support for preemption was based on
  * a single ring per priority level so # of priority levels equals the #
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH 4/4] drm/msm/a6xx: Zap counters across context switch
  2022-03-03 19:46 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
  2022-03-03 19:46 ` [PATCH 2/4] drm/msm: Add SET_PARAM ioctl Rob Clark
  2022-03-03 19:46 ` [PATCH 3/4] drm/msm: Add SYSPROF param Rob Clark
@ 2022-03-03 19:46 ` Rob Clark
       [not found] ` <20220303194758.710358-2-robdclark@gmail.com>
  3 siblings, 0 replies; 11+ messages in thread
From: Rob Clark @ 2022-03-03 19:46 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Dmitry Baryshkov, Jonathan Marek, open list

From: Rob Clark <robdclark@chromium.org>

Any app controlled perfcntr collection (GL_AMD_performance_monitor, etc)
does not require counters to maintain state across context switches.  So
clear them if systemwide profiling is not active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 29 +++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 237c2e7a7baa..98b3282a117b 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -101,6 +101,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
 static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 		struct msm_ringbuffer *ring, struct msm_file_private *ctx)
 {
+	bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 0;
 	phys_addr_t ttbr;
 	u32 asid;
 	u64 memptr = rbmemptr(ring, ttbr0);
@@ -111,6 +112,15 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 	if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
 		return;
 
+	if (!sysprof) {
+		/* Turn off protected mode to write to special registers */
+		OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+		OUT_RING(ring, 0);
+
+		OUT_PKT4(ring, REG_A6XX_RBBM_PERFCTR_SRAM_INIT_CMD, 1);
+		OUT_RING(ring, 1);
+	}
+
 	/* Execute the table update */
 	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
 	OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr)));
@@ -137,6 +147,25 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 
 	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
 	OUT_RING(ring, 0x31);
+
+	if (!sysprof) {
+		/*
+		 * Wait for SRAM clear after the pgtable update, so the
+		 * two can happen in parallel:
+		 */
+		OUT_PKT7(ring, CP_WAIT_REG_MEM, 6);
+		OUT_RING(ring, CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ));
+		OUT_RING(ring, CP_WAIT_REG_MEM_1_POLL_ADDR_LO(
+				REG_A6XX_RBBM_PERFCTR_SRAM_INIT_STATUS));
+		OUT_RING(ring, CP_WAIT_REG_MEM_2_POLL_ADDR_HI(0));
+		OUT_RING(ring, CP_WAIT_REG_MEM_3_REF(0x1));
+		OUT_RING(ring, CP_WAIT_REG_MEM_4_MASK(0x1));
+		OUT_RING(ring, CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0));
+
+		/* Re-enable protected mode: */
+		OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+		OUT_RING(ring, 1);
+	}
 }
 
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/msm: Add SYSPROF param
  2022-03-03 19:46 ` [PATCH 3/4] drm/msm: Add SYSPROF param Rob Clark
@ 2022-03-03 20:47   ` Stephen Boyd
  2022-03-03 21:17     ` Rob Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Stephen Boyd @ 2022-03-03 20:47 UTC (permalink / raw)
  To: Rob Clark, dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Sean Paul, Abhinav Kumar, David Airlie, Daniel Vetter,
	Jonathan Marek, Emma Anholt, Yangtao Li, linux-kernel

Quoting Rob Clark (2022-03-03 11:46:47)
> diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> index fde9a29f884e..0ba1dbd4e50f 100644
> --- a/drivers/gpu/drm/msm/msm_gpu.h
> +++ b/drivers/gpu/drm/msm/msm_gpu.h
> @@ -330,6 +337,24 @@ struct msm_file_private {
>         struct kref ref;
>         int seqno;
>
> +       /**
> +        * sysprof:
> +        *
> +        * The value of MSM_PARAM_SYSPROF set by userspace.  This is
> +        * intended to be used by system profiling tools like Mesa's
> +        * pps-producer (perfetto), and restricted to CAP_SYS_ADMIN.
> +        *
> +        * Setting a value of 1 will preserve performance counters across
> +        * context switches.  Setting a value of 2 will in addition
> +        * suppress suspend.  (Performance counters loose  state across

s/loose  /lose/

> +        * power collapse, which is undesirable for profiling in some
> +        * cases.)
> +        *
> +        * The value automatically reverts to zero when the drm device
> +        * file is closed.
> +        */
> +       int sysprof;
> +
>         /**
>          * elapsed:
>          *
> diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> index 7cb158bcbcf6..4179db54ac93 100644
> --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> @@ -7,6 +7,40 @@
>
>  #include "msm_gpu.h"
>
> +int msm_file_private_set_sysprof(struct msm_file_private *ctx,
> +                                struct msm_gpu *gpu, int sysprof)
> +{
> +       /* unwind old value first: */
> +       switch (ctx->sysprof) {
> +       case 2:
> +               pm_runtime_put_autosuspend(&gpu->pdev->dev);
> +               fallthrough;
> +       case 1:
> +               refcount_dec(&gpu->sysprof_active);
> +               fallthrough;
> +       case 0:
> +               break;
> +       }
> +
> +       /* then apply new value: */

It would be safer to swap this. Otherwise a set when the values are at
"1" would drop to "zero" here and potentially trigger some glitch,
whereas incrementing one more time and then dropping the previous state
would avoid that short blip.

> +       switch (sysprof) {
> +       default:
> +               return -EINVAL;

This will become more complicated though.

> +       case 2:
> +               pm_runtime_get_sync(&gpu->pdev->dev);
> +               fallthrough;
> +       case 1:
> +               refcount_inc(&gpu->sysprof_active);
> +               fallthrough;
> +       case 0:
> +               break;
> +       }
> +
> +       ctx->sysprof = sysprof;
> +
> +       return 0;
> +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] drm/msm: Update generated headers
       [not found]   ` <1a42ff3e-154a-b2b8-9c99-8d5fba9a38e5@quicinc.com>
@ 2022-03-03 21:11     ` Rob Clark
  2022-03-03 21:22       ` Abhinav Kumar
  0 siblings, 1 reply; 11+ messages in thread
From: Rob Clark @ 2022-03-03 21:11 UTC (permalink / raw)
  To: Abhinav Kumar, Dmitry Baryshkov
  Cc: dri-devel, freedreno, linux-arm-msm, Jordan Crouse,
	Akhil P Oommen, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
	Stephen Boyd, open list

On Thu, Mar 3, 2022 at 12:42 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>
> Hi Rob
>
> On 3/3/2022 11:46 AM, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
> > Update headers from mesa commit:
> >
> >    commit 7e63fa2bb13cf14b765ad06d046789ee1879b5ef
> >    Author:     Rob Clark <robclark@freedesktop.org>
> >    AuthorDate: Wed Mar 2 17:11:10 2022 -0800
> >
> >        freedreno/registers: Add a couple regs we need for kernel
> >
> >        Signed-off-by: Rob Clark <robdclark@chromium.org>
> >        Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>
> >
> > Signed-off-by: Rob Clark <robdclark@chromium.org>
> > ---
> >   drivers/gpu/drm/msm/adreno/a2xx.xml.h         |  26 +-
> >   drivers/gpu/drm/msm/adreno/a3xx.xml.h         |  30 +-
> >   drivers/gpu/drm/msm/adreno/a4xx.xml.h         | 112 ++-
> >   drivers/gpu/drm/msm/adreno/a5xx.xml.h         |  63 +-
> >   drivers/gpu/drm/msm/adreno/a6xx.xml.h         | 674 +++++++++++-------
> >   drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h     |  26 +-
> >   .../gpu/drm/msm/adreno/adreno_common.xml.h    |  31 +-
> >   drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h   |  46 +-
> >   drivers/gpu/drm/msm/disp/mdp4/mdp4.xml.h      |  37 +-
> >   drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h      |  37 +-
> >   drivers/gpu/drm/msm/disp/mdp_common.xml.h     |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi.xml.h             |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi_phy_10nm.xml.h    |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi_phy_14nm.xml.h    |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi_phy_20nm.xml.h    |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi_phy_28nm.xml.h    |  37 +-
> >   .../gpu/drm/msm/dsi/dsi_phy_28nm_8960.xml.h   |  37 +-
> >   drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h     | 480 -------------
> Why is the 5nm PHY removed? Am i missing something?

Dmitry removed it in mesa, because it was identical to 7nm

BR,
-R

>
> Thanks
>
> Abhinav

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/msm: Add SYSPROF param
  2022-03-03 20:47   ` Stephen Boyd
@ 2022-03-03 21:17     ` Rob Clark
  2022-03-03 21:47       ` Rob Clark
  0 siblings, 1 reply; 11+ messages in thread
From: Rob Clark @ 2022-03-03 21:17 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: dri-devel, freedreno, linux-arm-msm, Jordan Crouse,
	Akhil P Oommen, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Jonathan Marek, Emma Anholt, Yangtao Li,
	Linux Kernel Mailing List

On Thu, Mar 3, 2022 at 12:47 PM Stephen Boyd <swboyd@chromium.org> wrote:
>
> Quoting Rob Clark (2022-03-03 11:46:47)
> > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > index fde9a29f884e..0ba1dbd4e50f 100644
> > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > @@ -330,6 +337,24 @@ struct msm_file_private {
> >         struct kref ref;
> >         int seqno;
> >
> > +       /**
> > +        * sysprof:
> > +        *
> > +        * The value of MSM_PARAM_SYSPROF set by userspace.  This is
> > +        * intended to be used by system profiling tools like Mesa's
> > +        * pps-producer (perfetto), and restricted to CAP_SYS_ADMIN.
> > +        *
> > +        * Setting a value of 1 will preserve performance counters across
> > +        * context switches.  Setting a value of 2 will in addition
> > +        * suppress suspend.  (Performance counters loose  state across
>
> s/loose  /lose/

fixed locally

> > +        * power collapse, which is undesirable for profiling in some
> > +        * cases.)
> > +        *
> > +        * The value automatically reverts to zero when the drm device
> > +        * file is closed.
> > +        */
> > +       int sysprof;
> > +
> >         /**
> >          * elapsed:
> >          *
> > diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> > index 7cb158bcbcf6..4179db54ac93 100644
> > --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> > +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> > @@ -7,6 +7,40 @@
> >
> >  #include "msm_gpu.h"
> >
> > +int msm_file_private_set_sysprof(struct msm_file_private *ctx,
> > +                                struct msm_gpu *gpu, int sysprof)
> > +{
> > +       /* unwind old value first: */
> > +       switch (ctx->sysprof) {
> > +       case 2:
> > +               pm_runtime_put_autosuspend(&gpu->pdev->dev);
> > +               fallthrough;
> > +       case 1:
> > +               refcount_dec(&gpu->sysprof_active);
> > +               fallthrough;
> > +       case 0:
> > +               break;
> > +       }
> > +
> > +       /* then apply new value: */
>
> It would be safer to swap this. Otherwise a set when the values are at
> "1" would drop to "zero" here and potentially trigger some glitch,
> whereas incrementing one more time and then dropping the previous state
> would avoid that short blip.
>
> > +       switch (sysprof) {
> > +       default:
> > +               return -EINVAL;
>
> This will become more complicated though.

Right, that is why I took the "unwind first and then re-apply"
approach.. in practice I expect userspace to set the value before it
starts sampling counter values, so I wasn't too concerned about this
racing with a submit and clearing the counters.  (Plus any glitch if
userspace did decide to change it dynamically would just be transient
and not really a big deal.)

BR,
-R

> > +       case 2:
> > +               pm_runtime_get_sync(&gpu->pdev->dev);
> > +               fallthrough;
> > +       case 1:
> > +               refcount_inc(&gpu->sysprof_active);
> > +               fallthrough;
> > +       case 0:
> > +               break;
> > +       }
> > +
> > +       ctx->sysprof = sysprof;
> > +
> > +       return 0;
> > +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/4] drm/msm: Update generated headers
  2022-03-03 21:11     ` [PATCH 1/4] drm/msm: Update generated headers Rob Clark
@ 2022-03-03 21:22       ` Abhinav Kumar
  0 siblings, 0 replies; 11+ messages in thread
From: Abhinav Kumar @ 2022-03-03 21:22 UTC (permalink / raw)
  To: Rob Clark, Dmitry Baryshkov
  Cc: dri-devel, freedreno, linux-arm-msm, Jordan Crouse,
	Akhil P Oommen, Rob Clark, Sean Paul, David Airlie, Daniel Vetter,
	Stephen Boyd, open list



On 3/3/2022 1:11 PM, Rob Clark wrote:
> On Thu, Mar 3, 2022 at 12:42 PM Abhinav Kumar <quic_abhinavk@quicinc.com> wrote:
>>
>> Hi Rob
>>
>> On 3/3/2022 11:46 AM, Rob Clark wrote:
>>> From: Rob Clark <robdclark@chromium.org>
>>>
>>> Update headers from mesa commit:
>>>
>>>     commit 7e63fa2bb13cf14b765ad06d046789ee1879b5ef
>>>     Author:     Rob Clark <robclark@freedesktop.org>
>>>     AuthorDate: Wed Mar 2 17:11:10 2022 -0800
>>>
>>>         freedreno/registers: Add a couple regs we need for kernel
>>>
>>>         Signed-off-by: Rob Clark <robdclark@chromium.org>
>>>         Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>
>>>
>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
>>> ---
>>>    drivers/gpu/drm/msm/adreno/a2xx.xml.h         |  26 +-
>>>    drivers/gpu/drm/msm/adreno/a3xx.xml.h         |  30 +-
>>>    drivers/gpu/drm/msm/adreno/a4xx.xml.h         | 112 ++-
>>>    drivers/gpu/drm/msm/adreno/a5xx.xml.h         |  63 +-
>>>    drivers/gpu/drm/msm/adreno/a6xx.xml.h         | 674 +++++++++++-------
>>>    drivers/gpu/drm/msm/adreno/a6xx_gmu.xml.h     |  26 +-
>>>    .../gpu/drm/msm/adreno/adreno_common.xml.h    |  31 +-
>>>    drivers/gpu/drm/msm/adreno/adreno_pm4.xml.h   |  46 +-
>>>    drivers/gpu/drm/msm/disp/mdp4/mdp4.xml.h      |  37 +-
>>>    drivers/gpu/drm/msm/disp/mdp5/mdp5.xml.h      |  37 +-
>>>    drivers/gpu/drm/msm/disp/mdp_common.xml.h     |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi.xml.h             |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi_phy_10nm.xml.h    |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi_phy_14nm.xml.h    |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi_phy_20nm.xml.h    |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi_phy_28nm.xml.h    |  37 +-
>>>    .../gpu/drm/msm/dsi/dsi_phy_28nm_8960.xml.h   |  37 +-
>>>    drivers/gpu/drm/msm/dsi/dsi_phy_5nm.xml.h     | 480 -------------
>> Why is the 5nm PHY removed? Am i missing something?
> 
> Dmitry removed it in mesa, because it was identical to 7nm
> 
> BR,
> -R
Alright got it, for the display bits,
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
> 
>>
>> Thanks
>>
>> Abhinav

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/msm: Add SYSPROF param
  2022-03-03 21:17     ` Rob Clark
@ 2022-03-03 21:47       ` Rob Clark
  2022-03-03 22:36         ` Stephen Boyd
  0 siblings, 1 reply; 11+ messages in thread
From: Rob Clark @ 2022-03-03 21:47 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: dri-devel, freedreno, linux-arm-msm, Jordan Crouse,
	Akhil P Oommen, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Jonathan Marek, Emma Anholt, Yangtao Li,
	Linux Kernel Mailing List

On Thu, Mar 3, 2022 at 1:17 PM Rob Clark <robdclark@gmail.com> wrote:
>
> On Thu, Mar 3, 2022 at 12:47 PM Stephen Boyd <swboyd@chromium.org> wrote:
> >
> > Quoting Rob Clark (2022-03-03 11:46:47)
> > > diff --git a/drivers/gpu/drm/msm/msm_gpu.h b/drivers/gpu/drm/msm/msm_gpu.h
> > > index fde9a29f884e..0ba1dbd4e50f 100644
> > > --- a/drivers/gpu/drm/msm/msm_gpu.h
> > > +++ b/drivers/gpu/drm/msm/msm_gpu.h
> > > @@ -330,6 +337,24 @@ struct msm_file_private {
> > >         struct kref ref;
> > >         int seqno;
> > >
> > > +       /**
> > > +        * sysprof:
> > > +        *
> > > +        * The value of MSM_PARAM_SYSPROF set by userspace.  This is
> > > +        * intended to be used by system profiling tools like Mesa's
> > > +        * pps-producer (perfetto), and restricted to CAP_SYS_ADMIN.
> > > +        *
> > > +        * Setting a value of 1 will preserve performance counters across
> > > +        * context switches.  Setting a value of 2 will in addition
> > > +        * suppress suspend.  (Performance counters loose  state across
> >
> > s/loose  /lose/
>
> fixed locally
>
> > > +        * power collapse, which is undesirable for profiling in some
> > > +        * cases.)
> > > +        *
> > > +        * The value automatically reverts to zero when the drm device
> > > +        * file is closed.
> > > +        */
> > > +       int sysprof;
> > > +
> > >         /**
> > >          * elapsed:
> > >          *
> > > diff --git a/drivers/gpu/drm/msm/msm_submitqueue.c b/drivers/gpu/drm/msm/msm_submitqueue.c
> > > index 7cb158bcbcf6..4179db54ac93 100644
> > > --- a/drivers/gpu/drm/msm/msm_submitqueue.c
> > > +++ b/drivers/gpu/drm/msm/msm_submitqueue.c
> > > @@ -7,6 +7,40 @@
> > >
> > >  #include "msm_gpu.h"
> > >
> > > +int msm_file_private_set_sysprof(struct msm_file_private *ctx,
> > > +                                struct msm_gpu *gpu, int sysprof)
> > > +{
> > > +       /* unwind old value first: */
> > > +       switch (ctx->sysprof) {
> > > +       case 2:
> > > +               pm_runtime_put_autosuspend(&gpu->pdev->dev);
> > > +               fallthrough;
> > > +       case 1:
> > > +               refcount_dec(&gpu->sysprof_active);
> > > +               fallthrough;
> > > +       case 0:
> > > +               break;
> > > +       }
> > > +
> > > +       /* then apply new value: */
> >
> > It would be safer to swap this. Otherwise a set when the values are at
> > "1" would drop to "zero" here and potentially trigger some glitch,
> > whereas incrementing one more time and then dropping the previous state
> > would avoid that short blip.
> >
> > > +       switch (sysprof) {
> > > +       default:
> > > +               return -EINVAL;
> >
> > This will become more complicated though.
>
> Right, that is why I took the "unwind first and then re-apply"
> approach.. in practice I expect userspace to set the value before it
> starts sampling counter values, so I wasn't too concerned about this
> racing with a submit and clearing the counters.  (Plus any glitch if
> userspace did decide to change it dynamically would just be transient
> and not really a big deal.)

Actually I could just swap the two switch's.. the result would be that
an EINVAL would not change the state instead of dropping the state to
zero.  Maybe that is better anyways

BR,
-R


> BR,
> -R
>
> > > +       case 2:
> > > +               pm_runtime_get_sync(&gpu->pdev->dev);
> > > +               fallthrough;
> > > +       case 1:
> > > +               refcount_inc(&gpu->sysprof_active);
> > > +               fallthrough;
> > > +       case 0:
> > > +               break;
> > > +       }
> > > +
> > > +       ctx->sysprof = sysprof;
> > > +
> > > +       return 0;
> > > +}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/4] drm/msm: Add SYSPROF param
  2022-03-03 21:47       ` Rob Clark
@ 2022-03-03 22:36         ` Stephen Boyd
  0 siblings, 0 replies; 11+ messages in thread
From: Stephen Boyd @ 2022-03-03 22:36 UTC (permalink / raw)
  To: Rob Clark
  Cc: dri-devel, freedreno, linux-arm-msm, Jordan Crouse,
	Akhil P Oommen, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Jonathan Marek, Emma Anholt, Yangtao Li,
	Linux Kernel Mailing List

Quoting Rob Clark (2022-03-03 13:47:14)
> On Thu, Mar 3, 2022 at 1:17 PM Rob Clark <robdclark@gmail.com> wrote:
> >
> > On Thu, Mar 3, 2022 at 12:47 PM Stephen Boyd <swboyd@chromium.org> wrote:
> > >
> > > Quoting Rob Clark (2022-03-03 11:46:47)
> > > > +
> > > > +       /* then apply new value: */
> > >
> > > It would be safer to swap this. Otherwise a set when the values are at
> > > "1" would drop to "zero" here and potentially trigger some glitch,
> > > whereas incrementing one more time and then dropping the previous state
> > > would avoid that short blip.
> > >
> > > > +       switch (sysprof) {
> > > > +       default:
> > > > +               return -EINVAL;
> > >
> > > This will become more complicated though.
> >
> > Right, that is why I took the "unwind first and then re-apply"
> > approach.. in practice I expect userspace to set the value before it
> > starts sampling counter values, so I wasn't too concerned about this
> > racing with a submit and clearing the counters.  (Plus any glitch if
> > userspace did decide to change it dynamically would just be transient
> > and not really a big deal.)
>
> Actually I could just swap the two switch's.. the result would be that
> an EINVAL would not change the state instead of dropping the state to
> zero.  Maybe that is better anyways
>

Yeah it isn't clear to me what should happen if the new state is
invalid. Outright rejection is probably better than replacing the
previous state with an invalid state.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/4] drm/msm/a6xx: Zap counters across context switch
  2022-03-04  0:52 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
@ 2022-03-04  0:52 ` Rob Clark
  0 siblings, 0 replies; 11+ messages in thread
From: Rob Clark @ 2022-03-04  0:52 UTC (permalink / raw)
  To: dri-devel
  Cc: freedreno, linux-arm-msm, Jordan Crouse, Akhil P Oommen,
	Rob Clark, Rob Clark, Sean Paul, Abhinav Kumar, David Airlie,
	Daniel Vetter, Dmitry Baryshkov, Jonathan Marek, open list

From: Rob Clark <robdclark@chromium.org>

Any app controlled perfcntr collection (GL_AMD_performance_monitor, etc)
does not require counters to maintain state across context switches.  So
clear them if systemwide profiling is not active.

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 29 +++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)

diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 237c2e7a7baa..02b47977b5c3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -101,6 +101,7 @@ static void get_stats_counter(struct msm_ringbuffer *ring, u32 counter,
 static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 		struct msm_ringbuffer *ring, struct msm_file_private *ctx)
 {
+	bool sysprof = refcount_read(&a6xx_gpu->base.base.sysprof_active) > 1;
 	phys_addr_t ttbr;
 	u32 asid;
 	u64 memptr = rbmemptr(ring, ttbr0);
@@ -111,6 +112,15 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 	if (msm_iommu_pagetable_params(ctx->aspace->mmu, &ttbr, &asid))
 		return;
 
+	if (!sysprof) {
+		/* Turn off protected mode to write to special registers */
+		OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+		OUT_RING(ring, 0);
+
+		OUT_PKT4(ring, REG_A6XX_RBBM_PERFCTR_SRAM_INIT_CMD, 1);
+		OUT_RING(ring, 1);
+	}
+
 	/* Execute the table update */
 	OUT_PKT7(ring, CP_SMMU_TABLE_UPDATE, 4);
 	OUT_RING(ring, CP_SMMU_TABLE_UPDATE_0_TTBR0_LO(lower_32_bits(ttbr)));
@@ -137,6 +147,25 @@ static void a6xx_set_pagetable(struct a6xx_gpu *a6xx_gpu,
 
 	OUT_PKT7(ring, CP_EVENT_WRITE, 1);
 	OUT_RING(ring, 0x31);
+
+	if (!sysprof) {
+		/*
+		 * Wait for SRAM clear after the pgtable update, so the
+		 * two can happen in parallel:
+		 */
+		OUT_PKT7(ring, CP_WAIT_REG_MEM, 6);
+		OUT_RING(ring, CP_WAIT_REG_MEM_0_FUNCTION(WRITE_EQ));
+		OUT_RING(ring, CP_WAIT_REG_MEM_1_POLL_ADDR_LO(
+				REG_A6XX_RBBM_PERFCTR_SRAM_INIT_STATUS));
+		OUT_RING(ring, CP_WAIT_REG_MEM_2_POLL_ADDR_HI(0));
+		OUT_RING(ring, CP_WAIT_REG_MEM_3_REF(0x1));
+		OUT_RING(ring, CP_WAIT_REG_MEM_4_MASK(0x1));
+		OUT_RING(ring, CP_WAIT_REG_MEM_5_DELAY_LOOP_CYCLES(0));
+
+		/* Re-enable protected mode: */
+		OUT_PKT7(ring, CP_SET_PROTECTED_MODE, 1);
+		OUT_RING(ring, 1);
+	}
 }
 
 static void a6xx_submit(struct msm_gpu *gpu, struct msm_gem_submit *submit)
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-03-04  0:59 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-03-03 19:46 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
2022-03-03 19:46 ` [PATCH 2/4] drm/msm: Add SET_PARAM ioctl Rob Clark
2022-03-03 19:46 ` [PATCH 3/4] drm/msm: Add SYSPROF param Rob Clark
2022-03-03 20:47   ` Stephen Boyd
2022-03-03 21:17     ` Rob Clark
2022-03-03 21:47       ` Rob Clark
2022-03-03 22:36         ` Stephen Boyd
2022-03-03 19:46 ` [PATCH 4/4] drm/msm/a6xx: Zap counters across context switch Rob Clark
     [not found] ` <20220303194758.710358-2-robdclark@gmail.com>
     [not found]   ` <1a42ff3e-154a-b2b8-9c99-8d5fba9a38e5@quicinc.com>
2022-03-03 21:11     ` [PATCH 1/4] drm/msm: Update generated headers Rob Clark
2022-03-03 21:22       ` Abhinav Kumar
  -- strict thread matches above, loose matches on Subject: below --
2022-03-04  0:52 [PATCH 0/4] drm/msm: Clear perf counters across context switch Rob Clark
2022-03-04  0:52 ` [PATCH 4/4] drm/msm/a6xx: Zap " Rob Clark

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox