public inbox for intel-gfx@lists.freedesktop.org
 help / color / mirror / Atom feed
* [PATCH 0/7] drm-intel-collector - update
@ 2014-11-24 16:29 Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 1/7] drm/i915: Specify bsd rings through exec flag Rodrigo Vivi
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

This is another drm-intel-collector updated notice:
http://cgit.freedesktop.org/~vivijim/drm-intel/log/?h=drm-intel-collector

Here goes the update list in order for better reviewers assignment:

Patch     drm/i915: Specify bsd rings through exec flag - Reviewed
Patch     drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam - Reviewed
Patch     drm/i915: Move the ban period onto the context - Reviewer: Mika
Patch     drm/i915: Add ioctl to set per-context parameters - Reviewer: Rodrigo
Patch     drm/i915: Put logical pipe_control emission into a helper. - Reviewer: Mika
Patch     drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical ring - Reviewer: Mika
Patch     drm/i915: Broaden application of set-domain(GTT) - Reviewed

This round is from discussion on small patches finished between Oct 24 and Nov 07.

Thanks in advance,
Rodrigo.


Chris Wilson (3):
  drm/i915: Move the ban period onto the context
  drm/i915: Add ioctl to set per-context parameters
  drm/i915: Broaden application of set-domain(GTT)

Rodrigo Vivi (2):
  drm/i915: Put logical pipe_control emission into a helper.
  drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical
    ring

Zhipeng Gong (2):
  drm/i915: Specify bsd rings through exec flag
  drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam

 drivers/gpu/drm/i915/i915_dma.c            |  5 +++
 drivers/gpu/drm/i915/i915_drv.h            |  9 ++++
 drivers/gpu/drm/i915/i915_gem.c            | 46 +++++++++----------
 drivers/gpu/drm/i915/i915_gem_context.c    | 71 ++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++++-
 drivers/gpu/drm/i915/intel_lrc.c           | 41 ++++++++++++-----
 include/uapi/drm/i915_drm.h                | 21 ++++++++-
 7 files changed, 172 insertions(+), 40 deletions(-)

-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-25 13:04   ` Daniel Vetter
  2014-11-24 16:29 ` [PATCH 2/7] drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam Rodrigo Vivi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

From: Zhipeng Gong <zhipeng.gong@intel.com>

On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
has no control when using VCS1 or VCS2. This patch introduces a mechanism
to avoid the default ping-pong mode and use one specific ring through
execution flag.

v2: fix whitespace (Rodrigo)
v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)

Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++++++++++++++--
 include/uapi/drm/i915_drm.h                |  8 +++++++-
 2 files changed, 24 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index f06027b..2ef60c3 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1317,8 +1317,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
 		if (HAS_BSD2(dev)) {
 			int ring_id;
-			ring_id = gen8_dispatch_bsd_ring(dev, file);
-			ring = &dev_priv->ring[ring_id];
+
+			switch (args->flags & I915_EXEC_BSD_MASK) {
+			case I915_EXEC_BSD_DEFAULT:
+				ring_id = gen8_dispatch_bsd_ring(dev, file);
+				ring = &dev_priv->ring[ring_id];
+				break;
+			case I915_EXEC_BSD_RING1:
+				ring = &dev_priv->ring[VCS];
+				break;
+			case I915_EXEC_BSD_RING2:
+				ring = &dev_priv->ring[VCS2];
+				break;
+			default:
+				DRM_DEBUG("execbuf with unknown bsd ring: %d\n",
+					  (int)(args->flags & I915_EXEC_BSD_MASK));
+				return -EINVAL;
+			}
 		} else
 			ring = &dev_priv->ring[VCS];
 	} else
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 2502622..fcb16bf 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -737,7 +737,13 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_HANDLE_LUT		(1<<12)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
+/** Used for switching BSD rings on the platforms with two BSD rings */
+#define I915_EXEC_BSD_MASK		(3<<13)
+#define I915_EXEC_BSD_DEFAULT		(0<<13) /* default ping-pong mode */
+#define I915_EXEC_BSD_RING1		(1<<13)
+#define I915_EXEC_BSD_RING2		(2<<13)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(1<<15)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/7] drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 1/7] drm/i915: Specify bsd rings through exec flag Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 3/7] drm/i915: Move the ban period onto the context Rodrigo Vivi
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

From: Zhipeng Gong <zhipeng.gong@intel.com>

This will let userland only try to use the new ring
when the appropriate kernel is present

Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed--by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 3 +++
 include/uapi/drm/i915_drm.h     | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index ecee3bc..9db5f56 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -92,6 +92,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_VEBOX:
 		value = intel_ring_initialized(&dev_priv->ring[VECS]);
 		break;
+	case I915_PARAM_HAS_BSD2:
+		value = intel_ring_initialized(&dev_priv->ring[VCS2]);
+		break;
 	case I915_PARAM_HAS_RELAXED_FENCING:
 		value = 1;
 		break;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index fcb16bf..fa99129 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -341,6 +341,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_WT     	 	 27
 #define I915_PARAM_CMD_PARSER_VERSION	 28
 #define I915_PARAM_HAS_COHERENT_PHYS_GTT 29
+#define I915_PARAM_HAS_BSD2		 30
 
 typedef struct drm_i915_getparam {
 	int param;
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/7] drm/i915: Move the ban period onto the context
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 1/7] drm/i915: Specify bsd rings through exec flag Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 2/7] drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 4/7] drm/i915: Add ioctl to set per-context parameters Rodrigo Vivi
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

From: Chris Wilson <chris@chris-wilson.co.uk>

This will allow us to set per-file, or even per-context, periods in the
future.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 5 +++++
 drivers/gpu/drm/i915/i915_gem.c         | 3 ++-
 drivers/gpu/drm/i915/i915_gem_context.c | 2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2bf4ba0..e683cf6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -637,6 +637,11 @@ struct i915_ctx_hang_stats {
 	/* Time when this context was last blamed for a GPU reset */
 	unsigned long guilty_ts;
 
+	/* If the contexts causes a second GPU hang within this time,
+	 * it is permanently banned from submitting any more work.
+	 */
+	unsigned long ban_period_seconds;
+
 	/* This context is banned to submit more work */
 	bool banned;
 };
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index fd17cca..1d58bc5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2533,7 +2533,8 @@ static bool i915_context_is_banned(struct drm_i915_private *dev_priv,
 	if (ctx->hang_stats.banned)
 		return true;
 
-	if (elapsed <= DRM_I915_CTX_BAN_PERIOD) {
+	if (ctx->hang_stats.ban_period_seconds &&
+	    elapsed <= ctx->hang_stats.ban_period_seconds) {
 		if (!i915_gem_context_is_default(ctx)) {
 			DRM_DEBUG("context hanging too fast, banning!\n");
 			return true;
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index d17ff43..6ca3a1c 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -222,6 +222,8 @@ __create_hw_context(struct drm_device *dev,
 	 * is no remap info, it will be a NOP. */
 	ctx->remap_slice = (1 << NUM_L3_SLICES(dev)) - 1;
 
+	ctx->hang_stats.ban_period_seconds = DRM_I915_CTX_BAN_PERIOD;
+
 	return ctx;
 
 err_out:
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/7] drm/i915: Add ioctl to set per-context parameters
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
                   ` (2 preceding siblings ...)
  2014-11-24 16:29 ` [PATCH 3/7] drm/i915: Move the ban period onto the context Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 5/7] drm/i915: Put logical pipe_control emission into a helper Rodrigo Vivi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

From: Chris Wilson <chris@chris-wilson.co.uk>

Sometimes we wish to tweak how an individual context behaves. Since we
always create a context for every filp, this means that individual
processes can fine tune their behaviour even if they do not explicitly
create a context.

The first example parameter here is to enable multi-process GPU testing,
but the interface should be able to cope with passing arbitrarily complex
parameters.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c         |  2 +
 drivers/gpu/drm/i915/i915_drv.h         |  4 ++
 drivers/gpu/drm/i915/i915_gem_context.c | 69 +++++++++++++++++++++++++++++++++
 include/uapi/drm/i915_drm.h             | 12 ++++++
 4 files changed, 87 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 9db5f56..bb4367a 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1058,6 +1058,8 @@ const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 int i915_max_ioctl = ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e683cf6..a499e7a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2727,6 +2727,10 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
 				  struct drm_file *file);
 int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file);
+int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file_priv);
+int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file_priv);
 
 /* i915_gem_evict.c */
 int __must_check i915_gem_evict_something(struct drm_device *dev,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 6ca3a1c..2b19a31 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -749,3 +749,72 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	DRM_DEBUG_DRIVER("HW context %d destroyed\n", args->ctx_id);
 	return 0;
 }
+
+int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_context_param *args = data;
+	struct intel_context *ctx;
+	int ret;
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		return ret;
+
+	ctx = i915_gem_context_get(file_priv, args->ctx_id);
+	if (IS_ERR(ctx)) {
+		mutex_unlock(&dev->struct_mutex);
+		return PTR_ERR(ctx);
+	}
+
+	args->size = 0;
+	switch (args->param) {
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
+		args->value = ctx->hang_stats.ban_period_seconds;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	mutex_unlock(&dev->struct_mutex);
+
+	return ret;
+}
+
+int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
+				    struct drm_file *file)
+{
+	struct drm_i915_file_private *file_priv = file->driver_priv;
+	struct drm_i915_gem_context_param *args = data;
+	struct intel_context *ctx;
+	int ret;
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		return ret;
+
+	ctx = i915_gem_context_get(file_priv, args->ctx_id);
+	if (IS_ERR(ctx)) {
+		mutex_unlock(&dev->struct_mutex);
+		return PTR_ERR(ctx);
+	}
+
+	switch (args->param) {
+	case I915_CONTEXT_PARAM_BAN_PERIOD:
+		if (args->size)
+			ret = -EINVAL;
+		else if (args->value < ctx->hang_stats.ban_period_seconds &&
+			 !capable(CAP_SYS_ADMIN))
+			ret = -EPERM;
+		else
+			ctx->hang_stats.ban_period_seconds = args->value;
+		break;
+	default:
+		ret = -EINVAL;
+		break;
+	}
+	mutex_unlock(&dev->struct_mutex);
+
+	return ret;
+}
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index fa99129..d253c85 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -224,6 +224,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_REG_READ		0x31
 #define DRM_I915_GET_RESET_STATS	0x32
 #define DRM_I915_GEM_USERPTR		0x33
+#define DRM_I915_GEM_CONTEXT_GETPARAM	0x34
+#define DRM_I915_GEM_CONTEXT_SETPARAM	0x35
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -275,6 +277,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
 #define DRM_IOCTL_I915_GET_RESET_STATS		DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats)
 #define DRM_IOCTL_I915_GEM_USERPTR			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr)
+#define DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_GETPARAM, struct drm_i915_gem_context_param)
+#define DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_SETPARAM, struct drm_i915_gem_context_param)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -1080,4 +1084,12 @@ struct drm_i915_gem_userptr {
 	__u32 handle;
 };
 
+struct drm_i915_gem_context_param {
+	__u32 ctx_id;
+	__u32 size;
+	__u64 param;
+#define I915_CONTEXT_PARAM_BAN_PERIOD 0x1
+	__u64 value;
+};
+
 #endif /* _UAPI_I915_DRM_H_ */
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/7] drm/i915: Put logical pipe_control emission into a helper.
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
                   ` (3 preceding siblings ...)
  2014-11-24 16:29 ` [PATCH 4/7] drm/i915: Add ioctl to set per-context parameters Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 6/7] drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical ring Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Rodrigo Vivi
  6 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

To be used for a Workaroud. Similar to:

commit 884ceacee308f0e4616d0c933518af2639f7b1d8
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Sat Jun 28 02:04:20 2014 +0300

    drm/i915: Refactor Broadwell PIPE_CONTROL emission into a helper.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e588376..7506cde 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1259,6 +1259,26 @@ static int gen8_emit_flush(struct intel_ringbuffer *ringbuf,
 	return 0;
 }
 
+static int gen8_emit_pipe_control(struct intel_ringbuffer *ringbuf,
+				  u32 flags, u32 scratch_addr)
+{
+	int ret;
+
+	ret = intel_logical_ring_begin(ringbuf, 6);
+	if (ret)
+		return ret;
+
+	intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
+	intel_logical_ring_emit(ringbuf, flags);
+	intel_logical_ring_emit(ringbuf, scratch_addr);
+	intel_logical_ring_emit(ringbuf, 0);
+	intel_logical_ring_emit(ringbuf, 0);
+	intel_logical_ring_emit(ringbuf, 0);
+	intel_logical_ring_advance(ringbuf);
+
+	return 0;
+}
+
 static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 				  u32 invalidate_domains,
 				  u32 flush_domains)
@@ -1266,7 +1286,6 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 	struct intel_engine_cs *ring = ringbuf->ring;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	u32 flags = 0;
-	int ret;
 
 	flags |= PIPE_CONTROL_CS_STALL;
 
@@ -1286,19 +1305,7 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
 	}
 
-	ret = intel_logical_ring_begin(ringbuf, 6);
-	if (ret)
-		return ret;
-
-	intel_logical_ring_emit(ringbuf, GFX_OP_PIPE_CONTROL(6));
-	intel_logical_ring_emit(ringbuf, flags);
-	intel_logical_ring_emit(ringbuf, scratch_addr);
-	intel_logical_ring_emit(ringbuf, 0);
-	intel_logical_ring_emit(ringbuf, 0);
-	intel_logical_ring_emit(ringbuf, 0);
-	intel_logical_ring_advance(ringbuf);
-
-	return 0;
+	return gen8_emit_pipe_control(ringbuf, flags, scratch_addr);
 }
 
 static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/7] drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical ring
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
                   ` (4 preceding siblings ...)
  2014-11-24 16:29 ` [PATCH 5/7] drm/i915: Put logical pipe_control emission into a helper Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-24 16:29 ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Rodrigo Vivi
  6 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Rodrigo Vivi

Similar to:

commit 02c9f7e3cfe76a7f54ef03438c36aade86cc1c8b
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Mon Jan 27 14:20:16 2014 -0800

    drm/i915: Add the WaCsStallBeforeStateCacheInvalidate:bdw workaround.

    On Broadwell, any PIPE_CONTROL with the "State Cache Invalidate" bit set
    must be preceded by a PIPE_CONTROL with the "CS Stall" bit set.

    Documented on the BSpec 3D workarounds page.

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 7506cde..10c1a35 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1286,6 +1286,7 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 	struct intel_engine_cs *ring = ringbuf->ring;
 	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
 	u32 flags = 0;
+	int ret;
 
 	flags |= PIPE_CONTROL_CS_STALL;
 
@@ -1303,6 +1304,15 @@ static int gen8_emit_flush_render(struct intel_ringbuffer *ringbuf,
 		flags |= PIPE_CONTROL_STATE_CACHE_INVALIDATE;
 		flags |= PIPE_CONTROL_QW_WRITE;
 		flags |= PIPE_CONTROL_GLOBAL_GTT_IVB;
+
+
+		/* WaCsStallBeforeStateCacheInvalidate:bdw,chv */
+		ret = gen8_emit_pipe_control(ring,
+					     PIPE_CONTROL_CS_STALL |
+					     PIPE_CONTROL_STALL_AT_SCOREBOARD,
+					     0);
+		if (ret)
+			return ret;
 	}
 
 	return gen8_emit_pipe_control(ringbuf, flags, scratch_addr);
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT)
  2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
                   ` (5 preceding siblings ...)
  2014-11-24 16:29 ` [PATCH 6/7] drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical ring Rodrigo Vivi
@ 2014-11-24 16:29 ` Rodrigo Vivi
  2014-11-25 12:43   ` [PATCH 7/7] drm/i915: Broaden application of shuang.he
  2014-11-25 13:05   ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Daniel Vetter
  6 siblings, 2 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-11-24 16:29 UTC (permalink / raw)
  To: intel-gfx; +Cc: Akash Goel, Rodrigo Vivi

From: Chris Wilson <chris@chris-wilson.co.uk>

Previously, this was restricted to only operate on bound objects - to
make pointer access through the GTT to the object coherent with writes
to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
which at present does not function unless the object also happens to
be bound into the GGTT (on current systems that is becoming increasingly
rare, especially for the typical requests from mesa). A third usecase is
a future patch wishing to extend the coverage of the GTT domain to
include objects not bound into the GGTT but still in its coherent cache
domain. For the latter pair of requests, we need to operate on the
object regardless of its bind state.

v2: After discussion with Akash, we came to the conclusion that the
get-pages was required in order for accurate domain tracking in the
corner cases (like the shrinker) and also useful for ensuring memory
coherency with earlier cached CPU mmaps in case userspace uses exotic
cache bypass (non-temporal) instructions.

v3: Fix the inactive object check.

Cc: Akash Goel <akash.goel@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akash Goel <akash.goel@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 43 ++++++++++++++++++-----------------------
 1 file changed, 19 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1d58bc5..1af042b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -153,12 +153,6 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 	return 0;
 }
 
-static inline bool
-i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
-{
-	return i915_gem_obj_bound_any(obj) && !obj->active;
-}
-
 int
 i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 			    struct drm_file *file)
@@ -1466,18 +1460,10 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 	if (ret)
 		goto unref;
 
-	if (read_domains & I915_GEM_DOMAIN_GTT) {
+	if (read_domains & I915_GEM_DOMAIN_GTT)
 		ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0);
-
-		/* Silently promote "you're not bound, there was nothing to do"
-		 * to success, since the client was just asking us to
-		 * make sure everything was done.
-		 */
-		if (ret == -EINVAL)
-			ret = 0;
-	} else {
+	else
 		ret = i915_gem_object_set_to_cpu_domain(obj, write_domain != 0);
-	}
 
 unref:
 	drm_gem_object_unreference(&obj->base);
@@ -3685,15 +3671,10 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
 int
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 {
-	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
-	struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
 	uint32_t old_write_domain, old_read_domains;
+	struct i915_vma *vma;
 	int ret;
 
-	/* Not valid to be called on unbound objects. */
-	if (vma == NULL)
-		return -EINVAL;
-
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
 		return 0;
 
@@ -3702,6 +3683,19 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 		return ret;
 
 	i915_gem_object_retire(obj);
+
+	/* Flush and acquire obj->pages so that we are coherent through
+	 * direct access in memory with previous cached writes through
+	 * shmemfs and that our cache domain tracking remains valid.
+	 * For example, if the obj->filp was moved to swap without us
+	 * being notified and releasing the pages, we would mistakenly
+	 * continue to assume that the obj remained out of the CPU cached
+	 * domain.
+	 */
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		return ret;
+
 	i915_gem_object_flush_cpu_write_domain(obj, false);
 
 	/* Serialise direct access to this object with the barriers for
@@ -3733,9 +3727,10 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 					    old_write_domain);
 
 	/* And bump the LRU for this access */
-	if (i915_gem_object_is_inactive(obj))
+	vma = i915_gem_obj_to_ggtt(obj);
+	if (vma && drm_mm_node_allocated(&vma->node) && !obj->active)
 		list_move_tail(&vma->mm_list,
-			       &dev_priv->gtt.base.inactive_list);
+			       &to_i915(obj->base.dev)->gtt.base.inactive_list);
 
 	return 0;
 }
-- 
1.9.3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/7] drm/i915: Broaden application of
  2014-11-24 16:29 ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Rodrigo Vivi
@ 2014-11-25 12:43   ` shuang.he
  2014-11-25 13:05   ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Daniel Vetter
  1 sibling, 0 replies; 18+ messages in thread
From: shuang.he @ 2014-11-25 12:43 UTC (permalink / raw)
  To: shuang.he, intel-gfx, rodrigo.vivi

Tested-By: PRC QA PRTS (Patch Regression Test System Contact: shuang.he@intel.com)
-------------------------------------Summary-------------------------------------
Platform          Delta          drm-intel-nightly          Series Applied
PNV                 -2              367/367              365/367
ILK                 -7              375/375              368/375
SNB                 -1              450/450              449/450
IVB                 -3              503/503              500/503
BYT                 -1              289/289              288/289
HSW                 -4              567/567              563/567
BDW                 -1              417/417              416/417
-------------------------------------Detailed-------------------------------------
Platform  Test                                drm-intel-nightly          Series Applied
*PNV  igt_gem_exec_big      PASS(2, M23M7)      FAIL(1, M7)
*PNV  igt_gem_exec_params_invalid-flag      PASS(2, M23M7)      FAIL(1, M7)
*ILK  igt_gem_exec_params_invalid-flag      PASS(2, M37)      FAIL(1, M37)
 ILK  igt_gem_reset_stats_close-pending-fork-render      TIMEOUT(14, M37M26)PASS(1, M26)      TIMEOUT(1, M37)
*ILK  igt_drv_suspend_debugfs-reader      PASS(3, M37)      DMESG_WARN(1, M37)
*ILK  igt_drv_suspend_fence-restore-tiled2untiled      PASS(4, M37)      DMESG_WARN(1, M37)
*ILK  igt_drv_suspend_fence-restore-untiled      PASS(4, M37)      DMESG_WARN(1, M37)
*ILK  igt_drv_suspend_forcewake      PASS(4, M37)      DMESG_WARN(1, M37)
 ILK  igt_kms_flip_vblank-vs-hang      TIMEOUT(14, M37M26)PASS(1, M26)      TIMEOUT(1, M37)
*SNB  igt_gem_exec_params_invalid-flag      PASS(2, M22)      FAIL(1, M22)
 IVB  igt_gem_bad_reloc_negative-reloc      NSPT(14, M34M21M4)PASS(1, M21)      NSPT(1, M21)
 IVB  igt_gem_bad_reloc_negative-reloc-lut      NSPT(3, M21M34M4)PASS(15, M21M34M4)      NSPT(1, M21)
*IVB  igt_gem_exec_params_invalid-flag      PASS(2, M21)      FAIL(1, M21)
*BYT  igt_gem_exec_params_invalid-flag      PASS(2, M36M31)      FAIL(1, M31)
 HSW  igt_gem_bad_reloc_negative-reloc-lut      NSPT(24, M40M20M19)PASS(1, M20)      NSPT(1, M40)
*HSW  igt_gem_exec_params_invalid-flag      PASS(2, M20M40)      FAIL(1, M40)
*HSW  igt_kms_rotation_crc_primary-rotation      PASS(23, M20M40M19)      DMESG_WARN(1, M40)
*HSW  igt_pm_rc6_residency_rc6-accuracy      PASS(25, M20M40M19)      FAIL(1, M40)
*BDW  igt_gem_exec_params_invalid-flag      PASS(2, M28)      FAIL(1, M28)
Note: You need to pay more attention to line start with '*'
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-11-24 16:29 ` [PATCH 1/7] drm/i915: Specify bsd rings through exec flag Rodrigo Vivi
@ 2014-11-25 13:04   ` Daniel Vetter
  2014-12-08 21:55     ` Rodrigo Vivi
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2014-11-25 13:04 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx

On Mon, Nov 24, 2014 at 08:29:40AM -0800, Rodrigo Vivi wrote:
> From: Zhipeng Gong <zhipeng.gong@intel.com>
> 
> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
> has no control when using VCS1 or VCS2. This patch introduces a mechanism
> to avoid the default ping-pong mode and use one specific ring through
> execution flag.
> 
> v2: fix whitespace (Rodrigo)
> v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
> 
> Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
> Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

There's review from me pending on testcases and stuff, but I get the
impression that's lost now. Is it?
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++++++++++++++--
>  include/uapi/drm/i915_drm.h                |  8 +++++++-
>  2 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index f06027b..2ef60c3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1317,8 +1317,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
>  		if (HAS_BSD2(dev)) {
>  			int ring_id;
> -			ring_id = gen8_dispatch_bsd_ring(dev, file);
> -			ring = &dev_priv->ring[ring_id];
> +
> +			switch (args->flags & I915_EXEC_BSD_MASK) {
> +			case I915_EXEC_BSD_DEFAULT:
> +				ring_id = gen8_dispatch_bsd_ring(dev, file);
> +				ring = &dev_priv->ring[ring_id];
> +				break;
> +			case I915_EXEC_BSD_RING1:
> +				ring = &dev_priv->ring[VCS];
> +				break;
> +			case I915_EXEC_BSD_RING2:
> +				ring = &dev_priv->ring[VCS2];
> +				break;
> +			default:
> +				DRM_DEBUG("execbuf with unknown bsd ring: %d\n",
> +					  (int)(args->flags & I915_EXEC_BSD_MASK));
> +				return -EINVAL;
> +			}
>  		} else
>  			ring = &dev_priv->ring[VCS];
>  	} else
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 2502622..fcb16bf 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -737,7 +737,13 @@ struct drm_i915_gem_execbuffer2 {
>   */
>  #define I915_EXEC_HANDLE_LUT		(1<<12)
>  
> -#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
> +/** Used for switching BSD rings on the platforms with two BSD rings */
> +#define I915_EXEC_BSD_MASK		(3<<13)
> +#define I915_EXEC_BSD_DEFAULT		(0<<13) /* default ping-pong mode */
> +#define I915_EXEC_BSD_RING1		(1<<13)
> +#define I915_EXEC_BSD_RING2		(2<<13)
> +
> +#define __I915_EXEC_UNKNOWN_FLAGS -(1<<15)
>  
>  #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
>  #define i915_execbuffer2_set_context_id(eb2, context) \
> -- 
> 1.9.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT)
  2014-11-24 16:29 ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Rodrigo Vivi
  2014-11-25 12:43   ` [PATCH 7/7] drm/i915: Broaden application of shuang.he
@ 2014-11-25 13:05   ` Daniel Vetter
  1 sibling, 0 replies; 18+ messages in thread
From: Daniel Vetter @ 2014-11-25 13:05 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Akash Goel

On Mon, Nov 24, 2014 at 08:29:46AM -0800, Rodrigo Vivi wrote:
> From: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Previously, this was restricted to only operate on bound objects - to
> make pointer access through the GTT to the object coherent with writes
> to and from the GPU. A second usecase is drm_intel_bo_wait_rendering()
> which at present does not function unless the object also happens to
> be bound into the GGTT (on current systems that is becoming increasingly
> rare, especially for the typical requests from mesa). A third usecase is
> a future patch wishing to extend the coverage of the GTT domain to
> include objects not bound into the GGTT but still in its coherent cache
> domain. For the latter pair of requests, we need to operate on the
> object regardless of its bind state.
> 
> v2: After discussion with Akash, we came to the conclusion that the
> get-pages was required in order for accurate domain tracking in the
> corner cases (like the shrinker) and also useful for ensuring memory
> coherency with earlier cached CPU mmaps in case userspace uses exotic
> cache bypass (non-temporal) instructions.
> 
> v3: Fix the inactive object check.
> 
> Cc: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Reviewed-by: Akash Goel <akash.goel@intel.com>
> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

This is part of the wc mmap series, you can probably drop this one here.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_gem.c | 43 ++++++++++++++++++-----------------------
>  1 file changed, 19 insertions(+), 24 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 1d58bc5..1af042b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -153,12 +153,6 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
>  	return 0;
>  }
>  
> -static inline bool
> -i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
> -{
> -	return i915_gem_obj_bound_any(obj) && !obj->active;
> -}
> -
>  int
>  i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
>  			    struct drm_file *file)
> @@ -1466,18 +1460,10 @@ i915_gem_set_domain_ioctl(struct drm_device *dev, void *data,
>  	if (ret)
>  		goto unref;
>  
> -	if (read_domains & I915_GEM_DOMAIN_GTT) {
> +	if (read_domains & I915_GEM_DOMAIN_GTT)
>  		ret = i915_gem_object_set_to_gtt_domain(obj, write_domain != 0);
> -
> -		/* Silently promote "you're not bound, there was nothing to do"
> -		 * to success, since the client was just asking us to
> -		 * make sure everything was done.
> -		 */
> -		if (ret == -EINVAL)
> -			ret = 0;
> -	} else {
> +	else
>  		ret = i915_gem_object_set_to_cpu_domain(obj, write_domain != 0);
> -	}
>  
>  unref:
>  	drm_gem_object_unreference(&obj->base);
> @@ -3685,15 +3671,10 @@ i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj,
>  int
>  i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  {
> -	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> -	struct i915_vma *vma = i915_gem_obj_to_ggtt(obj);
>  	uint32_t old_write_domain, old_read_domains;
> +	struct i915_vma *vma;
>  	int ret;
>  
> -	/* Not valid to be called on unbound objects. */
> -	if (vma == NULL)
> -		return -EINVAL;
> -
>  	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
>  		return 0;
>  
> @@ -3702,6 +3683,19 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  		return ret;
>  
>  	i915_gem_object_retire(obj);
> +
> +	/* Flush and acquire obj->pages so that we are coherent through
> +	 * direct access in memory with previous cached writes through
> +	 * shmemfs and that our cache domain tracking remains valid.
> +	 * For example, if the obj->filp was moved to swap without us
> +	 * being notified and releasing the pages, we would mistakenly
> +	 * continue to assume that the obj remained out of the CPU cached
> +	 * domain.
> +	 */
> +	ret = i915_gem_object_get_pages(obj);
> +	if (ret)
> +		return ret;
> +
>  	i915_gem_object_flush_cpu_write_domain(obj, false);
>  
>  	/* Serialise direct access to this object with the barriers for
> @@ -3733,9 +3727,10 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
>  					    old_write_domain);
>  
>  	/* And bump the LRU for this access */
> -	if (i915_gem_object_is_inactive(obj))
> +	vma = i915_gem_obj_to_ggtt(obj);
> +	if (vma && drm_mm_node_allocated(&vma->node) && !obj->active)
>  		list_move_tail(&vma->mm_list,
> -			       &dev_priv->gtt.base.inactive_list);
> +			       &to_i915(obj->base.dev)->gtt.base.inactive_list);
>  
>  	return 0;
>  }
> -- 
> 1.9.3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-11-25 13:04   ` Daniel Vetter
@ 2014-12-08 21:55     ` Rodrigo Vivi
  2014-12-09  9:46       ` Daniel Vetter
  0 siblings, 1 reply; 18+ messages in thread
From: Rodrigo Vivi @ 2014-12-08 21:55 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Rodrigo Vivi

On Tue, Nov 25, 2014 at 5:04 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Mon, Nov 24, 2014 at 08:29:40AM -0800, Rodrigo Vivi wrote:
>> From: Zhipeng Gong <zhipeng.gong@intel.com>
>>
>> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
>> has no control when using VCS1 or VCS2. This patch introduces a mechanism
>> to avoid the default ping-pong mode and use one specific ring through
>> execution flag.
>>
>> v2: fix whitespace (Rodrigo)
>> v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
>>
>> Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
>> Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
>
> There's review from me pending on testcases and stuff, but I get the
> impression that's lost now. Is it?

tests are ready as well, I've tested and reviewed them.
imho this is ready for merge. Anyway I'm going to submit again on next
-collector round

> -Daniel
>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 19 +++++++++++++++++--
>>  include/uapi/drm/i915_drm.h                |  8 +++++++-
>>  2 files changed, 24 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index f06027b..2ef60c3 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1317,8 +1317,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
>>               if (HAS_BSD2(dev)) {
>>                       int ring_id;
>> -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
>> -                     ring = &dev_priv->ring[ring_id];
>> +
>> +                     switch (args->flags & I915_EXEC_BSD_MASK) {
>> +                     case I915_EXEC_BSD_DEFAULT:
>> +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
>> +                             ring = &dev_priv->ring[ring_id];
>> +                             break;
>> +                     case I915_EXEC_BSD_RING1:
>> +                             ring = &dev_priv->ring[VCS];
>> +                             break;
>> +                     case I915_EXEC_BSD_RING2:
>> +                             ring = &dev_priv->ring[VCS2];
>> +                             break;
>> +                     default:
>> +                             DRM_DEBUG("execbuf with unknown bsd ring: %d\n",
>> +                                       (int)(args->flags & I915_EXEC_BSD_MASK));
>> +                             return -EINVAL;
>> +                     }
>>               } else
>>                       ring = &dev_priv->ring[VCS];
>>       } else
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 2502622..fcb16bf 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -737,7 +737,13 @@ struct drm_i915_gem_execbuffer2 {
>>   */
>>  #define I915_EXEC_HANDLE_LUT         (1<<12)
>>
>> -#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_HANDLE_LUT<<1)
>> +/** Used for switching BSD rings on the platforms with two BSD rings */
>> +#define I915_EXEC_BSD_MASK           (3<<13)
>> +#define I915_EXEC_BSD_DEFAULT                (0<<13) /* default ping-pong mode */
>> +#define I915_EXEC_BSD_RING1          (1<<13)
>> +#define I915_EXEC_BSD_RING2          (2<<13)
>> +
>> +#define __I915_EXEC_UNKNOWN_FLAGS -(1<<15)
>>
>>  #define I915_EXEC_CONTEXT_ID_MASK    (0xffffffff)
>>  #define i915_execbuffer2_set_context_id(eb2, context) \
>> --
>> 1.9.3
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-08 21:55     ` Rodrigo Vivi
@ 2014-12-09  9:46       ` Daniel Vetter
  2014-12-10  2:18         ` Gong, Zhipeng
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2014-12-09  9:46 UTC (permalink / raw)
  To: Rodrigo Vivi; +Cc: intel-gfx, Rodrigo Vivi

On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:
> On Tue, Nov 25, 2014 at 5:04 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Mon, Nov 24, 2014 at 08:29:40AM -0800, Rodrigo Vivi wrote:
> >> From: Zhipeng Gong <zhipeng.gong@intel.com>
> >>
> >> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
> >> has no control when using VCS1 or VCS2. This patch introduces a mechanism
> >> to avoid the default ping-pong mode and use one specific ring through
> >> execution flag.
> >>
> >> v2: fix whitespace (Rodrigo)
> >> v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
> >>
> >> Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
> >> Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> >
> > There's review from me pending on testcases and stuff, but I get the
> > impression that's lost now. Is it?
> 
> tests are ready as well, I've tested and reviewed them.
> imho this is ready for merge. Anyway I'm going to submit again on next
> -collector round

Last time I've looked it doesn't really address my review. Quoting
relevant parts:

> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index e1ed85a..d9081ec 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
>               if (HAS_BSD2(dev)) {
>                       int ring_id;
> -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
> -                     ring = &dev_priv->ring[ring_id];
> +
> +                     switch (args->flags & I915_EXEC_BSD_MASK) {
> +                     case I915_EXEC_BSD_DEFAULT:
> +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
> +                             ring = &dev_priv->ring[ring_id];
> +                             break;
> +                     case I915_EXEC_BSD_RING1:
> +                             ring = &dev_priv->ring[VCS];

Do we have any use-case for selecting ring1 specifically? I've thought
it's only ring2 that is special?

> +                             break;
> +                     case I915_EXEC_BSD_RING2:

This needs a if (!IS_SKL(dev) return -EINVAL; check since the flag isnt
valid anywhere else. Also you must add code to reject these flags if
userspace selects a ring other than bsd.

And all these new -EINVAL cases need new subtests to validate them in
gem_exec_params.c.

And I might have missed some case, ioctl validation is hard ;-) So please
double-check that really no insane combination that userspace might try to
abuse is caught and has a testcase in gem_exec_params.

/endquote

The testcase also doesn't check for these nasty condiditions (it would
fail with the current patch).
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-09  9:46       ` Daniel Vetter
@ 2014-12-10  2:18         ` Gong, Zhipeng
  2014-12-10  9:11           ` Daniel Vetter
  0 siblings, 1 reply; 18+ messages in thread
From: Gong, Zhipeng @ 2014-12-10  2:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx, Vivi, Rodrigo

On Tue, 2014-12-09 at 10:46 +0100, Daniel Vetter wrote:
> On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:
> > On Tue, Nov 25, 2014 at 5:04 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > On Mon, Nov 24, 2014 at 08:29:40AM -0800, Rodrigo Vivi wrote:
> > >> From: Zhipeng Gong <zhipeng.gong@intel.com>
> > >>
> > >> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
> > >> has no control when using VCS1 or VCS2. This patch introduces a mechanism
> > >> to avoid the default ping-pong mode and use one specific ring through
> > >> execution flag.
> > >>
> > >> v2: fix whitespace (Rodrigo)
> > >> v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
> > >>
> > >> Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
> > >> Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > >
> > > There's review from me pending on testcases and stuff, but I get the
> > > impression that's lost now. Is it?
> > 
> > tests are ready as well, I've tested and reviewed them.
> > imho this is ready for merge. Anyway I'm going to submit again on next
> > -collector round
> 
> Last time I've looked it doesn't really address my review. Quoting
> relevant parts:
> 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index e1ed85a..d9081ec 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
> >               if (HAS_BSD2(dev)) {
> >                       int ring_id;
> > -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
> > -                     ring = &dev_priv->ring[ring_id];
> > +
> > +                     switch (args->flags & I915_EXEC_BSD_MASK) {
> > +                     case I915_EXEC_BSD_DEFAULT:
> > +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
> > +                             ring = &dev_priv->ring[ring_id];
> > +                             break;
> > +                     case I915_EXEC_BSD_RING1:
> > +                             ring = &dev_priv->ring[VCS];
> 
> Do we have any use-case for selecting ring1 specifically? I've thought
> it's only ring2 that is special?
The HEVC GPU commands should be dispatched to BSD RING 1 instead of BSD
RING2 as the two rings are asymmetrical. 
For the H264 decoding/encoding either ring is OK.
> 
> > +                             break;
> > +                     case I915_EXEC_BSD_RING2:
> 
> This needs a if (!IS_SKL(dev) return -EINVAL; check since the flag isnt
> valid anywhere else. 
The override flag is required for the asymmetrical BSD rings in SKL. But
it will be reasonable that user-space driver should have an opportunity
to determine which ring is to dispatch the BSD GPU command even though
the two rings are symmetrical, like BDW.
I will send out a new version of the patch to add more comments.
> Also you must add code to reject these flags if
> userspace selects a ring other than bsd.
"if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) " condition
checks already make sure that userspace would not select a ring other
than bsd.
> 
> And all these new -EINVAL cases need new subtests to validate them in
> gem_exec_params.c.
> 
> And I might have missed some case, ioctl validation is hard ;-) So please
> double-check that really no insane combination that userspace might try to
> abuse is caught and has a testcase in gem_exec_params.
> 
> /endquote
> 
> The testcase also doesn't check for these nasty condiditions (it would
> fail with the current patch).
Could you please give examples for "nasty conditions", I will add them.
> -Daniel

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-10  2:18         ` Gong, Zhipeng
@ 2014-12-10  9:11           ` Daniel Vetter
  2014-12-10 15:55             ` Dave Gordon
  0 siblings, 1 reply; 18+ messages in thread
From: Daniel Vetter @ 2014-12-10  9:11 UTC (permalink / raw)
  To: Gong, Zhipeng; +Cc: intel-gfx, Vivi, Rodrigo

On Wed, Dec 10, 2014 at 02:18:15AM +0000, Gong, Zhipeng wrote:
> On Tue, 2014-12-09 at 10:46 +0100, Daniel Vetter wrote:
> > On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:
> > > On Tue, Nov 25, 2014 at 5:04 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > On Mon, Nov 24, 2014 at 08:29:40AM -0800, Rodrigo Vivi wrote:
> > > >> From: Zhipeng Gong <zhipeng.gong@intel.com>
> > > >>
> > > >> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
> > > >> has no control when using VCS1 or VCS2. This patch introduces a mechanism
> > > >> to avoid the default ping-pong mode and use one specific ring through
> > > >> execution flag.
> > > >>
> > > >> v2: fix whitespace (Rodrigo)
> > > >> v3: remove incorrect chunk that came on -collector rebase. (Rodrigo)
> > > >>
> > > >> Signed-off-by: Zhipeng Gong <zhipeng.gong@intel.com>
> > > >> Reviewed-by-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > >> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
> > > >
> > > > There's review from me pending on testcases and stuff, but I get the
> > > > impression that's lost now. Is it?
> > > 
> > > tests are ready as well, I've tested and reviewed them.
> > > imho this is ready for merge. Anyway I'm going to submit again on next
> > > -collector round
> > 
> > Last time I've looked it doesn't really address my review. Quoting
> > relevant parts:
> > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index e1ed85a..d9081ec 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> > >       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
> > >               if (HAS_BSD2(dev)) {
> > >                       int ring_id;
> > > -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
> > > -                     ring = &dev_priv->ring[ring_id];
> > > +
> > > +                     switch (args->flags & I915_EXEC_BSD_MASK) {
> > > +                     case I915_EXEC_BSD_DEFAULT:
> > > +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
> > > +                             ring = &dev_priv->ring[ring_id];
> > > +                             break;
> > > +                     case I915_EXEC_BSD_RING1:
> > > +                             ring = &dev_priv->ring[VCS];
> > 
> > Do we have any use-case for selecting ring1 specifically? I've thought
> > it's only ring2 that is special?
> The HEVC GPU commands should be dispatched to BSD RING 1 instead of BSD
> RING2 as the two rings are asymmetrical. 
> For the H264 decoding/encoding either ring is OK.

Well then same arguments applies with ring2 since only ring1 is special?
It's just to minimize abi and reduce the amount of rope we hand to
userspace.

> > > +                             break;
> > > +                     case I915_EXEC_BSD_RING2:
> > 
> > This needs a if (!IS_SKL(dev) return -EINVAL; check since the flag isnt
> > valid anywhere else. 
> The override flag is required for the asymmetrical BSD rings in SKL. But
> it will be reasonable that user-space driver should have an opportunity
> to determine which ring is to dispatch the BSD GPU command even though
> the two rings are symmetrical, like BDW.
> I will send out a new version of the patch to add more comments.

Well, I need a real user for this and it needs to be opensource. Given
that the real user on bdw currently is closed source this is a bit tricky
and one of these things to send Dave Airlie into ballistic modes. So I
don't want to try even.

> > Also you must add code to reject these flags if
> > userspace selects a ring other than bsd.
> "if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) " condition
> checks already make sure that userspace would not select a ring other
> than bsd.

Yeah, but userspace could submit e.g. EXEC_BLITTER | EXEC_BSD_RING1 which
doesn't make sense and must be rejected.

> > And all these new -EINVAL cases need new subtests to validate them in
> > gem_exec_params.c.
> > 
> > And I might have missed some case, ioctl validation is hard ;-) So please
> > double-check that really no insane combination that userspace might try to
> > abuse is caught and has a testcase in gem_exec_params.
> > 
> > /endquote
> > 
> > The testcase also doesn't check for these nasty condiditions (it would
> > fail with the current patch).
> Could you please give examples for "nasty conditions", I will add them.

Two things essentially:
- all flag combinations that don't make sense _must_ be rejected with
  -EINVAL.
- you must have an igt testcase for each such case

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-10  9:11           ` Daniel Vetter
@ 2014-12-10 15:55             ` Dave Gordon
  2014-12-11  3:50               ` Zhao, Yakui
  0 siblings, 1 reply; 18+ messages in thread
From: Dave Gordon @ 2014-12-10 15:55 UTC (permalink / raw)
  To: Daniel Vetter, Gong, Zhipeng; +Cc: intel-gfx, Vivi, Rodrigo

On 10/12/14 09:11, Daniel Vetter wrote:
> On Wed, Dec 10, 2014 at 02:18:15AM +0000, Gong, Zhipeng wrote:
>> On Tue, 2014-12-09 at 10:46 +0100, Daniel Vetter wrote:
>>> On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:

[snip]

>>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> index e1ed85a..d9081ec 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>>> @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>>>       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
>>>>               if (HAS_BSD2(dev)) {
>>>>                       int ring_id;
>>>> -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
>>>> -                     ring = &dev_priv->ring[ring_id];
>>>> +
>>>> +                     switch (args->flags & I915_EXEC_BSD_MASK) {
>>>> +                     case I915_EXEC_BSD_DEFAULT:
>>>> +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
>>>> +                             ring = &dev_priv->ring[ring_id];
>>>> +                             break;
>>>> +                     case I915_EXEC_BSD_RING1:
>>>> +                             ring = &dev_priv->ring[VCS];
>>>
>>> Do we have any use-case for selecting ring1 specifically? I've thought
>>> it's only ring2 that is special?
>> The HEVC GPU commands should be dispatched to BSD RING 1 instead of BSD
>> RING2 as the two rings are asymmetrical. 
>> For the H264 decoding/encoding either ring is OK.
> 
> Well then same arguments applies with ring2 since only ring1 is special?
> It's just to minimize abi and reduce the amount of rope we hand to
> userspace.

Anyone who knows to use any of these flags is taking responsibility for
doing explicit engine allocation, so why not give them all the options
-- if for no other reason, more symmetry is good.

As an examle, there could be a case where userspace knows better than
the kernel how long each batch will take, and can predict an optimal
allocation pattern rather than just flip-flopping. So even when a batch
*can* run on either engine, there might be a reason to pick a specific one.

e.g.	short-1 -> ring 1
	short-2 -> ring 1
	long-1  -> ring 2
	short-3 -> ring 1
	long-2  -> ring 1

because the program knows that the three short batches together will
take less time than the one first long one.

.Dave.
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-10 15:55             ` Dave Gordon
@ 2014-12-11  3:50               ` Zhao, Yakui
  2014-12-24 21:32                 ` Rodrigo Vivi
  0 siblings, 1 reply; 18+ messages in thread
From: Zhao, Yakui @ 2014-12-11  3:50 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx, Vivi, Rodrigo

On Wed, 2014-12-10 at 08:55 -0700, Dave Gordon wrote:
> On 10/12/14 09:11, Daniel Vetter wrote:
> > On Wed, Dec 10, 2014 at 02:18:15AM +0000, Gong, Zhipeng wrote:
> >> On Tue, 2014-12-09 at 10:46 +0100, Daniel Vetter wrote:
> >>> On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:
> 
> [snip]
> 
> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>> index e1ed85a..d9081ec 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>> @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
> >>>>       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
> >>>>               if (HAS_BSD2(dev)) {
> >>>>                       int ring_id;
> >>>> -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
> >>>> -                     ring = &dev_priv->ring[ring_id];
> >>>> +
> >>>> +                     switch (args->flags & I915_EXEC_BSD_MASK) {
> >>>> +                     case I915_EXEC_BSD_DEFAULT:
> >>>> +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
> >>>> +                             ring = &dev_priv->ring[ring_id];
> >>>> +                             break;
> >>>> +                     case I915_EXEC_BSD_RING1:
> >>>> +                             ring = &dev_priv->ring[VCS];
> >>>
> >>> Do we have any use-case for selecting ring1 specifically? I've thought
> >>> it's only ring2 that is special?
> >> The HEVC GPU commands should be dispatched to BSD RING 1 instead of BSD
> >> RING2 as the two rings are asymmetrical. 
> >> For the H264 decoding/encoding either ring is OK.
> > 
> > Well then same arguments applies with ring2 since only ring1 is special?
> > It's just to minimize abi and reduce the amount of rope we hand to
> > userspace.
> 
> Anyone who knows to use any of these flags is taking responsibility for
> doing explicit engine allocation, so why not give them all the options
> -- if for no other reason, more symmetry is good.

Agree with Dave's point. The override flag is initiated by the SKL GT3
platform, which requires that the HEVC GPU command can only be
dispatched to the BSD ring1 explicitly as the two BSD rings are not
symmetric. And the override flag can also provide the user-space
app/driver with more flexibility to explicitly determine which BSD ring
should be used to dispatch video GPU command instead of kernel ping-pong
mode. And it benefits the platform with two BSD rings.

> 
> As an examle, there could be a case where userspace knows better than
> the kernel how long each batch will take, and can predict an optimal
> allocation pattern rather than just flip-flopping. So even when a batch
> *can* run on either engine, there might be a reason to pick a specific one.
> 
> e.g.	short-1 -> ring 1
> 	short-2 -> ring 1
> 	long-1  -> ring 2
> 	short-3 -> ring 1
> 	long-2  -> ring 1
> 
> because the program knows that the three short batches together will
> take less time than the one first long one.
> 
> .Dave.
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/7] drm/i915: Specify bsd rings through exec flag
  2014-12-11  3:50               ` Zhao, Yakui
@ 2014-12-24 21:32                 ` Rodrigo Vivi
  0 siblings, 0 replies; 18+ messages in thread
From: Rodrigo Vivi @ 2014-12-24 21:32 UTC (permalink / raw)
  To: Zhao, Yakui; +Cc: intel-gfx, Vivi, Rodrigo

to be honest I got lost on the discussion here.

I like the simplicity Zhipeng used so I'm keeping it on collector and
still +1 for the merge up.

We can improve later and provide ratio usage for userspace to take
care of balance work, but I don't see why to block progress here.

On Wed, Dec 10, 2014 at 7:50 PM, Zhao, Yakui <yakui.zhao@intel.com> wrote:
> On Wed, 2014-12-10 at 08:55 -0700, Dave Gordon wrote:
>> On 10/12/14 09:11, Daniel Vetter wrote:
>> > On Wed, Dec 10, 2014 at 02:18:15AM +0000, Gong, Zhipeng wrote:
>> >> On Tue, 2014-12-09 at 10:46 +0100, Daniel Vetter wrote:
>> >>> On Mon, Dec 08, 2014 at 01:55:56PM -0800, Rodrigo Vivi wrote:
>>
>> [snip]
>>
>> >>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> >>>> index e1ed85a..d9081ec 100644
>> >>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> >>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> >>>> @@ -1273,8 +1273,23 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>> >>>>       else if ((args->flags & I915_EXEC_RING_MASK) == I915_EXEC_BSD) {
>> >>>>               if (HAS_BSD2(dev)) {
>> >>>>                       int ring_id;
>> >>>> -                     ring_id = gen8_dispatch_bsd_ring(dev, file);
>> >>>> -                     ring = &dev_priv->ring[ring_id];
>> >>>> +
>> >>>> +                     switch (args->flags & I915_EXEC_BSD_MASK) {
>> >>>> +                     case I915_EXEC_BSD_DEFAULT:
>> >>>> +                             ring_id = gen8_dispatch_bsd_ring(dev, file);
>> >>>> +                             ring = &dev_priv->ring[ring_id];
>> >>>> +                             break;
>> >>>> +                     case I915_EXEC_BSD_RING1:
>> >>>> +                             ring = &dev_priv->ring[VCS];
>> >>>
>> >>> Do we have any use-case for selecting ring1 specifically? I've thought
>> >>> it's only ring2 that is special?
>> >> The HEVC GPU commands should be dispatched to BSD RING 1 instead of BSD
>> >> RING2 as the two rings are asymmetrical.
>> >> For the H264 decoding/encoding either ring is OK.
>> >
>> > Well then same arguments applies with ring2 since only ring1 is special?
>> > It's just to minimize abi and reduce the amount of rope we hand to
>> > userspace.
>>
>> Anyone who knows to use any of these flags is taking responsibility for
>> doing explicit engine allocation, so why not give them all the options
>> -- if for no other reason, more symmetry is good.
>
> Agree with Dave's point. The override flag is initiated by the SKL GT3
> platform, which requires that the HEVC GPU command can only be
> dispatched to the BSD ring1 explicitly as the two BSD rings are not
> symmetric. And the override flag can also provide the user-space
> app/driver with more flexibility to explicitly determine which BSD ring
> should be used to dispatch video GPU command instead of kernel ping-pong
> mode. And it benefits the platform with two BSD rings.
>
>>
>> As an examle, there could be a case where userspace knows better than
>> the kernel how long each batch will take, and can predict an optimal
>> allocation pattern rather than just flip-flopping. So even when a batch
>> *can* run on either engine, there might be a reason to pick a specific one.
>>
>> e.g.  short-1 -> ring 1
>>       short-2 -> ring 1
>>       long-1  -> ring 2
>>       short-3 -> ring 1
>>       long-2  -> ring 1
>>
>> because the program knows that the three short batches together will
>> take less time than the one first long one.
>>
>> .Dave.
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx



-- 
Rodrigo Vivi
Blog: http://blog.vivi.eng.br
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2014-12-24 21:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-11-24 16:29 [PATCH 0/7] drm-intel-collector - update Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 1/7] drm/i915: Specify bsd rings through exec flag Rodrigo Vivi
2014-11-25 13:04   ` Daniel Vetter
2014-12-08 21:55     ` Rodrigo Vivi
2014-12-09  9:46       ` Daniel Vetter
2014-12-10  2:18         ` Gong, Zhipeng
2014-12-10  9:11           ` Daniel Vetter
2014-12-10 15:55             ` Dave Gordon
2014-12-11  3:50               ` Zhao, Yakui
2014-12-24 21:32                 ` Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 2/7] drm/i915: add I915_PARAM_HAS_BSD2 to i915_getparam Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 3/7] drm/i915: Move the ban period onto the context Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 4/7] drm/i915: Add ioctl to set per-context parameters Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 5/7] drm/i915: Put logical pipe_control emission into a helper Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 6/7] drm/i915: Add WaCsStallBeforeStateCacheInvalidate:bdw, chv to logical ring Rodrigo Vivi
2014-11-24 16:29 ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Rodrigo Vivi
2014-11-25 12:43   ` [PATCH 7/7] drm/i915: Broaden application of shuang.he
2014-11-25 13:05   ` [PATCH 7/7] drm/i915: Broaden application of set-domain(GTT) Daniel Vetter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox