* [PATCH 0/3] execbuf cleanups
@ 2011-10-11 18:31 Ben Widawsky
2011-10-11 18:31 ` [PATCH 1/3] drm/i915: extract constant offset setting Ben Widawsky
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 18:31 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
The first two patches are a prelude to some more heavy duty work I am
doing on the execbuffer path. I think these are pretty benign, and
helpful to me so I wanted to get these out now.
The third patch is something Ken found recently which intermixes with
this cleanup, so I figured I'd put that in here as well.
Ben Widawsky (3):
drm/i915: constant offset setting
drm/i915: make eb structure do more
drm/i915: Force sync command ordering
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 186 +++++++++++++++-------------
drivers/gpu/drm/i915/i915_reg.h | 1 +
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 +
3 files changed, 105 insertions(+), 84 deletions(-)
--
1.7.7
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] drm/i915: extract constant offset setting
2011-10-11 18:31 [PATCH 0/3] execbuf cleanups Ben Widawsky
@ 2011-10-11 18:31 ` Ben Widawsky
2011-10-11 19:09 ` Chris Wilson
2011-10-11 18:31 ` [PATCH 2/3] drm/i915: make eb structure do more Ben Widawsky
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
2 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 18:31 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 74 ++++++++++++++++------------
1 files changed, 43 insertions(+), 31 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 89f7429..5a16f22 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -952,6 +952,46 @@ i915_gem_execbuffer_retire_commands(struct drm_device *dev,
kfree(request);
}
}
+static int
+i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
+{
+ struct drm_device *dev = ring->dev;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ int ret;
+
+ switch (mode) {
+ case I915_EXEC_CONSTANTS_REL_GENERAL:
+ case I915_EXEC_CONSTANTS_ABSOLUTE:
+ case I915_EXEC_CONSTANTS_REL_SURFACE:
+ if (ring == &dev_priv->ring[RCS] &&
+ mode != dev_priv->relative_constants_mode) {
+ if (INTEL_INFO(dev)->gen < 4)
+ return -EINVAL;
+
+ if (INTEL_INFO(dev)->gen > 5 &&
+ mode == I915_EXEC_CONSTANTS_REL_SURFACE)
+ return -EINVAL;
+
+ ret = intel_ring_begin(ring, 4);
+ if (ret)
+ return ret;
+
+ intel_ring_emit(ring, MI_NOOP);
+ intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+ intel_ring_emit(ring, INSTPM);
+ intel_ring_emit(ring,
+ I915_EXEC_CONSTANTS_MASK << 16 | mode);
+ intel_ring_advance(ring);
+
+ dev_priv->relative_constants_mode = mode;
+ }
+ return 0;
+ default:
+ DRM_ERROR("execbuf with unknown constants: %d\n", mode);
+ return -EINVAL;
+ }
+}
+
static int
i915_gem_do_execbuffer(struct drm_device *dev, void *data,
@@ -1005,37 +1045,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
}
mode = args->flags & I915_EXEC_CONSTANTS_MASK;
- switch (mode) {
- case I915_EXEC_CONSTANTS_REL_GENERAL:
- case I915_EXEC_CONSTANTS_ABSOLUTE:
- case I915_EXEC_CONSTANTS_REL_SURFACE:
- if (ring == &dev_priv->ring[RCS] &&
- mode != dev_priv->relative_constants_mode) {
- if (INTEL_INFO(dev)->gen < 4)
- return -EINVAL;
-
- if (INTEL_INFO(dev)->gen > 5 &&
- mode == I915_EXEC_CONSTANTS_REL_SURFACE)
- return -EINVAL;
-
- ret = intel_ring_begin(ring, 4);
- if (ret)
- return ret;
-
- intel_ring_emit(ring, MI_NOOP);
- intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
- intel_ring_emit(ring, INSTPM);
- intel_ring_emit(ring,
- I915_EXEC_CONSTANTS_MASK << 16 | mode);
- intel_ring_advance(ring);
-
- dev_priv->relative_constants_mode = mode;
- }
- break;
- default:
- DRM_ERROR("execbuf with unknown constants: %d\n", mode);
- return -EINVAL;
- }
+ ret = i915_set_constant_offset(ring, mode);
+ if (ret)
+ return ret;
if (args->buffer_count < 1) {
DRM_ERROR("execbuf with %d buffers\n", args->buffer_count);
--
1.7.7
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] drm/i915: make eb structure do more
2011-10-11 18:31 [PATCH 0/3] execbuf cleanups Ben Widawsky
2011-10-11 18:31 ` [PATCH 1/3] drm/i915: extract constant offset setting Ben Widawsky
@ 2011-10-11 18:31 ` Ben Widawsky
2011-10-11 19:11 ` Chris Wilson
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
2 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 18:31 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
clean up some code using the existing eb structure.
There is a tiny hunk here related to setting eb->cliprects which is not
really useful yet, but will be in the future.
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 112 ++++++++++++++--------------
1 files changed, 57 insertions(+), 55 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 5a16f22..9572e52 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -40,6 +40,17 @@ struct change_domains {
uint32_t flips;
};
+struct eb_objects {
+ struct drm_i915_gem_object *batch_obj;
+ struct list_head objects;
+ int buffer_count;
+ struct drm_clip_rect *cliprects;
+ int mode;
+ int and;
+ struct hlist_head buckets[0];
+ /* DO NOT PUT ANYTHING HERE */
+};
+
/*
* Set the next domain for the specified object. This
* may not actually perform the necessary flushing/invaliding though,
@@ -207,11 +218,6 @@ i915_gem_object_set_to_gpu_domain(struct drm_i915_gem_object *obj,
cd->flush_rings |= ring->id;
}
-struct eb_objects {
- int and;
- struct hlist_head buckets[0];
-};
-
static struct eb_objects *
eb_create(int size)
{
@@ -225,6 +231,8 @@ eb_create(int size)
if (eb == NULL)
return eb;
+ INIT_LIST_HEAD(&eb->objects);
+
eb->and = count - 1;
return eb;
}
@@ -232,12 +240,22 @@ eb_create(int size)
static void
eb_reset(struct eb_objects *eb)
{
+ while (!list_empty(&eb->objects)) {
+ struct drm_i915_gem_object *obj;
+
+ obj = list_first_entry(&eb->objects,
+ struct drm_i915_gem_object,
+ exec_list);
+ list_del_init(&obj->exec_list);
+ drm_gem_object_unreference(&obj->base);
+ }
memset(eb->buckets, 0, (eb->and+1)*sizeof(struct hlist_head));
}
static void
eb_add_object(struct eb_objects *eb, struct drm_i915_gem_object *obj)
{
+ list_add_tail(&obj->exec_list, &eb->objects);
hlist_add_head(&obj->exec_node,
&eb->buckets[obj->exec_handle & eb->and]);
}
@@ -262,6 +280,16 @@ eb_get_object(struct eb_objects *eb, unsigned long handle)
static void
eb_destroy(struct eb_objects *eb)
{
+ while (!list_empty(&eb->objects)) {
+ struct drm_i915_gem_object *obj;
+
+ obj = list_first_entry(&eb->objects,
+ struct drm_i915_gem_object,
+ exec_list);
+ list_del_init(&obj->exec_list);
+ drm_gem_object_unreference(&obj->base);
+ }
+
kfree(eb);
}
@@ -436,8 +464,7 @@ i915_gem_execbuffer_relocate_object_slow(struct drm_i915_gem_object *obj,
static int
i915_gem_execbuffer_relocate(struct drm_device *dev,
- struct eb_objects *eb,
- struct list_head *objects)
+ struct eb_objects *eb)
{
struct drm_i915_gem_object *obj;
int ret = 0;
@@ -450,7 +477,7 @@ i915_gem_execbuffer_relocate(struct drm_device *dev,
* lockdep complains vehemently.
*/
pagefault_disable();
- list_for_each_entry(obj, objects, exec_list) {
+ list_for_each_entry(obj, &eb->objects, exec_list) {
ret = i915_gem_execbuffer_relocate_object(obj, eb);
if (ret)
break;
@@ -618,24 +645,18 @@ static int
i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
struct drm_file *file,
struct intel_ring_buffer *ring,
- struct list_head *objects,
struct eb_objects *eb,
struct drm_i915_gem_exec_object2 *exec,
int count)
{
struct drm_i915_gem_relocation_entry *reloc;
struct drm_i915_gem_object *obj;
+ struct list_head *objects = &eb->objects;
int *reloc_offset;
int i, total, ret;
/* We may process another execbuffer during the unlock... */
- while (!list_empty(objects)) {
- obj = list_first_entry(objects,
- struct drm_i915_gem_object,
- exec_list);
- list_del_init(&obj->exec_list);
- drm_gem_object_unreference(&obj->base);
- }
+ eb_reset(eb);
mutex_unlock(&dev->struct_mutex);
@@ -676,7 +697,6 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
}
/* reacquire the objects */
- eb_reset(eb);
for (i = 0; i < count; i++) {
obj = to_intel_bo(drm_gem_object_lookup(dev, file,
exec[i].handle));
@@ -687,7 +707,6 @@ i915_gem_execbuffer_relocate_slow(struct drm_device *dev,
goto err;
}
- list_add_tail(&obj->exec_list, objects);
obj->exec_handle = exec[i].handle;
obj->exec_entry = &exec[i];
eb_add_object(eb, obj);
@@ -1000,15 +1019,13 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
struct drm_i915_gem_exec_object2 *exec)
{
drm_i915_private_t *dev_priv = dev->dev_private;
- struct list_head objects;
- struct eb_objects *eb;
- struct drm_i915_gem_object *batch_obj;
- struct drm_i915_gem_object *obj;
+ struct eb_objects *eb = NULL;
+ struct drm_i915_gem_object *obj = NULL;
struct drm_clip_rect *cliprects = NULL;
struct intel_ring_buffer *ring;
u32 exec_start, exec_len;
u32 seqno;
- int ret, mode, i;
+ int ret, i;
if (!i915_gem_check_execbuffer(args)) {
DRM_ERROR("execbuf with invalid offset/length\n");
@@ -1044,8 +1061,12 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
return -EINVAL;
}
- mode = args->flags & I915_EXEC_CONSTANTS_MASK;
- ret = i915_set_constant_offset(ring, mode);
+ eb = eb_create(args->buffer_count);
+ if (eb == NULL)
+ return -ENOMEM;
+
+ eb->mode = args->flags & I915_EXEC_CONSTANTS_MASK;
+ ret = i915_set_constant_offset(ring, eb->mode);
if (ret)
return ret;
@@ -1066,6 +1087,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
ret = -ENOMEM;
goto pre_mutex_err;
}
+ eb->cliprects = cliprects;
if (copy_from_user(cliprects,
(struct drm_clip_rect __user *)(uintptr_t)
@@ -1086,15 +1108,6 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
goto pre_mutex_err;
}
- eb = eb_create(args->buffer_count);
- if (eb == NULL) {
- mutex_unlock(&dev->struct_mutex);
- ret = -ENOMEM;
- goto pre_mutex_err;
- }
-
- /* Look up object handles */
- INIT_LIST_HEAD(&objects);
for (i = 0; i < args->buffer_count; i++) {
obj = to_intel_bo(drm_gem_object_lookup(dev, file,
@@ -1114,26 +1127,25 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
goto err;
}
- list_add_tail(&obj->exec_list, &objects);
obj->exec_handle = exec[i].handle;
obj->exec_entry = &exec[i];
eb_add_object(eb, obj);
}
- /* The last object is the batch */
- batch_obj = obj;
+ /* The last object is the batch object */
+ eb->batch_obj = obj;
/* Move the objects en-masse into the GTT, evicting if necessary. */
- ret = i915_gem_execbuffer_reserve(ring, file, &objects);
+ ret = i915_gem_execbuffer_reserve(ring, file, &eb->objects);
if (ret)
goto err;
/* The objects are in their final locations, apply the relocations. */
- ret = i915_gem_execbuffer_relocate(dev, eb, &objects);
+ ret = i915_gem_execbuffer_relocate(dev, eb);
if (ret) {
if (ret == -EFAULT) {
ret = i915_gem_execbuffer_relocate_slow(dev, file, ring,
- &objects, eb,
+ eb,
exec,
args->buffer_count);
BUG_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -1143,14 +1155,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
}
/* Set the pending read domains for the batch buffer to COMMAND */
- if (batch_obj->base.pending_write_domain) {
+ if (eb->batch_obj->base.pending_write_domain) {
DRM_ERROR("Attempting to use self-modifying batch buffer\n");
ret = -EINVAL;
goto err;
}
- batch_obj->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND;
+ eb->batch_obj->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND;
- ret = i915_gem_execbuffer_move_to_gpu(ring, &objects);
+ ret = i915_gem_execbuffer_move_to_gpu(ring, &eb->objects);
if (ret)
goto err;
@@ -1171,7 +1183,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
trace_i915_gem_ring_dispatch(ring, seqno);
- exec_start = batch_obj->gtt_offset + args->batch_start_offset;
+ exec_start = eb->batch_obj->gtt_offset + args->batch_start_offset;
exec_len = args->batch_len;
if (cliprects) {
for (i = 0; i < args->num_cliprects; i++) {
@@ -1191,21 +1203,11 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
goto err;
}
- i915_gem_execbuffer_move_to_active(&objects, ring, seqno);
+ i915_gem_execbuffer_move_to_active(&eb->objects, ring, seqno);
i915_gem_execbuffer_retire_commands(dev, file, ring);
err:
eb_destroy(eb);
- while (!list_empty(&objects)) {
- struct drm_i915_gem_object *obj;
-
- obj = list_first_entry(&objects,
- struct drm_i915_gem_object,
- exec_list);
- list_del_init(&obj->exec_list);
- drm_gem_object_unreference(&obj->base);
- }
-
mutex_unlock(&dev->struct_mutex);
pre_mutex_err:
--
1.7.7
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 18:31 [PATCH 0/3] execbuf cleanups Ben Widawsky
2011-10-11 18:31 ` [PATCH 1/3] drm/i915: extract constant offset setting Ben Widawsky
2011-10-11 18:31 ` [PATCH 2/3] drm/i915: make eb structure do more Ben Widawsky
@ 2011-10-11 18:31 ` Ben Widawsky
2011-10-11 18:44 ` Ben Widawsky
` (2 more replies)
2 siblings, 3 replies; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 18:31 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
This will strongly order synchronization commands with respect to 3d
state and 3d primitive commands. AFAIK, this shouldn't impact anything
as these sync commands are all for privileged (or ppgtt) batches only,
so user space should not be relying on this, and the kernel wouldn't be
relying on 3d state or primitive commands.
This will help when we enable PPGTT, and perhaps this synchronization is
currently useful and I just don't realize it.
This was found through doc inspection by Ken and applies to Gen6+;
Reported-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 ++++++--
drivers/gpu/drm/i915/i915_reg.h | 1 +
drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
3 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 9572e52..1d55842 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -976,6 +976,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
{
struct drm_device *dev = ring->dev;
struct drm_i915_private *dev_priv = dev->dev_private;
+ uint32_t mask = I915_EXEC_CONSTANTS_MASK;
int ret;
switch (mode) {
@@ -991,6 +992,10 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
mode == I915_EXEC_CONSTANTS_REL_SURFACE)
return -EINVAL;
+ /* Never clear this bit because of execbuffer */
+ if (INTEL_INFO(dev)->gen >= 6)
+ mask &= ~(INSTPM_FORCE_ORDERING);
+
ret = intel_ring_begin(ring, 4);
if (ret)
return ret;
@@ -998,8 +1003,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
intel_ring_emit(ring, MI_NOOP);
intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
intel_ring_emit(ring, INSTPM);
- intel_ring_emit(ring,
- I915_EXEC_CONSTANTS_MASK << 16 | mode);
+ intel_ring_emit(ring, mask << 16 | mode);
intel_ring_advance(ring);
dev_priv->relative_constants_mode = mode;
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 138eae1..51569f2 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -436,6 +436,7 @@
#define INSTPM_AGPBUSY_DIS (1<<11) /* gen3: when disabled, pending interrupts
will not assert AGPBUSY# and will only
be delivered when out of C3. */
+#define INSTPM_FORCE_ORDERING (1<<7) /* GEN6+ */
#define ACTHD 0x020c8
#define FW_BLC 0x020d8
#define FW_BLC2 0x020dc
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 0e99589..b1d312f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -297,6 +297,8 @@ static int init_render_ring(struct intel_ring_buffer *ring)
}
if (INTEL_INFO(dev)->gen >= 6) {
+ I915_WRITE(INSTPM, INSTPM_FORCE_ORDERING << 16 |
+ INSTPM_FORCE_ORDERING);
} else if (IS_GEN5(dev)) {
ret = init_pipe_control(ring);
if (ret)
--
1.7.7
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
@ 2011-10-11 18:44 ` Ben Widawsky
2011-10-11 19:16 ` Chris Wilson
2011-10-11 19:18 ` Kenneth Graunke
2 siblings, 0 replies; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 18:44 UTC (permalink / raw)
To: Ben Widawsky; +Cc: intel-gfx
On Tue, 11 Oct 2011 11:31:56 -0700
Ben Widawsky <ben@bwidawsk.net> wrote:
> This will strongly order synchronization commands with respect to 3d
> state and 3d primitive commands. AFAIK, this shouldn't impact anything
> as these sync commands are all for privileged (or ppgtt) batches only,
> so user space should not be relying on this, and the kernel wouldn't be
> relying on 3d state or primitive commands.
>
> This will help when we enable PPGTT, and perhaps this synchronization is
> currently useful and I just don't realize it.
>
> This was found through doc inspection by Ken and applies to Gen6+;
>
> Reported-by: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
What I forgot to mention is certain docs say this bit must be set.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] drm/i915: extract constant offset setting
2011-10-11 18:31 ` [PATCH 1/3] drm/i915: extract constant offset setting Ben Widawsky
@ 2011-10-11 19:09 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-10-11 19:09 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
On Tue, 11 Oct 2011 11:31:54 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Nice clean up. Reveals we have a bug if the reservation is interrupted
though. The actual loading of the register should be delayed until we
are about to dispatch the execbuffer (and so won't be interferred by any
usurper.)
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] drm/i915: make eb structure do more
2011-10-11 18:31 ` [PATCH 2/3] drm/i915: make eb structure do more Ben Widawsky
@ 2011-10-11 19:11 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-10-11 19:11 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
On Tue, 11 Oct 2011 11:31:55 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> clean up some code using the existing eb structure.
>
> There is a tiny hunk here related to setting eb->cliprects which is not
> really useful yet, but will be in the future.
You're scaring me! In the current incarnation there is no way cliprects
can be used successfully as it interferes with the 3D setup that the
batch believes itself to be in full control over...
Is there something fun in the pipeline? ;-)
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Other than the obnoxious cliprects,
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
2011-10-11 18:44 ` Ben Widawsky
@ 2011-10-11 19:16 ` Chris Wilson
2011-10-11 19:18 ` Kenneth Graunke
2 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-10-11 19:16 UTC (permalink / raw)
To: intel-gfx; +Cc: Ben Widawsky
On Tue, 11 Oct 2011 11:31:56 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> This will strongly order synchronization commands with respect to 3d
> state and 3d primitive commands. AFAIK, this shouldn't impact anything
> as these sync commands are all for privileged (or ppgtt) batches only,
> so user space should not be relying on this, and the kernel wouldn't be
> relying on 3d state or primitive commands.
>
> This will help when we enable PPGTT, and perhaps this synchronization is
> currently useful and I just don't realize it.
>
> This was found through doc inspection by Ken and applies to Gen6+;
>
> Reported-by: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Ok, that bit does look quite important..., and the code also looks
correct!
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
2011-10-11 18:44 ` Ben Widawsky
2011-10-11 19:16 ` Chris Wilson
@ 2011-10-11 19:18 ` Kenneth Graunke
2011-10-11 19:30 ` Ben Widawsky
2 siblings, 1 reply; 12+ messages in thread
From: Kenneth Graunke @ 2011-10-11 19:18 UTC (permalink / raw)
To: Ben Widawsky; +Cc: intel-gfx
On 10/11/2011 11:31 AM, Ben Widawsky wrote:
> This will strongly order synchronization commands with respect to 3d
> state and 3d primitive commands. AFAIK, this shouldn't impact anything
> as these sync commands are all for privileged (or ppgtt) batches only,
> so user space should not be relying on this, and the kernel wouldn't be
> relying on 3d state or primitive commands.
>
> This will help when we enable PPGTT, and perhaps this synchronization is
> currently useful and I just don't realize it.
>
> This was found through doc inspection by Ken and applies to Gen6+;
>
> Reported-by: Kenneth Graunke <kenneth@whitecape.org>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
> drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 ++++++--
> drivers/gpu/drm/i915/i915_reg.h | 1 +
> drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> 3 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 9572e52..1d55842 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -976,6 +976,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> {
> struct drm_device *dev = ring->dev;
> struct drm_i915_private *dev_priv = dev->dev_private;
> + uint32_t mask = I915_EXEC_CONSTANTS_MASK;
> int ret;
>
> switch (mode) {
> @@ -991,6 +992,10 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> mode == I915_EXEC_CONSTANTS_REL_SURFACE)
> return -EINVAL;
>
> + /* Never clear this bit because of execbuffer */
> + if (INTEL_INFO(dev)->gen >= 6)
> + mask &= ~(INSTPM_FORCE_ORDERING);
> +
I might do:
/* This bit doesn't exist on Gen6+. */
if (INTEL_INFO(dev)->gen >= 6)
mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
...but, I don't mind either way.
> ret = intel_ring_begin(ring, 4);
> if (ret)
> return ret;
> @@ -998,8 +1003,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> intel_ring_emit(ring, MI_NOOP);
> intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> intel_ring_emit(ring, INSTPM);
> - intel_ring_emit(ring,
> - I915_EXEC_CONSTANTS_MASK << 16 | mode);
> + intel_ring_emit(ring, mask << 16 | mode);
> intel_ring_advance(ring);
>
> dev_priv->relative_constants_mode = mode;
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index 138eae1..51569f2 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -436,6 +436,7 @@
> #define INSTPM_AGPBUSY_DIS (1<<11) /* gen3: when disabled, pending interrupts
> will not assert AGPBUSY# and will only
> be delivered when out of C3. */
> +#define INSTPM_FORCE_ORDERING (1<<7) /* GEN6+ */
> #define ACTHD 0x020c8
> #define FW_BLC 0x020d8
> #define FW_BLC2 0x020dc
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 0e99589..b1d312f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -297,6 +297,8 @@ static int init_render_ring(struct intel_ring_buffer *ring)
> }
>
> if (INTEL_INFO(dev)->gen >= 6) {
> + I915_WRITE(INSTPM, INSTPM_FORCE_ORDERING << 16 |
> + INSTPM_FORCE_ORDERING);
> } else if (IS_GEN5(dev)) {
> ret = init_pipe_control(ring);
> if (ret)
I might only enable this on Gen7 for now, unless it actually fixes
something on Sandybridge. It's not listed as required for Gen6.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 19:18 ` Kenneth Graunke
@ 2011-10-11 19:30 ` Ben Widawsky
2011-10-11 19:33 ` Ben Widawsky
0 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 19:30 UTC (permalink / raw)
To: Kenneth Graunke; +Cc: intel-gfx
On Tue, 11 Oct 2011 12:18:15 -0700
Kenneth Graunke <kenneth@whitecape.org> wrote:
> On 10/11/2011 11:31 AM, Ben Widawsky wrote:
> > This will strongly order synchronization commands with respect to 3d
> > state and 3d primitive commands. AFAIK, this shouldn't impact anything
> > as these sync commands are all for privileged (or ppgtt) batches only,
> > so user space should not be relying on this, and the kernel wouldn't be
> > relying on 3d state or primitive commands.
> >
> > This will help when we enable PPGTT, and perhaps this synchronization is
> > currently useful and I just don't realize it.
> >
> > This was found through doc inspection by Ken and applies to Gen6+;
> >
> > Reported-by: Kenneth Graunke <kenneth@whitecape.org>
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 8 ++++++--
> > drivers/gpu/drm/i915/i915_reg.h | 1 +
> > drivers/gpu/drm/i915/intel_ringbuffer.c | 2 ++
> > 3 files changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > index 9572e52..1d55842 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > @@ -976,6 +976,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> > {
> > struct drm_device *dev = ring->dev;
> > struct drm_i915_private *dev_priv = dev->dev_private;
> > + uint32_t mask = I915_EXEC_CONSTANTS_MASK;
> > int ret;
> >
> > switch (mode) {
> > @@ -991,6 +992,10 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> > mode == I915_EXEC_CONSTANTS_REL_SURFACE)
> > return -EINVAL;
> >
> > + /* Never clear this bit because of execbuffer */
> > + if (INTEL_INFO(dev)->gen >= 6)
> > + mask &= ~(INSTPM_FORCE_ORDERING);
> > +
>
> I might do:
>
> /* This bit doesn't exist on Gen6+. */
> if (INTEL_INFO(dev)->gen >= 6)
> mask &= ~I915_EXEC_CONSTANTS_REL_SURFACE;
>
> ...but, I don't mind either way.
I prefer this as well.
>
> > ret = intel_ring_begin(ring, 4);
> > if (ret)
> > return ret;
> > @@ -998,8 +1003,7 @@ i915_set_constant_offset(struct intel_ring_buffer *ring, int mode)
> > intel_ring_emit(ring, MI_NOOP);
> > intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
> > intel_ring_emit(ring, INSTPM);
> > - intel_ring_emit(ring,
> > - I915_EXEC_CONSTANTS_MASK << 16 | mode);
> > + intel_ring_emit(ring, mask << 16 | mode);
> > intel_ring_advance(ring);
> >
> > dev_priv->relative_constants_mode = mode;
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 138eae1..51569f2 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -436,6 +436,7 @@
> > #define INSTPM_AGPBUSY_DIS (1<<11) /* gen3: when disabled, pending interrupts
> > will not assert AGPBUSY# and will only
> > be delivered when out of C3. */
> > +#define INSTPM_FORCE_ORDERING (1<<7) /* GEN6+ */
> > #define ACTHD 0x020c8
> > #define FW_BLC 0x020d8
> > #define FW_BLC2 0x020dc
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 0e99589..b1d312f 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -297,6 +297,8 @@ static int init_render_ring(struct intel_ring_buffer *ring)
> > }
> >
> > if (INTEL_INFO(dev)->gen >= 6) {
> > + I915_WRITE(INSTPM, INSTPM_FORCE_ORDERING << 16 |
> > + INSTPM_FORCE_ORDERING);
> > } else if (IS_GEN5(dev)) {
> > ret = init_pipe_control(ring);
> > if (ret)
>
> I might only enable this on Gen7 for now, unless it actually fixes
> something on Sandybridge. It's not listed as required for Gen6.
I would prefer to keep for gen6 for two reasons:
1 - paranoia
2 - user space is going to be toggling a bit which it doesn't mean to
toggle and that has a who knows what impact on gen6.
Ben
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 19:30 ` Ben Widawsky
@ 2011-10-11 19:33 ` Ben Widawsky
2011-10-11 19:42 ` Chris Wilson
0 siblings, 1 reply; 12+ messages in thread
From: Ben Widawsky @ 2011-10-11 19:33 UTC (permalink / raw)
To: Ben Widawsky; +Cc: intel-gfx
On Tue, 11 Oct 2011 12:30:36 -0700
Ben Widawsky <ben@bwidawsk.net> wrote:
> On Tue, 11 Oct 2011 12:18:15 -0700
> Kenneth Graunke <kenneth@whitecape.org> wrote:
> >
> > I might only enable this on Gen7 for now, unless it actually fixes
> > something on Sandybridge. It's not listed as required for Gen6.
>
> I would prefer to keep for gen6 for two reasons:
> 1 - paranoia
> 2 - user space is going to be toggling a bit which it doesn't mean to
> toggle and that has a who knows what impact on gen6.
>
> Ben
I'm an idiot. Disregard this comment. Was thinking of the other hunk.
Just 1 reason then, paranoia? Any takers?
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] drm/i915: Force sync command ordering
2011-10-11 19:33 ` Ben Widawsky
@ 2011-10-11 19:42 ` Chris Wilson
0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2011-10-11 19:42 UTC (permalink / raw)
To: Ben Widawsky; +Cc: intel-gfx
On Tue, 11 Oct 2011 12:33:05 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Tue, 11 Oct 2011 12:30:36 -0700
> Ben Widawsky <ben@bwidawsk.net> wrote:
>
> > On Tue, 11 Oct 2011 12:18:15 -0700
> > Kenneth Graunke <kenneth@whitecape.org> wrote:
> > >
> > > I might only enable this on Gen7 for now, unless it actually fixes
> > > something on Sandybridge. It's not listed as required for Gen6.
> >
> > I would prefer to keep for gen6 for two reasons:
> > 1 - paranoia
> > 2 - user space is going to be toggling a bit which it doesn't mean to
> > toggle and that has a who knows what impact on gen6.
> >
> > Ben
>
> I'm an idiot. Disregard this comment. Was thinking of the other hunk.
> Just 1 reason then, paranoia? Any takers?
Does PIPE_CONTROL qualify as a 3D state packet? If so without that bit
set, there is no barrier between writing to a register and attempting to
make use of it within the batch.
Sounds like justifiable paranoia to me.
-Chris
--
Chris Wilson, Intel Open Source Technology Centre
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2011-10-11 19:42 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-10-11 18:31 [PATCH 0/3] execbuf cleanups Ben Widawsky
2011-10-11 18:31 ` [PATCH 1/3] drm/i915: extract constant offset setting Ben Widawsky
2011-10-11 19:09 ` Chris Wilson
2011-10-11 18:31 ` [PATCH 2/3] drm/i915: make eb structure do more Ben Widawsky
2011-10-11 19:11 ` Chris Wilson
2011-10-11 18:31 ` [PATCH 3/3] drm/i915: Force sync command ordering Ben Widawsky
2011-10-11 18:44 ` Ben Widawsky
2011-10-11 19:16 ` Chris Wilson
2011-10-11 19:18 ` Kenneth Graunke
2011-10-11 19:30 ` Ben Widawsky
2011-10-11 19:33 ` Ben Widawsky
2011-10-11 19:42 ` Chris Wilson
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.