* [PATCH] drm/i915: Set DERRMR around batches requiring vblank events
@ 2012-10-16  9:47 Chris Wilson
  2012-10-16 14:15 ` Daniel Vetter
  0 siblings, 1 reply; 14+ messages in thread

From: Chris Wilson @ 2012-10-16  9:47 UTC (permalink / raw)
To: intel-gfx

In order for the Display Engine to send messages to the Render Engine,
the event must be unmasked in the Display Engine Render Response Mask
Register (DERRMR). By default all events are masked, both to prevent
unwanted messages and to conserve power, and it is strongly recommended
that only a single event be unmasked at any time. So in order to
pipeline the register writes around individual batchbuffers, userspace
must notify the kernel when it requires a vblank event; this patch
implements an interface to do so with a single (pipe, event) request
passed through the execbuffer flags.

Note that another workaround is required for IVB, in that we must also
prevent RC6 sleeps whilst waiting upon an event. To do that we also
pipeline a forcewake into the second MT slot.

There are a few unpleasant issues addressed by this patch. The first is
to only allow a single ring to process vsync requests at any time. This
is really a non-issue as the current hardware can only process those
requests on a single ring, but by serialising between rings we should
prevent our future selves from shooting themselves in the foot. The
second ugly issue is to handle unwind failures by disabling signals
whilst queueing a vsync request - perhaps a pleasant side-effect. The
last issue is the trick to enable vblank irqs, which relies on the
deferred vblank put.

References: https://bugs.freedesktop.org/show_bug.cgi?id=50244
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
I am still not convinced this fixes anything, as at the moment I am
simply unable to detect any tearing on my ivb setup if I apply any form
of vblank throttling.

The other issue is that I think if I had a SECURE
(DRM_MASTER|DRM_ROOT_ONLY) batch buffer I could probably do all of this
in userspace.
-Chris --- drivers/gpu/drm/i915/i915_dma.c | 3 ++ drivers/gpu/drm/i915/i915_drv.h | 3 ++ drivers/gpu/drm/i915/i915_gem.c | 2 + drivers/gpu/drm/i915/i915_gem_execbuffer.c | 76 ++++++++++++++++++++++++++++ drivers/gpu/drm/i915/i915_reg.h | 1 + include/drm/i915_drm.h | 17 +++++++ 6 files changed, 102 insertions(+) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 5779e8f..759a5e6 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1015,6 +1015,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_PRIME_VMAP_FLUSH: value = 1; break; + case I915_PARAM_HAS_EXEC_VSYNC: + value = 1; + break; default: DRM_DEBUG_DRIVER("Unknown parameter %d\n", param->param); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index dda10c1..db2ca9f 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -434,6 +434,9 @@ typedef struct drm_i915_private { struct intel_ring_buffer ring[I915_NUM_RINGS]; uint32_t next_seqno; + struct intel_ring_buffer *vsync_ring; + uint32_t vsync_seqno; + drm_dma_handle_t *status_page_dmah; uint32_t counter; struct drm_i915_gem_object *pwrctx; diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index f740535..7a5ab2c 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -2643,6 +2643,8 @@ int i915_gpu_idle(struct drm_device *dev) return ret; } + dev_priv->vsync_ring = NULL; + dev_priv->vsync_seqno = 0; return 0; } diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index bcd5aa2..021b967 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -807,6 +807,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, struct drm_i915_gem_exec_object2 *exec) { drm_i915_private_t *dev_priv = dev->dev_private; + bool was_interruptible = 
dev_priv->mm.interruptible; struct list_head objects; struct eb_objects *eb; struct drm_i915_gem_object *batch_obj; @@ -1044,6 +1045,62 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } + if (args->flags & I915_EXEC_VSYNC_ENABLE && INTEL_INFO(dev)->gen >= 6) { + u32 pipe, event; + + if (dev_priv->vsync_seqno && dev_priv->vsync_ring != ring) { + ret = i915_wait_seqno(dev_priv->vsync_ring, + dev_priv->vsync_seqno); + if (ret) + goto err; + + dev_priv->vsync_ring = NULL; + dev_priv->vsync_seqno = 0; + } + + pipe = I915_EXEC_VSYNC_GET_PIPE(args->flags); + event = I915_EXEC_VSYNC_GET_EVENT(args->flags); + if (pipe >= dev_priv->num_pipe || + event > I915_EXEC_VSYNC_VBLANK) { + ret = -EINVAL; + goto err; + } + if (event == I915_EXEC_VSYNC_VBLANK) { + ret = drm_vblank_get(dev, pipe); + if (ret) + goto err; + + /* As the release is deferred by a few frames + * we can safely do the release before the queueing. + */ + drm_vblank_put(dev, pipe); + } + event = ~(1 << (event + 8 * pipe)); + + dev_priv->mm.interruptible = false; + ret = intel_ring_begin(ring, 4); + if (ret) + goto err; + + intel_ring_emit(ring, MI_NOOP); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); + intel_ring_emit(ring, DERRMR); + intel_ring_emit(ring, event); + intel_ring_advance(ring); + + if (INTEL_INFO(dev)->gen >= 7 && + intel_ring_begin(ring, 4) == 0) { + intel_ring_emit(ring, MI_NOOP); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); + intel_ring_emit(ring, FORCEWAKE_MT); + intel_ring_emit(ring, _MASKED_BIT_ENABLE(2)); + intel_ring_advance(ring); + } + + dev_priv->vsync_ring = ring; + dev_priv->vsync_seqno = seqno; + } + trace_i915_gem_ring_dispatch(ring, seqno); exec_start = batch_obj->gtt_offset + args->batch_start_offset; @@ -1066,10 +1123,29 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } + if (args->flags & I915_EXEC_VSYNC_ENABLE && INTEL_INFO(dev)->gen >= 6) { + if (INTEL_INFO(dev)->gen >= 7 && + intel_ring_begin(ring, 4) == 0) { + 
intel_ring_emit(ring, MI_NOOP); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); + intel_ring_emit(ring, FORCEWAKE_MT); + intel_ring_emit(ring, _MASKED_BIT_DISABLE(2)); + intel_ring_advance(ring); + } + if (intel_ring_begin(ring, 4) == 0) { + intel_ring_emit(ring, MI_NOOP); + intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1)); + intel_ring_emit(ring, DERRMR); + intel_ring_emit(ring, 0xffffffff); + intel_ring_advance(ring); + } + } + i915_gem_execbuffer_move_to_active(&objects, ring, seqno); i915_gem_execbuffer_retire_commands(dev, file, ring); err: + dev_priv->mm.interruptible = was_interruptible; eb_destroy(eb); while (!list_empty(&objects)) { struct drm_i915_gem_object *obj; diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 8120bf2..9897b09 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -3391,6 +3391,7 @@ #define DEIMR 0x44004 #define DEIIR 0x44008 #define DEIER 0x4400c +#define DERRMR 0x44050 /* GT interrupt. * Note that for gen6+ the ring-specific interrupt bits do alias with the diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h index acfb377..f6e39bb 100644 --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -316,6 +316,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_SEMAPHORES 20 #define I915_PARAM_HAS_PRIME_VMAP_FLUSH 21 #define I915_PARAM_RSVD_FOR_FUTURE_USE 22 +#define I915_PARAM_HAS_EXEC_VSYNC 23 typedef struct drm_i915_getparam { int param; @@ -694,6 +695,22 @@ struct drm_i915_gem_execbuffer2 { /** Resets the SO write offset registers for transform feedback on gen7. 
*/ #define I915_EXEC_GEN7_SOL_RESET (1<<8) +#define I915_EXEC_VSYNC_ENABLE (1<<9) +#define I915_EXEC_VSYNC_EVENT_SHIFT (10) +#define I915_EXEC_VSYNC_EVENT_MASK (7) +#define I915_EXEC_VSYNC_SCANLINE (0) +#define I915_EXEC_VSYNC_PRIMARY_FLIP (1) +#define I915_EXEC_VSYNC_SPRITE_FLIP (2) +#define I915_EXEC_VSYNC_VBLANK (3) +#define I915_EXEC_VSYNC_PIPE_SHIFT (13) +#define I915_EXEC_VSYNC_PIPE_MASK (7) + +#define I915_EXEC_VSYNC(pipe, event) \ + (I915_EXEC_VSYNC_ENABLE | (pipe) << I915_EXEC_VSYNC_PIPE_SHIFT | (event) << I915_EXEC_VSYNC_EVENT_SHIFT) + +#define I915_EXEC_VSYNC_GET_PIPE(F) (((F) >> I915_EXEC_VSYNC_PIPE_SHIFT) & I915_EXEC_VSYNC_PIPE_MASK) +#define I915_EXEC_VSYNC_GET_EVENT(F) (((F) >> I915_EXEC_VSYNC_EVENT_SHIFT) & I915_EXEC_VSYNC_EVENT_MASK) + #define I915_EXEC_CONTEXT_ID_MASK (0xffffffff) #define i915_execbuffer2_set_context_id(eb2, context) \ (eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH] drm/i915: Set DERRMR around batches requiring vblank events
  2012-10-16  9:47 [PATCH] drm/i915: Set DERRMR around batches requiring vblank events Chris Wilson
@ 2012-10-16 14:15 ` Daniel Vetter
  2012-10-17 11:09   ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson
  2012-10-17 13:58   ` [PATCH] drm/i915: Set DERRMR around batches requiring vblank events Chris Wilson
  0 siblings, 2 replies; 14+ messages in thread

From: Daniel Vetter @ 2012-10-16 14:15 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

On Tue, Oct 16, 2012 at 11:47 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
>
> I am still not convinced this fixes anything as at the moment I am
> simply unable to detect any tearing on my ivb setup if I apply any form of
> vblank throttling.
>
> The other issue is that I think if I had a SECURE
> (DRM_MASTER|DRM_ROOT_ONLY) batch buffer I could probably do all of this
> in userspace.

Yeah, I'm a bit torn between this approach and just allowing SECURE
batches. The big upside of secure batches is that it allows more
flexibility (if the LRI really work from secure batches and not just
from the ring). The only downside I can see is that we won't be able
to arbitrate the DERRMR bits (since only one is allowed to be set at
any given time) between different userspace processes. But I don't see
multiple concurrent display servers (with cooperative ownership of the
hw) happening anytime soon, so I won't worry about this before it
happens. Syncing against modeset should still work with our MI_WAIT
related waits on fbs before we disable the pipe.

The other issue (only on gen7) is that this will keep the gpu out of
rc6 (and hence the entire package) for long times, especially on
mostly-idle video playback. I don't think that massively increased
power consumption is what users will appreciate.
Now, with the TearFree option, and given that you say vblank render
throttling mostly fixes this, do we have any unhappy bug reporters
left? In that case I'd prefer to just can this entirely (and suggest
that people use a real compositor - the wasted bandwidth issue on X
also seems to be on track to be solved).

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers
  2012-10-16 14:15 ` Daniel Vetter
@ 2012-10-17 11:09   ` Chris Wilson
  2012-10-17 11:09     ` [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits Chris Wilson
                        ` (2 more replies)
  2012-10-17 13:58   ` [PATCH] drm/i915: Set DERRMR around batches requiring vblank events Chris Wilson
  1 sibling, 3 replies; 14+ messages in thread

From: Chris Wilson @ 2012-10-17 11:09 UTC (permalink / raw)
To: intel-gfx

With the introduction of per-process GTT space, the hardware designers
thought it wise to also limit the ability to write to MMIO space to
only a "secure" batch buffer. The ability to rewrite registers is the
only way to program the hardware to perform certain operations like
scanline waits (required for tear-free windowed updates). So we have a
choice: either add an interface to perform those synchronised updates
inside the kernel, or permit certain processes the ability to write to
the "safe" registers from within their command streams. This patch
exposes the ability to submit a SECURE batch buffer to
DRM_ROOT_ONLY|DRM_MASTER processes.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_dma.c | 3 +++ drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 ++++++++++++++++++--- drivers/gpu/drm/i915/i915_trace.h | 10 ++++++---- drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++++++-------- drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +++- include/drm/i915_drm.h | 6 ++++++ 6 files changed, 51 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 5779e8f..f92c849 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1015,6 +1015,9 @@ static int i915_getparam(struct drm_device *dev, void *data, case I915_PARAM_HAS_PRIME_VMAP_FLUSH: value = 1; break; + case I915_PARAM_HAS_SECURE_BATCHES: + value = capable(CAP_SYS_ADMIN); + break; default: DRM_DEBUG_DRIVER("Unknown parameter %d\n", param->param); diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index bcd5aa2..4bbd088 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -816,6 +816,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, u32 exec_start, exec_len; u32 seqno; u32 mask; + u32 flags; int ret, mode, i; if (!i915_gem_check_execbuffer(args)) { @@ -827,6 +828,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, if (ret) return ret; + flags = 0; + if (args->flags & I915_EXEC_SECURE) { + if (!file->is_master || !capable(CAP_SYS_ADMIN)) + return -EPERM; + + flags |= I915_DISPATCH_SECURE; + } + switch (args->flags & I915_EXEC_RING_MASK) { case I915_EXEC_DEFAULT: case I915_EXEC_RENDER: @@ -999,6 +1008,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, } batch_obj->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND; + if (flags & I915_DISPATCH_SECURE && !batch_obj->has_global_gtt_mapping) + i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level); + ret = 
i915_gem_execbuffer_move_to_gpu(ring, &objects); if (ret) goto err; @@ -1044,7 +1056,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; } - trace_i915_gem_ring_dispatch(ring, seqno); + trace_i915_gem_ring_dispatch(ring, seqno, flags); exec_start = batch_obj->gtt_offset + args->batch_start_offset; exec_len = args->batch_len; @@ -1056,12 +1068,15 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, goto err; ret = ring->dispatch_execbuffer(ring, - exec_start, exec_len); + exec_start, exec_len, + flags); if (ret) goto err; } } else { - ret = ring->dispatch_execbuffer(ring, exec_start, exec_len); + ret = ring->dispatch_execbuffer(ring, + exec_start, exec_len, + flags); if (ret) goto err; } diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h index 8134421..3db4a68 100644 --- a/drivers/gpu/drm/i915/i915_trace.h +++ b/drivers/gpu/drm/i915/i915_trace.h @@ -229,24 +229,26 @@ TRACE_EVENT(i915_gem_evict_everything, ); TRACE_EVENT(i915_gem_ring_dispatch, - TP_PROTO(struct intel_ring_buffer *ring, u32 seqno), - TP_ARGS(ring, seqno), + TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags), + TP_ARGS(ring, seqno, flags), TP_STRUCT__entry( __field(u32, dev) __field(u32, ring) __field(u32, seqno) + __field(u32, flags) ), TP_fast_assign( __entry->dev = ring->dev->primary->index; __entry->ring = ring->id; __entry->seqno = seqno; + __entry->flags = flags; i915_trace_irq_get(ring, seqno); ), - TP_printk("dev=%u, ring=%u, seqno=%u", - __entry->dev, __entry->ring, __entry->seqno) + TP_printk("dev=%u, ring=%u, seqno=%u, flags=%x", + __entry->dev, __entry->ring, __entry->seqno, __entry->flags) ); TRACE_EVENT(i915_gem_ring_flush, diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c index 133beb6..6a4b8fa 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.c +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c @@ -964,7 +964,9 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring) } static 
int -i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length) +i965_dispatch_execbuffer(struct intel_ring_buffer *ring, + u32 offset, u32 length, + unsigned flags) { int ret; @@ -975,7 +977,7 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length) intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_GTT | - MI_BATCH_NON_SECURE_I965); + (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE_I965)); intel_ring_emit(ring, offset); intel_ring_advance(ring); @@ -984,7 +986,8 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length) static int i830_dispatch_execbuffer(struct intel_ring_buffer *ring, - u32 offset, u32 len) + u32 offset, u32 len, + unsigned flags) { int ret; @@ -993,7 +996,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring, return ret; intel_ring_emit(ring, MI_BATCH_BUFFER); - intel_ring_emit(ring, offset | MI_BATCH_NON_SECURE); + intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE)); intel_ring_emit(ring, offset + len - 8); intel_ring_emit(ring, 0); intel_ring_advance(ring); @@ -1003,7 +1006,8 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring, static int i915_dispatch_execbuffer(struct intel_ring_buffer *ring, - u32 offset, u32 len) + u32 offset, u32 len, + unsigned flags) { int ret; @@ -1012,7 +1016,7 @@ i915_dispatch_execbuffer(struct intel_ring_buffer *ring, return ret; intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_GTT); - intel_ring_emit(ring, offset | MI_BATCH_NON_SECURE); + intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 
0 : MI_BATCH_NON_SECURE)); intel_ring_advance(ring); return 0; @@ -1410,7 +1414,8 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring, static int gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring, - u32 offset, u32 len) + u32 offset, u32 len, + unsigned flags) { int ret; @@ -1418,7 +1423,9 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring, if (ret) return ret; - intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_NON_SECURE_I965); + intel_ring_emit(ring, + MI_BATCH_BUFFER_START | + (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE_I965)); /* bit0-7 is the length on GEN6+ */ intel_ring_emit(ring, offset); intel_ring_advance(ring); diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h index 40b252e..2667f33 100644 --- a/drivers/gpu/drm/i915/intel_ringbuffer.h +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h @@ -81,7 +81,9 @@ struct intel_ring_buffer { u32 (*get_seqno)(struct intel_ring_buffer *ring, bool lazy_coherency); int (*dispatch_execbuffer)(struct intel_ring_buffer *ring, - u32 offset, u32 length); + u32 offset, u32 length, + unsigned flags); +#define I915_DISPATCH_SECURE 0x1 void (*cleanup)(struct intel_ring_buffer *ring); int (*sync_to)(struct intel_ring_buffer *ring, struct intel_ring_buffer *to, diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h index acfb377..4f41f8d 100644 --- a/include/drm/i915_drm.h +++ b/include/drm/i915_drm.h @@ -316,6 +316,7 @@ typedef struct drm_i915_irq_wait { #define I915_PARAM_HAS_SEMAPHORES 20 #define I915_PARAM_HAS_PRIME_VMAP_FLUSH 21 #define I915_PARAM_RSVD_FOR_FUTURE_USE 22 +#define I915_PARAM_HAS_SECURE_BATCHES 23 typedef struct drm_i915_getparam { int param; @@ -694,6 +695,11 @@ struct drm_i915_gem_execbuffer2 { /** Resets the SO write offset registers for transform feedback on gen7. */ #define I915_EXEC_GEN7_SOL_RESET (1<<8) +/** Request a privileged ("secure") batch buffer. 
Note only available for + * DRM_ROOT_ONLY | DRM_MASTER processes. + */ +#define I915_EXEC_SECURE (1<<9) + #define I915_EXEC_CONTEXT_ID_MASK (0xffffffff) #define i915_execbuffer2_set_context_id(eb2, context) \ (eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits 2012-10-17 11:09 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson @ 2012-10-17 11:09 ` Chris Wilson 2012-10-17 16:27 ` Jesse Barnes 2012-10-17 16:31 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Jesse Barnes 2012-10-17 18:47 ` Eric Anholt 2 siblings, 1 reply; 14+ messages in thread From: Chris Wilson @ 2012-10-17 11:09 UTC (permalink / raw) To: intel-gfx No functional change, but reserves 0x2 for use by userspace. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/i915/i915_reg.h | 2 ++ drivers/gpu/drm/i915/intel_pm.c | 10 +++++----- 2 files changed, 7 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 8120bf2..d482c12 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -4115,6 +4115,8 @@ #define FORCEWAKE_ACK_HSW 0x130044 #define FORCEWAKE_ACK 0x130090 #define FORCEWAKE_MT 0xa188 /* multi-threaded */ +#define FORCEWAKE_KERNEL 0x1 +#define FORCEWAKE_USER 0x2 #define FORCEWAKE_MT_ACK 0x130040 #define ECOBUS 0xa180 #define FORCEWAKE_MT_ENABLE (1<<5) diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c index 4e5fc33..9853f35 100644 --- a/drivers/gpu/drm/i915/intel_pm.c +++ b/drivers/gpu/drm/i915/intel_pm.c @@ -4011,7 +4011,7 @@ static void __gen6_gt_force_wake_get(struct drm_i915_private *dev_priv) FORCEWAKE_ACK_TIMEOUT_MS)) DRM_ERROR("Timed out waiting for forcewake old ack to clear.\n"); - I915_WRITE_NOTRACE(FORCEWAKE, 1); + I915_WRITE_NOTRACE(FORCEWAKE, FORCEWAKE_KERNEL); POSTING_READ(ECOBUS); /* something from same cacheline, but !FORCEWAKE */ if (wait_for_atomic((I915_READ_NOTRACE(forcewake_ack) & 1), @@ -4034,7 +4034,7 @@ static void __gen6_gt_force_wake_mt_get(struct drm_i915_private *dev_priv) FORCEWAKE_ACK_TIMEOUT_MS)) DRM_ERROR("Timed out waiting for 
forcewake old ack to clear.\n"); - I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_ENABLE(1)); + I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_ENABLE(FORCEWAKE_KERNEL)); POSTING_READ(ECOBUS); /* something from same cacheline, but !FORCEWAKE */ if (wait_for_atomic((I915_READ_NOTRACE(forcewake_ack) & 1), @@ -4078,7 +4078,7 @@ static void __gen6_gt_force_wake_put(struct drm_i915_private *dev_priv) static void __gen6_gt_force_wake_mt_put(struct drm_i915_private *dev_priv) { - I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_DISABLE(1)); + I915_WRITE_NOTRACE(FORCEWAKE_MT, _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL)); /* gen6_gt_check_fifodbg doubles as the POSTING_READ */ gen6_gt_check_fifodbg(dev_priv); } @@ -4122,7 +4122,7 @@ static void vlv_force_wake_get(struct drm_i915_private *dev_priv) FORCEWAKE_ACK_TIMEOUT_MS)) DRM_ERROR("Timed out waiting for forcewake old ack to clear.\n"); - I915_WRITE_NOTRACE(FORCEWAKE_VLV, _MASKED_BIT_ENABLE(1)); + I915_WRITE_NOTRACE(FORCEWAKE_VLV, _MASKED_BIT_ENABLE(FORCEWAKE_KERNEL)); if (wait_for_atomic((I915_READ_NOTRACE(FORCEWAKE_ACK_VLV) & 1), FORCEWAKE_ACK_TIMEOUT_MS)) @@ -4133,7 +4133,7 @@ static void vlv_force_wake_get(struct drm_i915_private *dev_priv) static void vlv_force_wake_put(struct drm_i915_private *dev_priv) { - I915_WRITE_NOTRACE(FORCEWAKE_VLV, _MASKED_BIT_DISABLE(1)); + I915_WRITE_NOTRACE(FORCEWAKE_VLV, _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL)); /* The below doubles as a POSTING_READ */ gen6_gt_check_fifodbg(dev_priv); } -- 1.7.10.4 ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits
  2012-10-17 11:09 ` [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits Chris Wilson
@ 2012-10-17 16:27   ` Jesse Barnes
  2012-10-17 17:05     ` Jesse Barnes
  0 siblings, 1 reply; 14+ messages in thread

From: Jesse Barnes @ 2012-10-17 16:27 UTC (permalink / raw)
To: Chris Wilson; +Cc: intel-gfx

On Wed, 17 Oct 2012 12:09:55 +0100
Chris Wilson <chris@chris-wilson.co.uk> wrote:

> No functional change, but reserves 0x2 for use by userspace.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_reg.h |  2 ++
>  drivers/gpu/drm/i915/intel_pm.c | 10 +++++-----
>  2 files changed, 7 insertions(+), 5 deletions(-)

Each of the bits in this reg corresponds to a specific agent that may
need to wake the GT. Unfortunately I don't have docs about which bits
are used by which agents. IIRC the BIOS uses one, as does the AMT
engine if present.

I'll try to find out who owns each, and maybe we can reserve our own
for userspace.

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits 2012-10-17 16:27 ` Jesse Barnes @ 2012-10-17 17:05 ` Jesse Barnes 2012-10-17 19:10 ` Daniel Vetter 0 siblings, 1 reply; 14+ messages in thread From: Jesse Barnes @ 2012-10-17 17:05 UTC (permalink / raw) To: Jesse Barnes; +Cc: intel-gfx On Wed, 17 Oct 2012 09:27:24 -0700 Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > On Wed, 17 Oct 2012 12:09:55 +0100 > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > No functional change, but reserves 0x2 for use by userspace. > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > --- > > drivers/gpu/drm/i915/i915_reg.h | 2 ++ > > drivers/gpu/drm/i915/intel_pm.c | 10 +++++----- > > 2 files changed, 7 insertions(+), 5 deletions(-) > > Each of the bits in this reg correspond to a specific agent that may > need to wake the GT. Unfortunately I don't have docs about which bits > are used by which agents. IIRC the BIOS uses one as does the AMT > engine if present. > > I'll try to find out who owns each and maybe we can reserve our own for > userspace. Ok we're in the clear here. I was afraid there might be other system components that would do sneaky things behind our back. But once the driver loads, we own the GT, so we can make these bits whatever we want. -- Jesse Barnes, Intel Open Source Technology Center ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits 2012-10-17 17:05 ` Jesse Barnes @ 2012-10-17 19:10 ` Daniel Vetter 0 siblings, 0 replies; 14+ messages in thread From: Daniel Vetter @ 2012-10-17 19:10 UTC (permalink / raw) To: Jesse Barnes; +Cc: intel-gfx On Wed, Oct 17, 2012 at 10:05:15AM -0700, Jesse Barnes wrote: > On Wed, 17 Oct 2012 09:27:24 -0700 > Jesse Barnes <jbarnes@virtuousgeek.org> wrote: > > > On Wed, 17 Oct 2012 12:09:55 +0100 > > Chris Wilson <chris@chris-wilson.co.uk> wrote: > > > > > No functional change, but reserves 0x2 for use by userspace. > > > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > > --- > > > drivers/gpu/drm/i915/i915_reg.h | 2 ++ > > > drivers/gpu/drm/i915/intel_pm.c | 10 +++++----- > > > 2 files changed, 7 insertions(+), 5 deletions(-) > > > > Each of the bits in this reg correspond to a specific agent that may > > need to wake the GT. Unfortunately I don't have docs about which bits > > are used by which agents. IIRC the BIOS uses one as does the AMT > > engine if present. > > > > I'll try to find out who owns each and maybe we can reserve our own for > > userspace. > > Ok we're in the clear here. I was afraid there might be other system > components that would do sneaky things behind our back. But once the > driver loads, we own the GT, so we can make these bits whatever we want. I guess that counts for an ack - patch merged to dinq. Thanks, Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers 2012-10-17 11:09 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson 2012-10-17 11:09 ` [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits Chris Wilson @ 2012-10-17 16:31 ` Jesse Barnes 2012-10-17 19:09 ` Daniel Vetter 2012-10-17 18:47 ` Eric Anholt 2 siblings, 1 reply; 14+ messages in thread From: Jesse Barnes @ 2012-10-17 16:31 UTC (permalink / raw) To: Chris Wilson; +Cc: intel-gfx On Wed, 17 Oct 2012 12:09:54 +0100 Chris Wilson <chris@chris-wilson.co.uk> wrote: > With the introduction of per-process GTT space, the hardware designers > thought it wise to also limit the ability to write to MMIO space to only > a "secure" batch buffer. The ability to rewrite registers is the only > way to program the hardware to perform certain operations like scanline > waits (required for tear-free windowed updates). So we either have a > choice of adding an interface to perform those synchronized updates > inside the kernel, or we permit certain processes the ability to write > to the "safe" registers from within its command stream. This patch > exposes the ability to submit a SECURE batch buffer to > DRM_ROOT_ONLY|DRM_MASTER processes. 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/i915/i915_dma.c | 3 +++ > drivers/gpu/drm/i915/i915_gem_execbuffer.c | 21 ++++++++++++++++++--- > drivers/gpu/drm/i915/i915_trace.h | 10 ++++++---- > drivers/gpu/drm/i915/intel_ringbuffer.c | 23 +++++++++++++++-------- > drivers/gpu/drm/i915/intel_ringbuffer.h | 4 +++- > include/drm/i915_drm.h | 6 ++++++ > 6 files changed, 51 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 5779e8f..f92c849 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1015,6 +1015,9 @@ static int i915_getparam(struct drm_device *dev, void *data, > case I915_PARAM_HAS_PRIME_VMAP_FLUSH: > value = 1; > break; > + case I915_PARAM_HAS_SECURE_BATCHES: > + value = capable(CAP_SYS_ADMIN); > + break; > default: > DRM_DEBUG_DRIVER("Unknown parameter %d\n", > param->param); > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > index bcd5aa2..4bbd088 100644 > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c > @@ -816,6 +816,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > u32 exec_start, exec_len; > u32 seqno; > u32 mask; > + u32 flags; > int ret, mode, i; > > if (!i915_gem_check_execbuffer(args)) { > @@ -827,6 +828,14 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > if (ret) > return ret; > > + flags = 0; > + if (args->flags & I915_EXEC_SECURE) { > + if (!file->is_master || !capable(CAP_SYS_ADMIN)) > + return -EPERM; > + > + flags |= I915_DISPATCH_SECURE; > + } > + > switch (args->flags & I915_EXEC_RING_MASK) { > case I915_EXEC_DEFAULT: > case I915_EXEC_RENDER: > @@ -999,6 +1008,9 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data, > } > batch_obj->base.pending_read_domains |= I915_GEM_DOMAIN_COMMAND; > > + if (flags & I915_DISPATCH_SECURE && 
!batch_obj->has_global_gtt_mapping)
> +		i915_gem_gtt_bind_object(batch_obj, batch_obj->cache_level);
> +
>  	ret = i915_gem_execbuffer_move_to_gpu(ring, &objects);
>  	if (ret)
>  		goto err;
> @@ -1044,7 +1056,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  			goto err;
>  	}
> 
> -	trace_i915_gem_ring_dispatch(ring, seqno);
> +	trace_i915_gem_ring_dispatch(ring, seqno, flags);
> 
>  	exec_start = batch_obj->gtt_offset + args->batch_start_offset;
>  	exec_len = args->batch_len;
> @@ -1056,12 +1068,15 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  				goto err;
> 
>  			ret = ring->dispatch_execbuffer(ring,
> -							exec_start, exec_len);
> +							exec_start, exec_len,
> +							flags);
>  			if (ret)
>  				goto err;
>  		}
>  	} else {
> -		ret = ring->dispatch_execbuffer(ring, exec_start, exec_len);
> +		ret = ring->dispatch_execbuffer(ring,
> +						exec_start, exec_len,
> +						flags);
>  		if (ret)
>  			goto err;
>  	}
> diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
> index 8134421..3db4a68 100644
> --- a/drivers/gpu/drm/i915/i915_trace.h
> +++ b/drivers/gpu/drm/i915/i915_trace.h
> @@ -229,24 +229,26 @@ TRACE_EVENT(i915_gem_evict_everything,
>  );
> 
>  TRACE_EVENT(i915_gem_ring_dispatch,
> -	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno),
> -	    TP_ARGS(ring, seqno),
> +	    TP_PROTO(struct intel_ring_buffer *ring, u32 seqno, u32 flags),
> +	    TP_ARGS(ring, seqno, flags),
> 
>  	    TP_STRUCT__entry(
>  			     __field(u32, dev)
>  			     __field(u32, ring)
>  			     __field(u32, seqno)
> +			     __field(u32, flags)
>  			     ),
> 
>  	    TP_fast_assign(
>  			   __entry->dev = ring->dev->primary->index;
>  			   __entry->ring = ring->id;
>  			   __entry->seqno = seqno;
> +			   __entry->flags = flags;
>  			   i915_trace_irq_get(ring, seqno);
>  			   ),
> 
> -	    TP_printk("dev=%u, ring=%u, seqno=%u",
> -		      __entry->dev, __entry->ring, __entry->seqno)
> +	    TP_printk("dev=%u, ring=%u, seqno=%u, flags=%x",
> +		      __entry->dev, __entry->ring, __entry->seqno, __entry->flags)
>  );
> 
>  TRACE_EVENT(i915_gem_ring_flush,
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 133beb6..6a4b8fa 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -964,7 +964,9 @@ gen6_ring_put_irq(struct intel_ring_buffer *ring)
>  }
> 
>  static int
> -i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length)
> +i965_dispatch_execbuffer(struct intel_ring_buffer *ring,
> +			 u32 offset, u32 length,
> +			 unsigned flags)
>  {
>  	int ret;
> 
> @@ -975,7 +977,7 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length)
>  	intel_ring_emit(ring,
>  			MI_BATCH_BUFFER_START |
>  			MI_BATCH_GTT |
> -			MI_BATCH_NON_SECURE_I965);
> +			(flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE_I965));
>  	intel_ring_emit(ring, offset);
>  	intel_ring_advance(ring);
> 
> @@ -984,7 +986,8 @@ i965_dispatch_execbuffer(struct intel_ring_buffer *ring, u32 offset, u32 length)
> 
>  static int
>  i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
> -			 u32 offset, u32 len)
> +			 u32 offset, u32 len,
> +			 unsigned flags)
>  {
>  	int ret;
> 
> @@ -993,7 +996,7 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  		return ret;
> 
>  	intel_ring_emit(ring, MI_BATCH_BUFFER);
> -	intel_ring_emit(ring, offset | MI_BATCH_NON_SECURE);
> +	intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE));
>  	intel_ring_emit(ring, offset + len - 8);
>  	intel_ring_emit(ring, 0);
>  	intel_ring_advance(ring);
> @@ -1003,7 +1006,8 @@ i830_dispatch_execbuffer(struct intel_ring_buffer *ring,
> 
>  static int
>  i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
> -			 u32 offset, u32 len)
> +			 u32 offset, u32 len,
> +			 unsigned flags)
>  {
>  	int ret;
> 
> @@ -1012,7 +1016,7 @@ i915_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  		return ret;
> 
>  	intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_GTT);
> -	intel_ring_emit(ring, offset | MI_BATCH_NON_SECURE);
> +	intel_ring_emit(ring, offset | (flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE));
>  	intel_ring_advance(ring);
> 
>  	return 0;
> @@ -1410,7 +1414,8 @@ static int gen6_ring_flush(struct intel_ring_buffer *ring,
> 
>  static int
>  gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
> -			      u32 offset, u32 len)
> +			      u32 offset, u32 len,
> +			      unsigned flags)
>  {
>  	int ret;
> 
> @@ -1418,7 +1423,9 @@ gen6_ring_dispatch_execbuffer(struct intel_ring_buffer *ring,
>  	if (ret)
>  		return ret;
> 
> -	intel_ring_emit(ring, MI_BATCH_BUFFER_START | MI_BATCH_NON_SECURE_I965);
> +	intel_ring_emit(ring,
> +			MI_BATCH_BUFFER_START |
> +			(flags & I915_DISPATCH_SECURE ? 0 : MI_BATCH_NON_SECURE_I965));
>  	/* bit0-7 is the length on GEN6+ */
>  	intel_ring_emit(ring, offset);
>  	intel_ring_advance(ring);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 40b252e..2667f33 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -81,7 +81,9 @@ struct intel_ring_buffer {
>  	u32		(*get_seqno)(struct intel_ring_buffer *ring,
>  				     bool lazy_coherency);
>  	int		(*dispatch_execbuffer)(struct intel_ring_buffer *ring,
> -					       u32 offset, u32 length);
> +					       u32 offset, u32 length,
> +					       unsigned flags);
> +#define I915_DISPATCH_SECURE 0x1
>  	void		(*cleanup)(struct intel_ring_buffer *ring);
>  	int		(*sync_to)(struct intel_ring_buffer *ring,
>  				   struct intel_ring_buffer *to,
> diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
> index acfb377..4f41f8d 100644
> --- a/include/drm/i915_drm.h
> +++ b/include/drm/i915_drm.h
> @@ -316,6 +316,7 @@ typedef struct drm_i915_irq_wait {
>  #define I915_PARAM_HAS_SEMAPHORES	 20
>  #define I915_PARAM_HAS_PRIME_VMAP_FLUSH	 21
>  #define I915_PARAM_RSVD_FOR_FUTURE_USE	 22
> +#define I915_PARAM_HAS_SECURE_BATCHES	 23
> 
>  typedef struct drm_i915_getparam {
>  	int param;
> @@ -694,6 +695,11 @@ struct drm_i915_gem_execbuffer2 {
>  /** Resets the SO write offset registers for transform feedback on gen7. */
>  #define I915_EXEC_GEN7_SOL_RESET	(1<<8)
> 
> +/** Request a privileged ("secure") batch buffer. Note only available for
> + * DRM_ROOT_ONLY | DRM_MASTER processes.
> + */
> +#define I915_EXEC_SECURE		(1<<9)
> +
>  #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
>  #define i915_execbuffer2_set_context_id(eb2, context) \
>  	(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK

Yeah looks good.

Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

-- 
Jesse Barnes, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 14+ messages in thread
* Re: [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers
  2012-10-17 16:31 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Jesse Barnes
@ 2012-10-17 19:09 ` Daniel Vetter
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2012-10-17 19:09 UTC (permalink / raw)
  To: Jesse Barnes; +Cc: intel-gfx

On Wed, Oct 17, 2012 at 09:31:21AM -0700, Jesse Barnes wrote:
> On Wed, 17 Oct 2012 12:09:54 +0100
> Chris Wilson <chris@chris-wilson.co.uk> wrote:
> 
> > With the introduction of per-process GTT space, the hardware designers
> > thought it wise to also limit the ability to write to MMIO space to only
> > a "secure" batch buffer. The ability to rewrite registers is the only
> > way to program the hardware to perform certain operations like scanline
> > waits (required for tear-free windowed updates). So we either have a
> > choice of adding an interface to perform those synchronized updates
> > inside the kernel, or we permit certain processes the ability to write
> > to the "safe" registers from within its command stream. This patch
> > exposes the ability to submit a SECURE batch buffer to
> > DRM_ROOT_ONLY|DRM_MASTER processes.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Yeah looks good.
> 
> Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>

Merged, and upgraded Jesse's r-b to the full patch with the hsw changes -
he wasn't on irc to deny it ;-).

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers
  2012-10-17 11:09 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson
  2012-10-17 11:09 ` [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits Chris Wilson
  2012-10-17 16:31 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Jesse Barnes
@ 2012-10-17 18:47 ` Eric Anholt
  2012-10-17 19:04 ` Daniel Vetter
  2 siblings, 1 reply; 14+ messages in thread
From: Eric Anholt @ 2012-10-17 18:47 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:
> With the introduction of per-process GTT space, the hardware designers
> thought it wise to also limit the ability to write to MMIO space to only
> a "secure" batch buffer. The ability to rewrite registers is the only
> way to program the hardware to perform certain operations like scanline
> waits (required for tear-free windowed updates). So we either have a
> choice of adding an interface to perform those synchronized updates
> inside the kernel, or we permit certain processes the ability to write
> to the "safe" registers from within its command stream. This patch
> exposes the ability to submit a SECURE batch buffer to
> DRM_ROOT_ONLY|DRM_MASTER processes.

This seems like a major blow to ever getting non-root X.
* Re: [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers
  2012-10-17 18:47 ` Eric Anholt
@ 2012-10-17 19:04 ` Daniel Vetter
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Vetter @ 2012-10-17 19:04 UTC (permalink / raw)
  To: Eric Anholt; +Cc: intel-gfx

On Wed, Oct 17, 2012 at 8:47 PM, Eric Anholt <eric@anholt.net> wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
>
>> With the introduction of per-process GTT space, the hardware designers
>> thought it wise to also limit the ability to write to MMIO space to only
>> a "secure" batch buffer. The ability to rewrite registers is the only
>> way to program the hardware to perform certain operations like scanline
>> waits (required for tear-free windowed updates). So we either have a
>> choice of adding an interface to perform those synchronized updates
>> inside the kernel, or we permit certain processes the ability to write
>> to the "safe" registers from within its command stream. This patch
>> exposes the ability to submit a SECURE batch buffer to
>> DRM_ROOT_ONLY|DRM_MASTER processes.
>
> This seems like a major blow to ever getting non-root X.

Currently we only need this to get scanline waits going on snb+, and on
ivb+ that will neatly keep the gpu out of rc6 while waiting for that
scanline. So it's rather clearly a massive legacy feature, ill supported
by the hw. That's the only thing the X driver will lose, for which I
highly suggest simply installing a pageflipping compositor of sorts and
calling it a day.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
* Re: [PATCH] drm/i915: Set DERRMR around batches required vblank events
  2012-10-16 14:15 ` Daniel Vetter
  2012-10-17 11:09 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson
@ 2012-10-17 13:58 ` Chris Wilson
  1 sibling, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2012-10-17 13:58 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, 16 Oct 2012 16:15:40 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Oct 16, 2012 at 11:47 AM, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> >
> > I am still not convinced this fixes anything as at the moment I am
> > simply unable to detect any tearing on my ivb setup if I apply any
> > form of vblank throttling.
> >
> > The other issue is that I think if I had a SECURE
> > (DRM_MASTER|DRM_ROOT_ONLY) batch buffer I could probably do all of
> > this in userspace.
> 
> Yeah, I'm a bit torn between this approach and just allowing SECURE
> batches. The big upside of secure batches is that it allows more
> flexibility (if the LRI really work from secure batches and not just
> from the ring). The only downside I can see is that we won't be able
> to arbitrate the derrmr bits (since only one is allowed to be set at
> any given time) between different userspace processes. But I don't see
> multiple concurrent display servers (with cooperative ownership of
> the hw) happening anytime soon, so I won't worry about this before it
> happens. Syncing against modeset should still work with our MI_WAIT
> related waits on fbs before we disable the pipe.

Ok, so it seems that with a SECURE batch buffer and the kernel fixing up
the gGTT-vs-ppGTT mess, it is very easy to program a scanline wait. (I
still haven't convinced myself that it does eliminate tearing, as even
without vsync my usual tricks to reproduce tearing work fine.) Being the
ego-centric user that I am, I have no qualms about making SECURE batches
MASTER|ROOT_ONLY to limit the possibility of someone leaving the
hardware in an undefined state.

> The other issue (only on gen7) is that this will keep the gpu out of
> rc6 (and hence the entire package) for long times, especially on
> mostly-idle video playback. I don't think that massively increased
> power consumption is what users will appreciate.

Yes, not much I can do about that, other than hope everyone migrates
over to a sane pageflipping architecture for their video playback. For
the rest, we just have to make sure TearFree delivers low power
consumption on future hardware.

> Now with the Tearfree option and you saying that vblank render
> throttling mostly fixes this, do we have any unhappy bug reporters
> left? In that case I'd prefer to just can this entirely (and suggest
> to ppl to use a real compositor - the wasted bw issue on X also seems
> to be on track to be solved).

Looking at the SECURE batches feature, I think that is a fair compromise
for enabling legacy features on current platforms. It should also prove
useful elsewhere, so I think it will stand the test of time. And in the
meantime we can continue to encourage users to migrate away from the
power hungry applications.
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
* [PATCH] drm/i915: Set DERRMR around batches required vblank events
@ 2012-07-26 20:30 Chris Wilson
  2012-08-29 15:33 ` Chris Wilson
  0 siblings, 1 reply; 14+ messages in thread
From: Chris Wilson @ 2012-07-26 20:30 UTC (permalink / raw)
  To: intel-gfx

In order for the Display Engine to send messages to the Render Engine,
that event must be unmasked in the Display Engine Render Response Mask
Register (DERRMR). By default all events are masked to prevent unwanted
messages and to conserve power, and it is strongly recommended that only
single events be unmasked at any time. So in order to pipeline the
register writes around individual batchbuffers, userspace must notify
the kernel when it requires a vblank event; this patch implements an
interface to do so with a single (pipe, event) request through the
execbuffer flags.

Note that another workaround is required for IVB, in that we must also
prevent RC6 sleeps whilst waiting upon an event. To do that we also
pipeline a forcewake into the second MT slot.

Open issues:
* How to handle failure to pipeline the register changes? Abort? Sync & undo?
* Serialising DERRMR across rings.
* Enabling vblank interrupts?

References: https://bugs.freedesktop.org/show_bug.cgi?id=50244
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_dma.c            |  3 ++
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 65 ++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_reg.h            |  1 +
 include/drm/i915_drm.h                     | 17 ++++++++
 4 files changed, 86 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 8921c4f..5debf31 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1009,6 +1009,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_WAIT_TIMEOUT:
 		value = 1;
 		break;
+	case I915_PARAM_HAS_EXEC_VSYNC:
+		value = 1;
+		break;
 	default:
 		DRM_DEBUG_DRIVER("Unknown parameter %d\n", param->param);
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d84ca98..90d8847 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1050,6 +1050,48 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	if (ret)
 		goto err;
 
+	if (args->flags & I915_EXEC_VSYNC_ENABLE &&
+	    INTEL_INFO(dev)->gen >= 6) {
+		int pipe, event;
+
+		printk(KERN_ERR "enabling sync\n");
+
+		/* XXX Inter-ring serialisation? */
+
+		event = I915_EXEC_VSYNC_GET_EVENT(args->flags);
+		if (event > I915_EXEC_VSYNC_VBLANK) {
+			ret = -EINVAL;
+			goto err;
+		}
+		pipe = I915_EXEC_VSYNC_GET_PIPE(args->flags);
+		if (pipe >= dev_priv->num_pipe) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		printk(KERN_ERR "event=%x, pipe=%d\n", event, pipe);
+		event = 1 << (event + 8 * pipe);
+
+		ret = intel_ring_begin(ring, 4);
+		if (ret)
+			goto err;
+
+		intel_ring_emit(ring, MI_NOOP);
+		intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+		intel_ring_emit(ring, DERRMR);
+		intel_ring_emit(ring, ~event);
+		intel_ring_advance(ring);
+
+		if (INTEL_INFO(dev)->gen >= 7 &&
+		    intel_ring_begin(ring, 4) == 0) {
+			intel_ring_emit(ring, MI_NOOP);
+			intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+			intel_ring_emit(ring, FORCEWAKE_MT);
+			intel_ring_emit(ring, _MASKED_BIT_ENABLE(2));
+			intel_ring_advance(ring);
+		}
+	}
+
 	trace_i915_gem_ring_dispatch(ring, seqno);
 
 	exec_start = batch_obj->gtt_offset + args->batch_start_offset;
@@ -1072,6 +1114,29 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		goto err;
 	}
 
+	if (args->flags & I915_EXEC_VSYNC_ENABLE &&
+	    INTEL_INFO(dev)->gen >= 6) {
+		bool was_interruptible = dev_priv->mm.interruptible;
+
+		dev_priv->mm.interruptible = false;
+		if (INTEL_INFO(dev)->gen >= 7 &&
+		    intel_ring_begin(ring, 4) == 0) {
+			intel_ring_emit(ring, MI_NOOP);
+			intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+			intel_ring_emit(ring, FORCEWAKE_MT);
+			intel_ring_emit(ring, _MASKED_BIT_DISABLE(2));
+			intel_ring_advance(ring);
+		}
+		if (intel_ring_begin(ring, 4) == 0) {
+			intel_ring_emit(ring, MI_NOOP);
+			intel_ring_emit(ring, MI_LOAD_REGISTER_IMM(1));
+			intel_ring_emit(ring, DERRMR);
+			intel_ring_emit(ring, 0xffffffff);
+			intel_ring_advance(ring);
+		}
+		dev_priv->mm.interruptible = was_interruptible;
+	}
+
 	i915_gem_execbuffer_move_to_active(&objects, ring, seqno);
 	i915_gem_execbuffer_retire_commands(dev, file, ring);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 1a22278..2f3a8f9 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -3369,6 +3369,7 @@
 #define DEIMR   0x44004
 #define DEIIR   0x44008
 #define DEIER   0x4400c
+#define DERRMR  0x44050
 
 /* GT interrupt.
  * Note that for gen6+ the ring-specific interrupt bits do alias with the
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 8cafced4..24f5895 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -313,6 +313,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_LLC		 17
 #define I915_PARAM_HAS_ALIASING_PPGTT	 18
 #define I915_PARAM_HAS_WAIT_TIMEOUT	 19
+#define I915_PARAM_HAS_EXEC_VSYNC	 20
 
 typedef struct drm_i915_getparam {
 	int param;
@@ -691,6 +692,22 @@ struct drm_i915_gem_execbuffer2 {
 /** Resets the SO write offset registers for transform feedback on gen7. */
 #define I915_EXEC_GEN7_SOL_RESET	(1<<8)
 
+#define I915_EXEC_VSYNC_ENABLE		(1<<9)
+#define I915_EXEC_VSYNC_EVENT_SHIFT	(10)
+#define I915_EXEC_VSYNC_EVENT_MASK	(7)
+#define I915_EXEC_VSYNC_SCANLINE	(0)
+#define I915_EXEC_VSYNC_PRIMARY_FLIP	(1)
+#define I915_EXEC_VSYNC_SPRITE_FLIP	(2)
+#define I915_EXEC_VSYNC_VBLANK		(3)
+#define I915_EXEC_VSYNC_PIPE_SHIFT	(13)
+#define I915_EXEC_VSYNC_PIPE_MASK	(7)
+
+#define I915_EXEC_VSYNC(pipe, event) \
+	(I915_EXEC_VSYNC_ENABLE | (pipe) << I915_EXEC_VSYNC_PIPE_SHIFT | (event) << I915_EXEC_VSYNC_EVENT_SHIFT)
+
+#define I915_EXEC_VSYNC_GET_PIPE(F) (((F) >> I915_EXEC_VSYNC_PIPE_SHIFT) & I915_EXEC_VSYNC_PIPE_MASK)
+#define I915_EXEC_VSYNC_GET_EVENT(F) (((F) >> I915_EXEC_VSYNC_EVENT_SHIFT) & I915_EXEC_VSYNC_EVENT_MASK)
+
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
 	(eb2).rsvd1 = context & I915_EXEC_CONTEXT_ID_MASK
-- 
1.7.10.4
* Re: [PATCH] drm/i915: Set DERRMR around batches required vblank events
  2012-07-26 20:30 Chris Wilson
@ 2012-08-29 15:33 ` Chris Wilson
  0 siblings, 0 replies; 14+ messages in thread
From: Chris Wilson @ 2012-08-29 15:33 UTC (permalink / raw)
  To: intel-gfx

On Thu, 26 Jul 2012 21:30:19 +0100, Chris Wilson <chris@chris-wilson.co.uk> wrote:
> In order for the Display Engine to send messages to the Render Engine,
> that event must be unmasked in the Display Engine Render Response Mask
> Register (DERRMR). By default all events are masked to prevent unwanted
> messages and to conserve power, and it is strongly recommended that only
> single events be unmasked at any time. So in order to pipeline the
> register writes around individual batchbuffers, userspace must notify
> the kernel when it requires a vblank event; this patch implements an
> interface to do so with a single (pipe, event) request through the
> execbuffer flags.
>
> Note that another workaround is required for IVB, in that we must also
> prevent RC6 sleeps whilst waiting upon an event. To do that we also
> pipeline a forcewake into the second MT slot.

Anyone have any criticisms for this patch?
-Chris
-- 
Chris Wilson, Intel Open Source Technology Centre
end of thread, other threads: [~2012-10-17 19:09 UTC | newest]

Thread overview: 14+ messages -- links below jump to the message on this page --
2012-10-16  9:47 [PATCH] drm/i915: Set DERRMR around batches required vblank events Chris Wilson
2012-10-16 14:15 ` Daniel Vetter
2012-10-17 11:09 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Chris Wilson
2012-10-17 11:09 ` [PATCH 2/2] drm/i915: Document the multi-threaded FORCEWAKE bits Chris Wilson
2012-10-17 16:27 ` Jesse Barnes
2012-10-17 17:05 ` Jesse Barnes
2012-10-17 19:10 ` Daniel Vetter
2012-10-17 16:31 ` [PATCH 1/2] drm/i915: Allow DRM_ROOT_ONLY|DRM_MASTER to submit privileged batchbuffers Jesse Barnes
2012-10-17 19:09 ` Daniel Vetter
2012-10-17 18:47 ` Eric Anholt
2012-10-17 19:04 ` Daniel Vetter
2012-10-17 13:58 ` [PATCH] drm/i915: Set DERRMR around batches required vblank events Chris Wilson
-- strict thread matches above, loose matches on Subject: below --
2012-07-26 20:30 Chris Wilson
2012-08-29 15:33 ` Chris Wilson