* [RFC] Updated Android sync patches
@ 2014-12-03 19:49 Jesse Barnes
2014-12-03 19:49 ` [PATCH 1/3] android: add sync_fence_create_dma Jesse Barnes
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Jesse Barnes @ 2014-12-03 19:49 UTC (permalink / raw)
To: intel-gfx; +Cc: Maarten Lankhorst
Still have a few remaining todo items, but I'd like some feedback on these
before adding a bunch of stuff on top.
Rough todo list:
- support for other rings
- support for display
- tests
- userspace usage (I have a Mesa patch that needs a refresh, but it's
trivial, the interesting bit will be using this in Wayland and
X/DDX)
Thanks,
Jesse
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH 1/3] android: add sync_fence_create_dma 2014-12-03 19:49 [RFC] Updated Android sync patches Jesse Barnes @ 2014-12-03 19:49 ` Jesse Barnes 2015-01-14 14:09 ` Tvrtko Ursulin 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes 2014-12-03 19:49 ` [PATCH 3/3] drm/i915: add fences to the request struct Jesse Barnes 2 siblings, 1 reply; 15+ messages in thread From: Jesse Barnes @ 2014-12-03 19:49 UTC (permalink / raw) To: intel-gfx; +Cc: Maarten Lankhorst From: Maarten Lankhorst <maarten.lankhorst@canonical.com> This allows users of dma fences to create a android fence. Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> --- drivers/staging/android/sync.c | 13 +++++++++---- drivers/staging/android/sync.h | 3 ++- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c index 7bdb62b..0ee333e 100644 --- a/drivers/staging/android/sync.c +++ b/drivers/staging/android/sync.c @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb) } /* TODO: implement a create which takes more that one sync_pt */ -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt) { struct sync_fence *fence; @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) fence->num_fences = 1; atomic_set(&fence->status, 1); - fence->cbs[0].sync_pt = &pt->base; + fence->cbs[0].sync_pt = pt; fence->cbs[0].fence = fence; - if (fence_add_callback(&pt->base, &fence->cbs[0].cb, - fence_check_cb_func)) + if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func)) atomic_dec(&fence->status); sync_fence_debug_add(fence); return fence; } +EXPORT_SYMBOL(sync_fence_create_dma); + +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) +{ + return sync_fence_create_dma(name, &pt->base); +} EXPORT_SYMBOL(sync_fence_create); struct sync_fence *sync_fence_fdget(int fd) diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h index a21b79f..50052e4 100644 --- a/drivers/staging/android/sync.h +++ b/drivers/staging/android/sync.h @@ -250,8 +250,9 @@ void sync_pt_free(struct sync_pt *pt); * @pt: sync_pt to add to the fence * * Creates a fence containg @pt. Once this is called, the fence takes - * ownership of @pt. + * a reference on @pt. */ +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt); struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt); /* -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] android: add sync_fence_create_dma 2014-12-03 19:49 ` [PATCH 1/3] android: add sync_fence_create_dma Jesse Barnes @ 2015-01-14 14:09 ` Tvrtko Ursulin 2015-01-14 16:04 ` Maarten Lankhorst 0 siblings, 1 reply; 15+ messages in thread From: Tvrtko Ursulin @ 2015-01-14 14:09 UTC (permalink / raw) To: Jesse Barnes, intel-gfx; +Cc: Maarten Lankhorst Hi, On 12/03/2014 07:49 PM, Jesse Barnes wrote: > From: Maarten Lankhorst <maarten.lankhorst@canonical.com> > > This allows users of dma fences to create a android fence. I couldn't figure out the motivation here vs. just exporting sync_fence_create ? Regards, Tvrtko > Cc: Daniel Vetter <daniel@ffwll.ch> > Cc: Jesse Barnes <jbarnes@virtuousgeek.org> > Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com> > --- > drivers/staging/android/sync.c | 13 +++++++++---- > drivers/staging/android/sync.h | 3 ++- > 2 files changed, 11 insertions(+), 5 deletions(-) > > diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c > index 7bdb62b..0ee333e 100644 > --- a/drivers/staging/android/sync.c > +++ b/drivers/staging/android/sync.c > @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb) > } > > /* TODO: implement a create which takes more that one sync_pt */ > -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) > +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt) > { > struct sync_fence *fence; > > @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) > fence->num_fences = 1; > atomic_set(&fence->status, 1); > > - fence->cbs[0].sync_pt = &pt->base; > + fence->cbs[0].sync_pt = pt; > fence->cbs[0].fence = fence; > - if (fence_add_callback(&pt->base, &fence->cbs[0].cb, > - fence_check_cb_func)) > + if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func)) > atomic_dec(&fence->status); > > sync_fence_debug_add(fence); > > return fence; > } > +EXPORT_SYMBOL(sync_fence_create_dma); > + > +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt) > +{ > + return sync_fence_create_dma(name, &pt->base); > +} > EXPORT_SYMBOL(sync_fence_create); > > struct sync_fence *sync_fence_fdget(int fd) > diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h > index a21b79f..50052e4 100644 > --- a/drivers/staging/android/sync.h > +++ b/drivers/staging/android/sync.h > @@ -250,8 +250,9 @@ void sync_pt_free(struct sync_pt *pt); > * @pt: sync_pt to add to the fence > * > * Creates a fence containg @pt. Once this is called, the fence takes > - * ownership of @pt. > + * a reference on @pt. > */ > +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt); > struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt); > > /* > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] android: add sync_fence_create_dma 2015-01-14 14:09 ` Tvrtko Ursulin @ 2015-01-14 16:04 ` Maarten Lankhorst 0 siblings, 0 replies; 15+ messages in thread From: Maarten Lankhorst @ 2015-01-14 16:04 UTC (permalink / raw) To: Tvrtko Ursulin, Jesse Barnes, intel-gfx Hey, On 14-01-15 15:09, Tvrtko Ursulin wrote: > > Hi, > > On 12/03/2014 07:49 PM, Jesse Barnes wrote: >> From: Maarten Lankhorst <maarten.lankhorst@canonical.com> >> >> This allows users of dma fences to create a android fence. > > I couldn't figure out the motivation here vs. just exporting sync_fence_create ? Keeping the api compatible with android. They were the first users of sync_fence_create, the conversion of android userspace fences to use fences in kernelspace was done later. ~Maarten _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-03 19:49 [RFC] Updated Android sync patches Jesse Barnes 2014-12-03 19:49 ` [PATCH 1/3] android: add sync_fence_create_dma Jesse Barnes @ 2014-12-03 19:49 ` Jesse Barnes 2014-12-03 21:29 ` Maarten Lankhorst ` (2 more replies) 2014-12-03 19:49 ` [PATCH 3/3] drm/i915: add fences to the request struct Jesse Barnes 2 siblings, 3 replies; 15+ messages in thread From: Jesse Barnes @ 2014-12-03 19:49 UTC (permalink / raw) To: intel-gfx; +Cc: Maarten Lankhorst Expose an ioctl to create Android fences based on the Android sync point infrastructure (which in turn is based on DMA-buf fences). Just a sketch at this point, no testing has been done. There are a couple of goals here: 1) allow applications and libraries to create fences without an associated buffer 2) re-use a common API so userspace doesn't have to impedance mismatch between different driver implementations too much 3) allow applications and libraries to use explicit synchronization if they choose by exposing fences directly v2: use struct fence directly using Maarten's new interface v3: use new i915 request structure (Jesse) make i915 fences a compile time option since Android support is still in staging (Jesse) check for request complete after arming IRQs (Chris) add timeline id to ioctl (Tvrtko) Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> --- drivers/gpu/drm/i915/Kconfig | 14 ++ drivers/gpu/drm/i915/Makefile | 1 + drivers/gpu/drm/i915/i915_dma.c | 5 + drivers/gpu/drm/i915/i915_drv.h | 20 +++ drivers/gpu/drm/i915/i915_gem.c | 5 + drivers/gpu/drm/i915/i915_irq.c | 4 +- drivers/gpu/drm/i915/i915_sync.c | 325 +++++++++++++++++++++++++++++++++++++++ include/uapi/drm/i915_drm.h | 23 +++ 8 files changed, 396 insertions(+), 1 deletion(-) create mode 100644 drivers/gpu/drm/i915/i915_sync.c diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig index 4e39ab3..6b23d17 100644 --- a/drivers/gpu/drm/i915/Kconfig +++ b/drivers/gpu/drm/i915/Kconfig @@ -43,6 +43,20 @@ config DRM_I915_KMS If in doubt, say "Y". +config DRM_I915_SYNC + bool "Enable explicit sync support" + depends on DRM_I915 + default y + select STAGING + select ANDROID + select SYNC + help + Choose this option to enable Android native sync support and the + corresponding i915 driver code to expose it. Slightly increases + driver size and pulls in sync support from staging. + + If in doubt, say "Y". + config DRM_I915_FBDEV bool "Enable legacy fbdev support for the modesetting intel driver" depends on DRM_I915 diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index e4083e4..45e155f 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -56,6 +56,7 @@ i915-y += intel_audio.o \ intel_sprite.o i915-$(CONFIG_ACPI) += intel_acpi.o intel_opregion.o i915-$(CONFIG_DRM_I915_FBDEV) += intel_fbdev.o +i915-$(CONFIG_DRM_I915_SYNC) += i915_sync.o # modesetting output/encoder code i915-y += dvo_ch7017.o \ diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c index 887d88f..621bd7f 100644 --- a/drivers/gpu/drm/i915/i915_dma.c +++ b/drivers/gpu/drm/i915/i915_dma.c @@ -1062,6 +1062,11 @@ const struct drm_ioctl_desc i915_ioctls[] = { DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), +#ifdef CONFIG_DRM_I915_SYNC + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, i915_sync_create_fence_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), +#else + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, drm_noop, DRM_UNLOCKED|DRM_RENDER_ALLOW), +#endif }; int i915_max_ioctl = ARRAY_SIZE(i915_ioctls); diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 049482f..367d95a 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -1525,6 +1525,9 @@ struct i915_workarounds { u32 count; }; +struct i915_sync_timeline; + + struct drm_i915_private { struct drm_device *dev; struct kmem_cache *slab; @@ -1560,6 +1563,9 @@ struct drm_i915_private { uint32_t last_seqno, next_seqno; struct drm_dma_handle *status_page_dmah; + + struct i915_sync_timeline *sync_tl[I915_NUM_RINGS]; + struct resource mch_res; /* protects the irq masks */ @@ -2509,6 +2515,20 @@ void i915_init_vm(struct drm_i915_private *dev_priv, void i915_gem_free_object(struct drm_gem_object *obj); void i915_gem_vma_destroy(struct i915_vma *vma); +/* i915_sync.c */ +#ifdef CONFIG_DRM_I915_SYNC +int i915_sync_init(struct drm_i915_private *dev_priv); +void i915_sync_fini(struct drm_i915_private *dev_priv); +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *file); +#else +static inline int i915_sync_init(struct drm_i915_private *dev_priv) +{ + return 0; +} +static inline void i915_sync_fini(struct drm_i915_private *dev_priv) { } +#endif + #define PIN_MAPPABLE 0x1 #define PIN_NONBLOCK 0x2 #define PIN_GLOBAL 0x4 diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 7a83a9f..39407b4 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -4887,6 +4887,9 @@ int i915_gem_init(struct drm_device *dev) atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter); ret = 0; } + + i915_sync_init(dev_priv); + mutex_unlock(&dev->struct_mutex); return ret; @@ -5011,6 +5014,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file) request->file_priv = NULL; } spin_unlock(&file_priv->mm.lock); + + i915_sync_fini(dev->dev_private); } static void diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c index 7913a72..d4ede83 100644 --- a/drivers/gpu/drm/i915/i915_irq.c +++ b/drivers/gpu/drm/i915/i915_irq.c @@ -33,6 +33,7 @@ #include <linux/circ_buf.h> #include <drm/drmP.h> #include <drm/i915_drm.h> +#include "../../../staging/android/sync.h" #include "i915_drv.h" #include "i915_trace.h" #include "intel_drv.h" @@ -2378,8 +2379,9 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv, */ /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */ - for_each_ring(ring, dev_priv, i) + for_each_ring(ring, dev_priv, i) { wake_up_all(&ring->irq_queue); + } /* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */ wake_up_all(&dev_priv->pending_flip_queue); diff --git a/drivers/gpu/drm/i915/i915_sync.c b/drivers/gpu/drm/i915/i915_sync.c new file mode 100644 index 0000000..4741b0c --- /dev/null +++ b/drivers/gpu/drm/i915/i915_sync.c @@ -0,0 +1,325 @@ +/* + * Copyright © 2014 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Jesse Barnes <jbarnes@virtuousgeek.org> + * + */ + +#include <drm/drmP.h> +#include <drm/drm_vma_manager.h> +#include <drm/i915_drm.h> +#include "i915_drv.h" +#include "i915_trace.h" +#include "intel_drv.h" +#include <linux/oom.h> +#include <linux/shmem_fs.h> +#include <linux/slab.h> +#include <linux/swap.h> +#include <linux/pci.h> +#include <linux/dma-buf.h> +#include "../../../staging/android/sync.h" + +/* Nothing really to protect here... */ +static spinlock_t fence_lock; + +/* + * i915 fences on sync timelines + * + * We implement sync points in terms of i915 seqnos. They're exposed + * through the new DRM_I915_GEM_FENCE ioctl, and can be mixed and matched + * with other Android timelines and aggregated into sync_fences, etc. + * + * TODO: + * rebase on top of Chris's seqno/request stuff and use requests + * allow non-RCS fences (need ring/context association) + */ + +struct i915_fence { + struct fence base; + struct intel_engine_cs *ring; + struct intel_context *ctx; + wait_queue_t wait; + struct drm_i915_gem_request *request; +}; + +#define to_intel_fence(x) container_of(x, struct i915_fence, base) + +int __wait_seqno(struct intel_engine_cs *ring, u32 seqno, + unsigned reset_counter, + bool interruptible, + struct timespec *timeout, + struct drm_i915_file_private *file_priv); + +static const char *i915_fence_get_driver_name(struct fence *fence) +{ + return "i915"; +} + +static const char *i915_fence_get_timeline_name(struct fence *fence) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + + return intel_fence->ring->name; +} + +static int i915_fence_check(wait_queue_t *wait, unsigned mode, int flags, + void *key) +{ + struct i915_fence *intel_fence = wait->private; + struct intel_engine_cs *ring = intel_fence->ring; + + if (!i915_gem_request_completed(intel_fence->request, false)) + return 0; + + fence_signal_locked(&intel_fence->base); + + __remove_wait_queue(&ring->irq_queue, wait); + fence_put(&intel_fence->base); + ring->irq_put(ring); + + return 0; +} + +static bool i915_fence_enable_signaling(struct fence *fence) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + struct intel_engine_cs *ring = intel_fence->ring; + struct drm_i915_private *dev_priv = ring->dev->dev_private; + wait_queue_t *wait = &intel_fence->wait; + + /* queue fence wait queue on irq queue and get fence */ + if (i915_gem_request_completed(intel_fence->request, false) || + i915_terminally_wedged(&dev_priv->gpu_error)) + return false; + + if (!ring->irq_get(ring)) + return false; + + if (i915_gem_request_completed(intel_fence->request, false)) { + ring->irq_put(ring); + return true; + } + + wait->flags = 0; + wait->private = intel_fence; + wait->func = i915_fence_check; + + __add_wait_queue(&ring->irq_queue, wait); + fence_get(fence); + + return true; +} + +static bool i915_fence_signaled(struct fence *fence) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + + return i915_gem_request_completed(intel_fence->request, false); +} + +static signed long i915_fence_wait(struct fence *fence, bool intr, + signed long timeout_js) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + struct drm_i915_private *dev_priv = intel_fence->ring->dev->dev_private; + int ret; + s64 timeout; + + timeout = jiffies_to_nsecs(timeout_js); + + ret = __i915_wait_request(intel_fence->request, + atomic_read(&dev_priv->gpu_error.reset_counter), + intr, &timeout, NULL); + if (ret == -ETIME) + return nsecs_to_jiffies(timeout); + + return ret; +} + +static int i915_fence_fill_driver_data(struct fence *fence, void *data, + int size) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + + if (size < sizeof(intel_fence->request->seqno)) + return -ENOMEM; + + memcpy(data, &intel_fence->request->seqno, + sizeof(intel_fence->request->seqno)); + + return sizeof(intel_fence->request->seqno); +} + +static void i915_fence_value_str(struct fence *fence, char *str, int size) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + + snprintf(str, size, "%u", intel_fence->request->seqno); +} + +static void i915_fence_timeline_value_str(struct fence *fence, char *str, + int size) +{ + struct i915_fence *intel_fence = to_intel_fence(fence); + struct intel_engine_cs *ring = intel_fence->ring; + + snprintf(str, size, "%u", ring->get_seqno(ring, false)); +} + +static struct fence_ops i915_fence_ops = { + .get_driver_name = i915_fence_get_driver_name, + .get_timeline_name = i915_fence_get_timeline_name, + .enable_signaling = i915_fence_enable_signaling, + .signaled = i915_fence_signaled, + .wait = i915_fence_wait, + .fill_driver_data = i915_fence_fill_driver_data, + .fence_value_str = i915_fence_value_str, + .timeline_value_str = i915_fence_timeline_value_str, +}; + +static struct fence *i915_fence_create(struct intel_engine_cs *ring, + struct intel_context *ctx) +{ + struct i915_fence *fence; + int ret; + + fence = kzalloc(sizeof(*fence), GFP_KERNEL); + if (!fence) + return NULL; + + ret = ring->add_request(ring); + if (ret) { + DRM_ERROR("add_request failed\n"); + fence_free((struct fence *)fence); + return NULL; + } + + fence->ring = ring; + fence->ctx = ctx; + fence->request = ring->outstanding_lazy_request; + fence_init(&fence->base, &i915_fence_ops, &fence_lock, ctx->user_handle, + fence->request->seqno); + + return &fence->base; +} + +/** + * i915_sync_create_fence_ioctl - fence creation function + * @dev: drm device + * @data: ioctl data + * @file: file struct + * + * This function creates a fence given a context and ring, and returns + * it to the caller in the form of a file descriptor. + * + * The returned descriptor is a sync fence fd, and can be used with all + * the usual sync fence operations (poll, ioctl, etc). + * + * The process fd limit should prevent an overallocation of fence objects, + * which need to be destroyed manually with a close() call. + */ +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, + struct drm_file *file) +{ + struct drm_i915_private *dev_priv = dev->dev_private; + struct drm_i915_gem_fence *fdata = data; + struct fence *fence; + struct sync_fence *sfence; + struct intel_engine_cs *ring; + struct intel_context *ctx; + u32 ctx_id = fdata->ctx_id; + int fd = get_unused_fd_flags(O_CLOEXEC); + int ret = 0; + + if (file == NULL) { + DRM_ERROR("no file priv?\n"); + return -EINVAL; + } + + ret = i915_mutex_lock_interruptible(dev); + if (ret) { + DRM_ERROR("mutex interrupted\n"); + goto out; + } + + ctx = i915_gem_context_get(file->driver_priv, ctx_id); + if (ctx == NULL) { + DRM_ERROR("context lookup failed\n"); + ret = -ENOENT; + goto err; + } + + ring = &dev_priv->ring[RCS]; + + if (!intel_ring_initialized(ring)) { + DRM_ERROR("ring not ready\n"); + ret = -EIO; + goto err; + } + + fence = i915_fence_create(ring, ctx); + if (!fence) { + ret = -ENOMEM; + goto err; + } + + fdata->name[sizeof(fdata->name) - 1] = '\0'; + sfence = sync_fence_create_dma(fdata->name, fence); + if (!sfence) { + ret = -ENOMEM; + goto err; + } + + fdata->fd = fd; + + sync_fence_install(sfence, fd); + + mutex_unlock(&dev->struct_mutex); +out: + return ret; + +err: + mutex_unlock(&dev->struct_mutex); + put_unused_fd(fd); + return ret; +} + +int i915_sync_init(struct drm_i915_private *dev_priv) +{ + struct intel_engine_cs *ring; + int i, ret = 0; + + for_each_ring(ring, dev_priv, i) { + /* FIXME: non-RCS fences */ + } + + return ret; +} + +void i915_sync_fini(struct drm_i915_private *dev_priv) +{ + int i; + + for (i = 0; i < I915_NUM_RINGS; i++) { + } +} diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h index 2502622..3127990 100644 --- a/include/uapi/drm/i915_drm.h +++ b/include/uapi/drm/i915_drm.h @@ -224,6 +224,7 @@ typedef struct _drm_i915_sarea { #define DRM_I915_REG_READ 0x31 #define DRM_I915_GET_RESET_STATS 0x32 #define DRM_I915_GEM_USERPTR 0x33 +#define DRM_I915_GEM_FENCE 0x34 #define DRM_IOCTL_I915_INIT DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t) #define DRM_IOCTL_I915_FLUSH DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH) @@ -275,6 +276,7 @@ typedef struct _drm_i915_sarea { #define DRM_IOCTL_I915_REG_READ DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read) #define DRM_IOCTL_I915_GET_RESET_STATS DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GET_RESET_STATS, struct drm_i915_reset_stats) #define DRM_IOCTL_I915_GEM_USERPTR DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr) +#define DRM_IOCTL_I915_GEM_FENCE DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_FENCE, struct drm_i915_gem_fence) /* Allow drivers to submit batchbuffers directly to hardware, relying * on the security mechanisms provided by hardware. @@ -1073,4 +1075,25 @@ struct drm_i915_gem_userptr { __u32 handle; }; +/** + * drm_i915_gem_fence - create a fence + * @fd: fd for fence + * @ctx_id: context ID for fence + * @flags: flags for operation + * + * Creates a fence in @fd and returns it to the caller. This fd can be + * passed around between processes as any other fd, and can be poll'd + * and read for status. + * + * RETURNS: + * A valid fd in the @fd field or an errno on error. + */ +struct drm_i915_gem_fence { + __s32 fd; + __u32 ctx_id; + __u32 timeline_id; + __u32 flags; + char name[32]; +}; + #endif /* _UAPI_I915_DRM_H_ */ -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes @ 2014-12-03 21:29 ` Maarten Lankhorst 2014-12-04 8:10 ` Chris Wilson 2014-12-04 11:21 ` Daniel Vetter 2 siblings, 0 replies; 15+ messages in thread From: Maarten Lankhorst @ 2014-12-03 21:29 UTC (permalink / raw) To: Jesse Barnes, intel-gfx On 03-12-14 20:49, Jesse Barnes wrote: > Expose an ioctl to create Android fences based on the Android sync point > infrastructure (which in turn is based on DMA-buf fences). Just a > sketch at this point, no testing has been done. > > There are a couple of goals here: > 1) allow applications and libraries to create fences without an > associated buffer > 2) re-use a common API so userspace doesn't have to impedance mismatch > between different driver implementations too much > 3) allow applications and libraries to use explicit synchronization if > they choose by exposing fences directly > > v2: use struct fence directly using Maarten's new interface > v3: use new i915 request structure (Jesse) > make i915 fences a compile time option since Android support is still > in staging (Jesse) > check for request complete after arming IRQs (Chris) > add timeline id to ioctl (Tvrtko) > > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > --- > drivers/gpu/drm/i915/Kconfig | 14 ++ > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/i915_dma.c | 5 + > drivers/gpu/drm/i915/i915_drv.h | 20 +++ > drivers/gpu/drm/i915/i915_gem.c | 5 + > drivers/gpu/drm/i915/i915_irq.c | 4 +- > drivers/gpu/drm/i915/i915_sync.c | 325 +++++++++++++++++++++++++++++++++++++++ > include/uapi/drm/i915_drm.h | 23 +++ > 8 files changed, 396 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/i915_sync.c > > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig > index 4e39ab3..6b23d17 100644 > --- a/drivers/gpu/drm/i915/Kconfig > +++ b/drivers/gpu/drm/i915/Kconfig > @@ -43,6 +43,20 @@ config DRM_I915_KMS > > If in doubt, say "Y". > > +config DRM_I915_SYNC > + bool "Enable explicit sync support" > + depends on DRM_I915 > + default y default n > + select STAGING shouldn't that be depends on STAGING rather? :P > + select ANDROID > + select SYNC > + help > + Choose this option to enable Android native sync support and the > + corresponding i915 driver code to expose it. Slightly increases > + driver size and pulls in sync support from staging. > + > + If in doubt, say "Y". "N" > + > config DRM_I915_FBDEV > bool "Enable legacy fbdev support for the modesetting intel driver" > depends on DRM_I915 > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e4083e4..45e155f 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -56,6 +56,7 @@ i915-y += intel_audio.o \ > intel_sprite.o > i915-$(CONFIG_ACPI) += intel_acpi.o intel_opregion.o > i915-$(CONFIG_DRM_I915_FBDEV) += intel_fbdev.o > +i915-$(CONFIG_DRM_I915_SYNC) += i915_sync.o > > # modesetting output/encoder code > i915-y += dvo_ch7017.o \ > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 887d88f..621bd7f 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1062,6 +1062,11 @@ const struct drm_ioctl_desc i915_ioctls[] = { > DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#ifdef CONFIG_DRM_I915_SYNC > + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, i915_sync_create_fence_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#else > + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, drm_noop, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#endif > }; > > int i915_max_ioctl = ARRAY_SIZE(i915_ioctls); > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h > index 049482f..367d95a 100644 > --- a/drivers/gpu/drm/i915/i915_drv.h > +++ b/drivers/gpu/drm/i915/i915_drv.h > @@ -1525,6 +1525,9 @@ struct i915_workarounds { > u32 count; > }; > > +struct i915_sync_timeline; > + > + Dead definition? > struct drm_i915_private { > struct drm_device *dev; > struct kmem_cache *slab; > @@ -1560,6 +1563,9 @@ struct drm_i915_private { > uint32_t last_seqno, next_seqno; > > struct drm_dma_handle *status_page_dmah; > + > + struct i915_sync_timeline *sync_tl[I915_NUM_RINGS]; Dead member? > struct resource mch_res; > > /* protects the irq masks */ > @@ -2509,6 +2515,20 @@ void i915_init_vm(struct drm_i915_private *dev_priv, > void i915_gem_free_object(struct drm_gem_object *obj); > void i915_gem_vma_destroy(struct i915_vma *vma); > > +/* i915_sync.c */ > +#ifdef CONFIG_DRM_I915_SYNC > +int i915_sync_init(struct drm_i915_private *dev_priv); > +void i915_sync_fini(struct drm_i915_private *dev_priv); > +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file); > +#else > +static inline int i915_sync_init(struct drm_i915_private *dev_priv) > +{ > + return 0; > +} > +static inline void i915_sync_fini(struct drm_i915_private *dev_priv) { } > +#endif > + > #define PIN_MAPPABLE 0x1 > #define PIN_NONBLOCK 0x2 > #define PIN_GLOBAL 0x4 > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 7a83a9f..39407b4 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4887,6 +4887,9 @@ int i915_gem_init(struct drm_device *dev) > atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter); > ret = 0; > } > + > + i915_sync_init(dev_priv); > + > mutex_unlock(&dev->struct_mutex); > > return ret; > @@ -5011,6 +5014,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file) > request->file_priv = NULL; > } > spin_unlock(&file_priv->mm.lock); > + > + i915_sync_fini(dev->dev_private); > } > > static void > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 7913a72..d4ede83 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -33,6 +33,7 @@ > #include <linux/circ_buf.h> > #include <drm/drmP.h> > #include <drm/i915_drm.h> > +#include "../../../staging/android/sync.h" > #include "i915_drv.h" > #include "i915_trace.h" > #include "intel_drv.h" > @@ -2378,8 +2379,9 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv, > */ > > /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */ > - for_each_ring(ring, dev_priv, i) > + for_each_ring(ring, dev_priv, i) { > wake_up_all(&ring->irq_queue); > + } > > /* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */ > wake_up_all(&dev_priv->pending_flip_queue); > diff --git a/drivers/gpu/drm/i915/i915_sync.c b/drivers/gpu/drm/i915/i915_sync.c > new file mode 100644 > index 0000000..4741b0c > --- /dev/null > +++ b/drivers/gpu/drm/i915/i915_sync.c > @@ -0,0 +1,325 @@ > +/* > + * Copyright © 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > + * IN THE SOFTWARE. > + * > + * Authors: > + * Jesse Barnes <jbarnes@virtuousgeek.org> > + * > + */ > + > +#include <drm/drmP.h> > +#include <drm/drm_vma_manager.h> > +#include <drm/i915_drm.h> > +#include "i915_drv.h" > +#include "i915_trace.h" > +#include "intel_drv.h" > +#include <linux/oom.h> > +#include <linux/shmem_fs.h> > +#include <linux/slab.h> > +#include <linux/swap.h> > +#include <linux/pci.h> > +#include <linux/dma-buf.h> > +#include "../../../staging/android/sync.h" > + > +/* Nothing really to protect here... */ > +static spinlock_t fence_lock; > + > +/* > + * i915 fences on sync timelines > + * > + * We implement sync points in terms of i915 seqnos. They're exposed > + * through the new DRM_I915_GEM_FENCE ioctl, and can be mixed and matched > + * with other Android timelines and aggregated into sync_fences, etc. > + * > + * TODO: > + * rebase on top of Chris's seqno/request stuff and use requests > + * allow non-RCS fences (need ring/context association) > + */ > + > +struct i915_fence { > + struct fence base; > + struct intel_engine_cs *ring; > + struct intel_context *ctx; > + wait_queue_t wait; > + struct drm_i915_gem_request *request; > +}; > + > +#define to_intel_fence(x) container_of(x, struct i915_fence, base) > + > +int __wait_seqno(struct intel_engine_cs *ring, u32 seqno, > + unsigned reset_counter, > + bool interruptible, > + struct timespec *timeout, > + struct drm_i915_file_private *file_priv); > + > +static const char *i915_fence_get_driver_name(struct fence *fence) > +{ > + return "i915"; > +} > + > +static const char *i915_fence_get_timeline_name(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + return intel_fence->ring->name; > +} > + > +static int i915_fence_check(wait_queue_t *wait, unsigned mode, int flags, > + void *key) > +{ > + struct i915_fence *intel_fence = wait->private; > + struct intel_engine_cs *ring = intel_fence->ring; > + > + if (!i915_gem_request_completed(intel_fence->request, false)) > + return 0; > + > + fence_signal_locked(&intel_fence->base); > + > + __remove_wait_queue(&ring->irq_queue, wait); > + fence_put(&intel_fence->base); > + ring->irq_put(ring); > + > + return 0; > +} > + > +static bool i915_fence_enable_signaling(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct intel_engine_cs *ring = intel_fence->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + wait_queue_t *wait = &intel_fence->wait; > + > + /* queue fence wait queue on irq queue and get fence */ > + if (i915_gem_request_completed(intel_fence->request, false) || > + i915_terminally_wedged(&dev_priv->gpu_error)) > + return false; > + > + if (!ring->irq_get(ring)) > + return false; > + > + if (i915_gem_request_completed(intel_fence->request, false)) { > + ring->irq_put(ring); > + return true; > + } > + > + wait->flags = 0; > + wait->private = intel_fence; > + wait->func = i915_fence_check; > + > + __add_wait_queue(&ring->irq_queue, wait); > + fence_get(fence); > + > + return true; > +} > + > +static bool i915_fence_signaled(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + return i915_gem_request_completed(intel_fence->request, false); > +} > + > +static signed long i915_fence_wait(struct fence *fence, bool intr, > + signed long timeout_js) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct drm_i915_private *dev_priv = intel_fence->ring->dev->dev_private; > + int ret; > + s64 timeout; > + > + timeout = jiffies_to_nsecs(timeout_js); > + > + ret = __i915_wait_request(intel_fence->request, > + atomic_read(&dev_priv->gpu_error.reset_counter), > + intr, &timeout, NULL); > + if (ret == -ETIME) > + return nsecs_to_jiffies(timeout); > + > + return ret; > +} > + > +static int i915_fence_fill_driver_data(struct fence *fence, void *data, > + int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + if (size < sizeof(intel_fence->request->seqno)) > + return -ENOMEM; > + > + memcpy(data, &intel_fence->request->seqno, > + sizeof(intel_fence->request->seqno)); > + > + return sizeof(intel_fence->request->seqno); > +} > + > +static void i915_fence_value_str(struct fence *fence, char *str, int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + snprintf(str, size, "%u", intel_fence->request->seqno); > +} > + > +static void i915_fence_timeline_value_str(struct fence *fence, char *str, > + int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct intel_engine_cs *ring = intel_fence->ring; > + > + snprintf(str, size, "%u", ring->get_seqno(ring, false)); > +} > + > +static struct fence_ops i915_fence_ops = { > + .get_driver_name = i915_fence_get_driver_name, > + .get_timeline_name = i915_fence_get_timeline_name, > + .enable_signaling = i915_fence_enable_signaling, > + .signaled = i915_fence_signaled, > + .wait = i915_fence_wait, > + .fill_driver_data = i915_fence_fill_driver_data, > + .fence_value_str = i915_fence_value_str, > + .timeline_value_str = i915_fence_timeline_value_str, > +}; > + > +static struct fence *i915_fence_create(struct intel_engine_cs *ring, > + struct intel_context *ctx) > +{ > + struct i915_fence *fence; > + int ret; > + > + fence = kzalloc(sizeof(*fence), GFP_KERNEL); > + if (!fence) > + return NULL; > + > + ret = ring->add_request(ring); > + if (ret) { > + DRM_ERROR("add_request failed\n"); > + fence_free((struct fence *)fence); > + return NULL; > + } > + > + fence->ring = ring; > + fence->ctx = ctx; > + fence->request = ring->outstanding_lazy_request; > + fence_init(&fence->base, &i915_fence_ops, &fence_lock, ctx->user_handle, > + fence->request->seqno); > + > + return &fence->base; > +} > + > +/** > + * i915_sync_create_fence_ioctl - fence creation function > + * @dev: drm device > + * @data: ioctl data > + * @file: file struct > + * > + * This function creates a fence given a context and ring, and returns > + * it to the caller in the form of a file descriptor. > + * > + * The returned descriptor is a sync fence fd, and can be used with all > + * the usual sync fence operations (poll, ioctl, etc). > + * > + * The process fd limit should prevent an overallocation of fence objects, > + * which need to be destroyed manually with a close() call. > + */ > +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file) > +{ > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct drm_i915_gem_fence *fdata = data; > + struct fence *fence; > + struct sync_fence *sfence; > + struct intel_engine_cs *ring; > + struct intel_context *ctx; > + u32 ctx_id = fdata->ctx_id; > + int fd = get_unused_fd_flags(O_CLOEXEC); > + int ret = 0; > + > + if (file == NULL) { > + DRM_ERROR("no file priv?\n"); > + return -EINVAL; > + } > + > + ret = i915_mutex_lock_interruptible(dev); > + if (ret) { > + DRM_ERROR("mutex interrupted\n"); > + goto out; > + } > + > + ctx = i915_gem_context_get(file->driver_priv, ctx_id); > + if (ctx == NULL) { > + DRM_ERROR("context lookup failed\n"); > + ret = -ENOENT; > + goto err; > + } > + > + ring = &dev_priv->ring[RCS]; > + > + if (!intel_ring_initialized(ring)) { > + DRM_ERROR("ring not ready\n"); > + ret = -EIO; > + goto err; > + } > + > + fence = i915_fence_create(ring, ctx); > + if (!fence) { > + ret = -ENOMEM; > + goto err; > + } > + > + fdata->name[sizeof(fdata->name) - 1] = '\0'; > + sfence = sync_fence_create_dma(fdata->name, fence); > + if (!sfence) { > + ret = -ENOMEM; > + goto err; > + } > + > + fdata->fd = fd; > + > + sync_fence_install(sfence, fd); > + > + mutex_unlock(&dev->struct_mutex); > +out: > + return ret; > + > +err: > + mutex_unlock(&dev->struct_mutex); > + put_unused_fd(fd); > + return ret; > +} > + > +int i915_sync_init(struct drm_i915_private *dev_priv) > +{ > + struct intel_engine_cs *ring; > + int i, ret = 0; > + > + for_each_ring(ring, dev_priv, i) { > + /* FIXME: non-RCS fences */ > + } > + > + return ret; > +} > + > +void i915_sync_fini(struct drm_i915_private *dev_priv) > +{ > + int i; > + > + for (i = 0; i < I915_NUM_RINGS; i++) { > + } > +} Do you have a need for sync_init/fini? The rest looks good to me. Code's cleaned up a lot. :-) ~Maarten _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes 2014-12-03 21:29 ` Maarten Lankhorst @ 2014-12-04 8:10 ` Chris Wilson 2014-12-04 9:03 ` Chris Wilson 2014-12-04 11:21 ` Daniel Vetter 2 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2014-12-04 8:10 UTC (permalink / raw) To: Jesse Barnes; +Cc: Maarten Lankhorst, intel-gfx On Wed, Dec 03, 2014 at 11:49:06AM -0800, Jesse Barnes wrote: > Expose an ioctl to create Android fences based on the Android sync point > infrastructure (which in turn is based on DMA-buf fences). Just a > sketch at this point, no testing has been done. > > There are a couple of goals here: > 1) allow applications and libraries to create fences without an > associated buffer > 2) re-use a common API so userspace doesn't have to impedance mismatch > between different driver implementations too much > 3) allow applications and libraries to use explicit synchronization if > they choose by exposing fences directly > > v2: use struct fence directly using Maarten's new interface > v3: use new i915 request structure (Jesse) > make i915 fences a compile time option since Android support is still > in staging (Jesse) > check for request complete after arming IRQs (Chris) > add timeline id to ioctl (Tvrtko) > > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > --- > drivers/gpu/drm/i915/Kconfig | 14 ++ > drivers/gpu/drm/i915/Makefile | 1 + > drivers/gpu/drm/i915/i915_dma.c | 5 + > drivers/gpu/drm/i915/i915_drv.h | 20 +++ > drivers/gpu/drm/i915/i915_gem.c | 5 + > drivers/gpu/drm/i915/i915_irq.c | 4 +- > drivers/gpu/drm/i915/i915_sync.c | 325 +++++++++++++++++++++++++++++++++++++++ > include/uapi/drm/i915_drm.h | 23 +++ > 8 files changed, 396 insertions(+), 1 deletion(-) > create mode 100644 drivers/gpu/drm/i915/i915_sync.c > > diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig > index 4e39ab3..6b23d17 100644 > --- a/drivers/gpu/drm/i915/Kconfig > +++ b/drivers/gpu/drm/i915/Kconfig > @@ -43,6 +43,20 @@ config DRM_I915_KMS > > If in doubt, say "Y". > > +config DRM_I915_SYNC > + bool "Enable explicit sync support" > + depends on DRM_I915 > + default y > + select STAGING > + select ANDROID > + select SYNC > + help > + Choose this option to enable Android native sync support and the > + corresponding i915 driver code to expose it. Slightly increases > + driver size and pulls in sync support from staging. > + > + If in doubt, say "Y". > + > config DRM_I915_FBDEV > bool "Enable legacy fbdev support for the modesetting intel driver" > depends on DRM_I915 > diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile > index e4083e4..45e155f 100644 > --- a/drivers/gpu/drm/i915/Makefile > +++ b/drivers/gpu/drm/i915/Makefile > @@ -56,6 +56,7 @@ i915-y += intel_audio.o \ > intel_sprite.o > i915-$(CONFIG_ACPI) += intel_acpi.o intel_opregion.o > i915-$(CONFIG_DRM_I915_FBDEV) += intel_fbdev.o > +i915-$(CONFIG_DRM_I915_SYNC) += i915_sync.o > > # modesetting output/encoder code > i915-y += dvo_ch7017.o \ > diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c > index 887d88f..621bd7f 100644 > --- a/drivers/gpu/drm/i915/i915_dma.c > +++ b/drivers/gpu/drm/i915/i915_dma.c > @@ -1062,6 +1062,11 @@ const struct drm_ioctl_desc i915_ioctls[] = { > DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GET_RESET_STATS, i915_get_reset_stats_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#ifdef CONFIG_DRM_I915_SYNC > + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, i915_sync_create_fence_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#else > + DRM_IOCTL_DEF_DRV(I915_GEM_FENCE, drm_noop, DRM_UNLOCKED|DRM_RENDER_ALLOW), > +#endif This can be done via the ifdef trickery in the header. > +struct i915_sync_timeline; > + > + > struct drm_i915_private { > struct drm_device *dev; > struct kmem_cache *slab; > @@ -1560,6 +1563,9 @@ struct drm_i915_private { > uint32_t last_seqno, next_seqno; > > struct drm_dma_handle *status_page_dmah; > + > + struct i915_sync_timeline *sync_tl[I915_NUM_RINGS]; If I understand correctly, the fence timeline are indeed per-ring, but here you set them up (and name them) on a per-engine basis. > + > struct resource mch_res; > > /* protects the irq masks */ > @@ -2509,6 +2515,20 @@ void i915_init_vm(struct drm_i915_private *dev_priv, > void i915_gem_free_object(struct drm_gem_object *obj); > void i915_gem_vma_destroy(struct i915_vma *vma); > > +/* i915_sync.c */ > +#ifdef CONFIG_DRM_I915_SYNC > +int i915_sync_init(struct drm_i915_private *dev_priv); > +void i915_sync_fini(struct drm_i915_private *dev_priv); > +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file); > +#else > +static inline int i915_sync_init(struct drm_i915_private *dev_priv) > +{ > + return 0; > +} > +static inline void i915_sync_fini(struct drm_i915_private *dev_priv) { } > +#endif > + > #define PIN_MAPPABLE 0x1 > #define PIN_NONBLOCK 0x2 > #define PIN_GLOBAL 0x4 > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c > index 7a83a9f..39407b4 100644 > --- a/drivers/gpu/drm/i915/i915_gem.c > +++ b/drivers/gpu/drm/i915/i915_gem.c > @@ -4887,6 +4887,9 @@ int i915_gem_init(struct drm_device *dev) > atomic_set_mask(I915_WEDGED, &dev_priv->gpu_error.reset_counter); > ret = 0; > } > + > + i915_sync_init(dev_priv); > + > mutex_unlock(&dev->struct_mutex); > > return ret; > @@ -5011,6 +5014,8 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file) > request->file_priv = NULL; > } > spin_unlock(&file_priv->mm.lock); > + > + i915_sync_fini(dev->dev_private); > } > > static void > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c > index 7913a72..d4ede83 100644 > --- a/drivers/gpu/drm/i915/i915_irq.c > +++ b/drivers/gpu/drm/i915/i915_irq.c > @@ -33,6 +33,7 @@ > #include <linux/circ_buf.h> > #include <drm/drmP.h> > #include <drm/i915_drm.h> > +#include "../../../staging/android/sync.h" > #include "i915_drv.h" > #include "i915_trace.h" > #include "intel_drv.h" > @@ -2378,8 +2379,9 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv, > */ > > /* Wake up __wait_seqno, potentially holding dev->struct_mutex. */ > - for_each_ring(ring, dev_priv, i) > + for_each_ring(ring, dev_priv, i) { > wake_up_all(&ring->irq_queue); > + } > > /* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */ > wake_up_all(&dev_priv->pending_flip_queue); > diff --git a/drivers/gpu/drm/i915/i915_sync.c b/drivers/gpu/drm/i915/i915_sync.c > new file mode 100644 > index 0000000..4741b0c > --- /dev/null > +++ b/drivers/gpu/drm/i915/i915_sync.c > @@ -0,0 +1,325 @@ > +/* > + * Copyright © 2014 Intel Corporation > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS > + * IN THE SOFTWARE. > + * > + * Authors: > + * Jesse Barnes <jbarnes@virtuousgeek.org> > + * > + */ > + > +#include <drm/drmP.h> > +#include <drm/drm_vma_manager.h> > +#include <drm/i915_drm.h> > +#include "i915_drv.h" > +#include "i915_trace.h" > +#include "intel_drv.h" > +#include <linux/oom.h> > +#include <linux/shmem_fs.h> > +#include <linux/slab.h> > +#include <linux/swap.h> > +#include <linux/pci.h> > +#include <linux/dma-buf.h> > +#include "../../../staging/android/sync.h" > + > +/* Nothing really to protect here... */ > +static spinlock_t fence_lock; > + > +/* > + * i915 fences on sync timelines > + * > + * We implement sync points in terms of i915 seqnos. They're exposed > + * through the new DRM_I915_GEM_FENCE ioctl, and can be mixed and matched > + * with other Android timelines and aggregated into sync_fences, etc. > + * > + * TODO: > + * rebase on top of Chris's seqno/request stuff and use requests > + * allow non-RCS fences (need ring/context association) > + */ > + > +struct i915_fence { > + struct fence base; > + struct intel_engine_cs *ring; > + struct intel_context *ctx; > + wait_queue_t wait; > + struct drm_i915_gem_request *request; > +}; > + > +#define to_intel_fence(x) container_of(x, struct i915_fence, base) > + > +int __wait_seqno(struct intel_engine_cs *ring, u32 seqno, > + unsigned reset_counter, > + bool interruptible, > + struct timespec *timeout, > + struct drm_i915_file_private *file_priv); > + > +static const char *i915_fence_get_driver_name(struct fence *fence) > +{ > + return "i915"; > +} > + > +static const char *i915_fence_get_timeline_name(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + return intel_fence->ring->name; > +} > + > +static int i915_fence_check(wait_queue_t *wait, unsigned mode, int flags, > + void *key) > +{ > + struct i915_fence *intel_fence = wait->private; > + struct intel_engine_cs *ring = intel_fence->ring; > + > + if (!i915_gem_request_completed(intel_fence->request, false)) > + return 0; > + > + fence_signal_locked(&intel_fence->base); > + > + __remove_wait_queue(&ring->irq_queue, wait); > + fence_put(&intel_fence->base); > + ring->irq_put(ring); > + > + return 0; > +} > + > +static bool i915_fence_enable_signaling(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct intel_engine_cs *ring = intel_fence->ring; > + struct drm_i915_private *dev_priv = ring->dev->dev_private; > + wait_queue_t *wait = &intel_fence->wait; > + > + /* queue fence wait queue on irq queue and get fence */ > + if (i915_gem_request_completed(intel_fence->request, false) || > + i915_terminally_wedged(&dev_priv->gpu_error)) > + return false; > + > + if (!ring->irq_get(ring)) > + return false; > + > + if (i915_gem_request_completed(intel_fence->request, false)) { > + ring->irq_put(ring); > + return true; > + } > + > + wait->flags = 0; > + wait->private = intel_fence; > + wait->func = i915_fence_check; > + > + __add_wait_queue(&ring->irq_queue, wait); > + fence_get(fence); > + > + return true; > +} > + > +static bool i915_fence_signaled(struct fence *fence) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + return i915_gem_request_completed(intel_fence->request, false); > +} > + > +static signed long i915_fence_wait(struct fence *fence, bool intr, > + signed long timeout_js) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct drm_i915_private *dev_priv = intel_fence->ring->dev->dev_private; > + int ret; > + s64 timeout; > + > + timeout = jiffies_to_nsecs(timeout_js); > + > + ret = __i915_wait_request(intel_fence->request, > + atomic_read(&dev_priv->gpu_error.reset_counter), > + intr, &timeout, NULL); > + if (ret == -ETIME) > + return nsecs_to_jiffies(timeout); > + > + return ret; > +} > + > +static int i915_fence_fill_driver_data(struct fence *fence, void *data, > + int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + if (size < sizeof(intel_fence->request->seqno)) > + return -ENOMEM; > + > + memcpy(data, &intel_fence->request->seqno, > + sizeof(intel_fence->request->seqno)); > + > + return sizeof(intel_fence->request->seqno); > +} > + > +static void i915_fence_value_str(struct fence *fence, char *str, int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + > + snprintf(str, size, "%u", intel_fence->request->seqno); > +} > + > +static void i915_fence_timeline_value_str(struct fence *fence, char *str, > + int size) > +{ > + struct i915_fence *intel_fence = to_intel_fence(fence); > + struct intel_engine_cs *ring = intel_fence->ring; > + > + snprintf(str, size, "%u", ring->get_seqno(ring, false)); > +} > + > +static struct fence_ops i915_fence_ops = { > + .get_driver_name = i915_fence_get_driver_name, > + .get_timeline_name = i915_fence_get_timeline_name, > + .enable_signaling = i915_fence_enable_signaling, > + .signaled = i915_fence_signaled, > + .wait = i915_fence_wait, > + .fill_driver_data = i915_fence_fill_driver_data, > + .fence_value_str = i915_fence_value_str, > + .timeline_value_str = i915_fence_timeline_value_str, > +}; > + > +static struct fence *i915_fence_create(struct intel_engine_cs *ring, > + struct intel_context *ctx) > +{ > + struct i915_fence *fence; > + int ret; > + > + fence = kzalloc(sizeof(*fence), GFP_KERNEL); > + if (!fence) > + return NULL; > + > + ret = ring->add_request(ring); > + if (ret) { > + DRM_ERROR("add_request failed\n"); > + fence_free((struct fence *)fence); > + return NULL; > + } > + > + fence->ring = ring; > + fence->ctx = ctx; > + fence->request = ring->outstanding_lazy_request; Pardon? This is horribly, horribly broken. (Emitting the same breadcrumb/interrupt multiple times without coordination.) What are the fence semantics with flushing top/bottom of pipe? I think you what something along the lines of: rq = i915_request_create(ctx, engine); if (IS_ERR(rq)) { free(fence); return ERR_CAST(rq); } /* breadcrumb will insert all necessary flushes and barriers for fence * completion, or if you want to be explicit: i915_request_emit_flush(rq, I915_COMMAND_BARRIER | I915_FLUSH_CACHES); */ ret = i915_request_emit_breadcrumb(rq); if (ret == 0) ret = i915_request_commit(rq); if (ret) { i915_request_put(rq); free(fence); return ERR_PTR(ret); } fence->rq = rq; and rq->ctx, rq->engine, rq->ring already exist and so the redundant information can be removed from i915_fence. > + fence_init(&fence->base, &i915_fence_ops, &fence_lock, ctx->user_handle, > + fence->request->seqno); > + > + return &fence->base; > +} > + > +/** > + * i915_sync_create_fence_ioctl - fence creation function > + * @dev: drm device > + * @data: ioctl data > + * @file: file struct > + * > + * This function creates a fence given a context and ring, and returns > + * it to the caller in the form of a file descriptor. > + * > + * The returned descriptor is a sync fence fd, and can be used with all > + * the usual sync fence operations (poll, ioctl, etc). > + * > + * The process fd limit should prevent an overallocation of fence objects, > + * which need to be destroyed manually with a close() call. > + */ > +int i915_sync_create_fence_ioctl(struct drm_device *dev, void *data, > + struct drm_file *file) > +{ > + struct drm_i915_private *dev_priv = dev->dev_private; > + struct drm_i915_gem_fence *fdata = data; > + struct fence *fence; > + struct sync_fence *sfence; > + struct intel_engine_cs *ring; > + struct intel_context *ctx; > + u32 ctx_id = fdata->ctx_id; > + int fd = get_unused_fd_flags(O_CLOEXEC); > + int ret = 0; > + > + if (file == NULL) { > + DRM_ERROR("no file priv?\n"); > + return -EINVAL; > + } > + > + ret = i915_mutex_lock_interruptible(dev); > + if (ret) { > + DRM_ERROR("mutex interrupted\n"); > + goto out; > + } > + > + ctx = i915_gem_context_get(file->driver_priv, ctx_id); > + if (ctx == NULL) { > + DRM_ERROR("context lookup failed\n"); > + ret = -ENOENT; > + goto err; > + } > + > + ring = &dev_priv->ring[RCS]; Hmm. This looks unfortunately restrictive. Especially for an i915 custom ioctl. > + > + if (!intel_ring_initialized(ring)) { > + DRM_ERROR("ring not ready\n"); > + ret = -EIO; ret = -EINVAL; The ring is invalid, not just "not ready". > + goto err; > + } > + > + fence = i915_fence_create(ring, ctx); > + if (!fence) { IS_ERR(fence) itym. Especially as fence creation could fail with -ERESTARTSYS. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-04 8:10 ` Chris Wilson @ 2014-12-04 9:03 ` Chris Wilson 0 siblings, 0 replies; 15+ messages in thread From: Chris Wilson @ 2014-12-04 9:03 UTC (permalink / raw) To: Jesse Barnes, intel-gfx, Maarten Lankhorst On Thu, Dec 04, 2014 at 08:10:22AM +0000, Chris Wilson wrote: > If I understand correctly, the fence timeline are indeed per-ring, but > here you set them up (and name them) on a per-engine basis. On reflection, the timeline are per-fd, so using the engine name is not confusing after all. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes 2014-12-03 21:29 ` Maarten Lankhorst 2014-12-04 8:10 ` Chris Wilson @ 2014-12-04 11:21 ` Daniel Vetter 2014-12-04 14:57 ` Jesse Barnes 2 siblings, 1 reply; 15+ messages in thread From: Daniel Vetter @ 2014-12-04 11:21 UTC (permalink / raw) To: Jesse Barnes; +Cc: Maarten Lankhorst, intel-gfx On Wed, Dec 03, 2014 at 11:49:06AM -0800, Jesse Barnes wrote: > Expose an ioctl to create Android fences based on the Android sync point > infrastructure (which in turn is based on DMA-buf fences). Just a > sketch at this point, no testing has been done. > > There are a couple of goals here: > 1) allow applications and libraries to create fences without an > associated buffer > 2) re-use a common API so userspace doesn't have to impedance mismatch > between different driver implementations too much > 3) allow applications and libraries to use explicit synchronization if > they choose by exposing fences directly > > v2: use struct fence directly using Maarten's new interface > v3: use new i915 request structure (Jesse) > make i915 fences a compile time option since Android support is still > in staging (Jesse) > check for request complete after arming IRQs (Chris) > add timeline id to ioctl (Tvrtko) > > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> Same high-level comment as before: I still don't see the point in a separate get_fence ioctl when we want both in&out fences for execbuf (and later on atomic flips) for android. This separate ioctl here looks incomplete and just means you have to do 2 ioctls at least to actually get at the fence for an execbuf. And android sync stuff still needs to be de-staged. On that: The function names to wrap a struct fence into a syncpt file and then insert the file into an fd slot need a bit of name bikeshedding since it's not clear that it's just a userspace interface wrapper now. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] drm/i915: Android sync points for i915 v3 2014-12-04 11:21 ` Daniel Vetter @ 2014-12-04 14:57 ` Jesse Barnes 0 siblings, 0 replies; 15+ messages in thread From: Jesse Barnes @ 2014-12-04 14:57 UTC (permalink / raw) To: Daniel Vetter; +Cc: Maarten Lankhorst, intel-gfx On Thu, 4 Dec 2014 12:21:01 +0100 Daniel Vetter <daniel@ffwll.ch> wrote: > On Wed, Dec 03, 2014 at 11:49:06AM -0800, Jesse Barnes wrote: > > Expose an ioctl to create Android fences based on the Android sync point > > infrastructure (which in turn is based on DMA-buf fences). Just a > > sketch at this point, no testing has been done. > > > > There are a couple of goals here: > > 1) allow applications and libraries to create fences without an > > associated buffer > > 2) re-use a common API so userspace doesn't have to impedance mismatch > > between different driver implementations too much > > 3) allow applications and libraries to use explicit synchronization if > > they choose by exposing fences directly > > > > v2: use struct fence directly using Maarten's new interface > > v3: use new i915 request structure (Jesse) > > make i915 fences a compile time option since Android support is still > > in staging (Jesse) > > check for request complete after arming IRQs (Chris) > > add timeline id to ioctl (Tvrtko) > > > > Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> > > Same high-level comment as before: > > I still don't see the point in a separate get_fence ioctl when we want > both in&out fences for execbuf (and later on atomic flips) for android. > This separate ioctl here looks incomplete and just means you have to do 2 > ioctls at least to actually get at the fence for an execbuf. Yeah I'm working on converting over to putting the creation calls into execbuf instead; that also makes per-ring/context handling easier, and also the display stuff easier, since we won't have to worry about enumerating crtcs and planes as separate timelines. > And android sync stuff still needs to be de-staged. On that: The function > names to wrap a struct fence into a syncpt file and then insert the file > into an fd slot need a bit of name bikeshedding since it's not clear that > it's just a userspace interface wrapper now. I don't think we'll be able to de-stage without a user, but I can add a patch to do that as part of the series. -- Jesse Barnes, Intel Open Source Technology Center _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/3] drm/i915: add fences to the request struct 2014-12-03 19:49 [RFC] Updated Android sync patches Jesse Barnes 2014-12-03 19:49 ` [PATCH 1/3] android: add sync_fence_create_dma Jesse Barnes 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes @ 2014-12-03 19:49 ` Jesse Barnes 2014-12-04 9:13 ` Chris Wilson 2 siblings, 1 reply; 15+ messages in thread From: Jesse Barnes @ 2014-12-03 19:49 UTC (permalink / raw) To: intel-gfx; +Cc: Maarten Lankhorst This simplifies the sync code quite a bit. I don't think we'll be able to get away with using the core fence code's seqno support, since we'll be moving away from simple seqno comparisions with the scheduler and preemption, but the additional code is pretty minimal anyway, and lets us add additional debugging as needed, so it's probably fine to keep either way. We still need to add support for other rings here; we ought to be able to do that with the timeline field of the ioctl (which will include other "rings" like the display flip queue for example). Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org> --- drivers/gpu/drm/i915/i915_drv.h | 6 +++ drivers/gpu/drm/i915/i915_sync.c | 89 +++++++++++++++------------------------- 2 files changed, 40 insertions(+), 55 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h index 367d95a..2725243 100644 --- a/drivers/gpu/drm/i915/i915_drv.h +++ b/drivers/gpu/drm/i915/i915_drv.h @@ -49,6 +49,7 @@ #include <linux/intel-iommu.h> #include <linux/kref.h> #include <linux/pm_qos.h> +#include <linux/fence.h> /* General customization: */ @@ -2025,6 +2026,11 @@ struct drm_i915_gem_request { struct drm_i915_file_private *file_priv; /** file_priv list entry for this request */ struct list_head client_list; + + /* core fence obj for this request, may be exported */ + struct fence fence; + + wait_queue_t wait; }; void i915_gem_request_free(struct kref *req_ref); diff --git a/drivers/gpu/drm/i915/i915_sync.c b/drivers/gpu/drm/i915/i915_sync.c index 4741b0c..ea54b34 100644 --- a/drivers/gpu/drm/i915/i915_sync.c +++ b/drivers/gpu/drm/i915/i915_sync.c @@ -50,25 +50,10 @@ static spinlock_t fence_lock; * with other Android timelines and aggregated into sync_fences, etc. * * TODO: - * rebase on top of Chris's seqno/request stuff and use requests * allow non-RCS fences (need ring/context association) */ -struct i915_fence { - struct fence base; - struct intel_engine_cs *ring; - struct intel_context *ctx; - wait_queue_t wait; - struct drm_i915_gem_request *request; -}; - -#define to_intel_fence(x) container_of(x, struct i915_fence, base) - -int __wait_seqno(struct intel_engine_cs *ring, u32 seqno, - unsigned reset_counter, - bool interruptible, - struct timespec *timeout, - struct drm_i915_file_private *file_priv); +#define to_i915_request(x) container_of(x, struct drm_i915_gem_request, fence) static const char *i915_fence_get_driver_name(struct fence *fence) { @@ -77,24 +62,24 @@ static const char *i915_fence_get_driver_name(struct fence *fence) static const char *i915_fence_get_timeline_name(struct fence *fence) { - struct i915_fence *intel_fence = to_intel_fence(fence); + struct drm_i915_gem_request *req = to_i915_request(fence); - return intel_fence->ring->name; + return req->ring->name; } static int i915_fence_check(wait_queue_t *wait, unsigned mode, int flags, void *key) { - struct i915_fence *intel_fence = wait->private; - struct intel_engine_cs *ring = intel_fence->ring; + struct drm_i915_gem_request *req = wait->private; + struct intel_engine_cs *ring = req->ring; - if (!i915_gem_request_completed(intel_fence->request, false)) + if (!i915_gem_request_completed(req, false)) return 0; - fence_signal_locked(&intel_fence->base); + fence_signal_locked(&req->fence); __remove_wait_queue(&ring->irq_queue, wait); - fence_put(&intel_fence->base); + fence_put(&req->fence); ring->irq_put(ring); return 0; @@ -102,26 +87,26 @@ static int i915_fence_check(wait_queue_t *wait, unsigned mode, int flags, static bool i915_fence_enable_signaling(struct fence *fence) { - struct i915_fence *intel_fence = to_intel_fence(fence); - struct intel_engine_cs *ring = intel_fence->ring; + struct drm_i915_gem_request *req = to_i915_request(fence); + struct intel_engine_cs *ring = req->ring; struct drm_i915_private *dev_priv = ring->dev->dev_private; - wait_queue_t *wait = &intel_fence->wait; + wait_queue_t *wait = &req->wait; /* queue fence wait queue on irq queue and get fence */ - if (i915_gem_request_completed(intel_fence->request, false) || + if (i915_gem_request_completed(req, false) || i915_terminally_wedged(&dev_priv->gpu_error)) return false; if (!ring->irq_get(ring)) return false; - if (i915_gem_request_completed(intel_fence->request, false)) { + if (i915_gem_request_completed(req, false)) { ring->irq_put(ring); return true; } wait->flags = 0; - wait->private = intel_fence; + wait->private = req; wait->func = i915_fence_check; __add_wait_queue(&ring->irq_queue, wait); @@ -132,22 +117,22 @@ static bool i915_fence_enable_signaling(struct fence *fence) static bool i915_fence_signaled(struct fence *fence) { - struct i915_fence *intel_fence = to_intel_fence(fence); + struct drm_i915_gem_request *req = to_i915_request(fence); - return i915_gem_request_completed(intel_fence->request, false); + return i915_gem_request_completed(req, false); } static signed long i915_fence_wait(struct fence *fence, bool intr, signed long timeout_js) { - struct i915_fence *intel_fence = to_intel_fence(fence); - struct drm_i915_private *dev_priv = intel_fence->ring->dev->dev_private; + struct drm_i915_gem_request *req = to_i915_request(fence); + struct drm_i915_private *dev_priv = req->ring->dev->dev_private; int ret; s64 timeout; timeout = jiffies_to_nsecs(timeout_js); - ret = __i915_wait_request(intel_fence->request, + ret = __i915_wait_request(req, atomic_read(&dev_priv->gpu_error.reset_counter), intr, &timeout, NULL); if (ret == -ETIME) @@ -159,29 +144,29 @@ static signed long i915_fence_wait(struct fence *fence, bool intr, static int i915_fence_fill_driver_data(struct fence *fence, void *data, int size) { - struct i915_fence *intel_fence = to_intel_fence(fence); + struct drm_i915_gem_request *req = to_i915_request(fence); - if (size < sizeof(intel_fence->request->seqno)) + if (size < sizeof(req->seqno)) return -ENOMEM; - memcpy(data, &intel_fence->request->seqno, - sizeof(intel_fence->request->seqno)); + memcpy(data, &req->seqno, + sizeof(req->seqno)); - return sizeof(intel_fence->request->seqno); + return sizeof(req->seqno); } static void i915_fence_value_str(struct fence *fence, char *str, int size) { - struct i915_fence *intel_fence = to_intel_fence(fence); + struct drm_i915_gem_request *req = to_i915_request(fence); - snprintf(str, size, "%u", intel_fence->request->seqno); + snprintf(str, size, "%u", req->seqno); } static void i915_fence_timeline_value_str(struct fence *fence, char *str, int size) { - struct i915_fence *intel_fence = to_intel_fence(fence); - struct intel_engine_cs *ring = intel_fence->ring; + struct drm_i915_gem_request *req = to_i915_request(fence); + struct intel_engine_cs *ring = req->ring; snprintf(str, size, "%u", ring->get_seqno(ring, false)); } @@ -200,27 +185,21 @@ static struct fence_ops i915_fence_ops = { static struct fence *i915_fence_create(struct intel_engine_cs *ring, struct intel_context *ctx) { - struct i915_fence *fence; + struct drm_i915_gem_request *req; int ret; - fence = kzalloc(sizeof(*fence), GFP_KERNEL); - if (!fence) - return NULL; - ret = ring->add_request(ring); if (ret) { DRM_ERROR("add_request failed\n"); - fence_free((struct fence *)fence); return NULL; } - fence->ring = ring; - fence->ctx = ctx; - fence->request = ring->outstanding_lazy_request; - fence_init(&fence->base, &i915_fence_ops, &fence_lock, ctx->user_handle, - fence->request->seqno); + req = ring->outstanding_lazy_request; + + fence_init(&req->fence, &i915_fence_ops, &fence_lock, + ctx->user_handle, req->seqno); - return &fence->base; + return &req->fence; } /** -- 1.9.1 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] drm/i915: add fences to the request struct 2014-12-03 19:49 ` [PATCH 3/3] drm/i915: add fences to the request struct Jesse Barnes @ 2014-12-04 9:13 ` Chris Wilson 2014-12-04 11:05 ` Daniel Vetter 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2014-12-04 9:13 UTC (permalink / raw) To: Jesse Barnes; +Cc: Maarten Lankhorst, intel-gfx On Wed, Dec 03, 2014 at 11:49:07AM -0800, Jesse Barnes wrote: > This simplifies the sync code quite a bit. I don't think we'll be able > to get away with using the core fence code's seqno support, since we'll > be moving away from simple seqno comparisions with the scheduler and > preemption, but the additional code is pretty minimal anyway, and lets > us add additional debugging as needed, so it's probably fine to keep > either way. > > We still need to add support for other rings here; we ought to be able > to do that with the timeline field of the ioctl (which will include > other "rings" like the display flip queue for example). I am ambivalent about this. I don't think migrating i915_request over to use the heavyweight fence primitives is a good idea, so see little value in embedding the struct fence inside i915_request vs a small bookkeeping struct referencing i915_request. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] drm/i915: add fences to the request struct 2014-12-04 9:13 ` Chris Wilson @ 2014-12-04 11:05 ` Daniel Vetter 2014-12-04 11:29 ` Chris Wilson 0 siblings, 1 reply; 15+ messages in thread From: Daniel Vetter @ 2014-12-04 11:05 UTC (permalink / raw) To: Chris Wilson, Jesse Barnes, intel-gfx, Maarten Lankhorst On Thu, Dec 04, 2014 at 09:13:21AM +0000, Chris Wilson wrote: > On Wed, Dec 03, 2014 at 11:49:07AM -0800, Jesse Barnes wrote: > > This simplifies the sync code quite a bit. I don't think we'll be able > > to get away with using the core fence code's seqno support, since we'll > > be moving away from simple seqno comparisions with the scheduler and > > preemption, but the additional code is pretty minimal anyway, and lets > > us add additional debugging as needed, so it's probably fine to keep > > either way. > > > > We still need to add support for other rings here; we ought to be able > > to do that with the timeline field of the ioctl (which will include > > other "rings" like the display flip queue for example). > > I am ambivalent about this. I don't think migrating i915_request over to > use the heavyweight fence primitives is a good idea, so see little value > in embedding the struct fence inside i915_request vs a small bookkeeping > struct referencing i915_request. Which part of struct fence is too heavyweight? And I guess we could always add a slab or some cache if the allocation overhead is too much. I really like this conceptually so if there's a concern with overhead I want solid data for it. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] drm/i915: add fences to the request struct 2014-12-04 11:05 ` Daniel Vetter @ 2014-12-04 11:29 ` Chris Wilson 2014-12-04 12:58 ` Daniel Vetter 0 siblings, 1 reply; 15+ messages in thread From: Chris Wilson @ 2014-12-04 11:29 UTC (permalink / raw) To: Daniel Vetter; +Cc: Maarten Lankhorst, intel-gfx On Thu, Dec 04, 2014 at 12:05:34PM +0100, Daniel Vetter wrote: > On Thu, Dec 04, 2014 at 09:13:21AM +0000, Chris Wilson wrote: > > On Wed, Dec 03, 2014 at 11:49:07AM -0800, Jesse Barnes wrote: > > > This simplifies the sync code quite a bit. I don't think we'll be able > > > to get away with using the core fence code's seqno support, since we'll > > > be moving away from simple seqno comparisions with the scheduler and > > > preemption, but the additional code is pretty minimal anyway, and lets > > > us add additional debugging as needed, so it's probably fine to keep > > > either way. > > > > > > We still need to add support for other rings here; we ought to be able > > > to do that with the timeline field of the ioctl (which will include > > > other "rings" like the display flip queue for example). > > > > I am ambivalent about this. I don't think migrating i915_request over to > > use the heavyweight fence primitives is a good idea, so see little value > > in embedding the struct fence inside i915_request vs a small bookkeeping > > struct referencing i915_request. > > Which part of struct fence is too heavyweight? And I guess we could always > add a slab or some cache if the allocation overhead is too much. I really > like this conceptually so if there's a concern with overhead I want solid > data for it. It uses locked atomic operations, which are unnecessary for the very frequent is-complete checks (due to the nice ordering constraints of x86). You have me confused. You keep applying patches without solid data to back up your claims. -Chris -- Chris Wilson, Intel Open Source Technology Centre _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] drm/i915: add fences to the request struct 2014-12-04 11:29 ` Chris Wilson @ 2014-12-04 12:58 ` Daniel Vetter 0 siblings, 0 replies; 15+ messages in thread From: Daniel Vetter @ 2014-12-04 12:58 UTC (permalink / raw) To: Chris Wilson, Daniel Vetter, Jesse Barnes, intel-gfx, Maarten Lankhorst On Thu, Dec 4, 2014 at 12:29 PM, Chris Wilson <chris@chris-wilson.co.uk> wrote: > It uses locked atomic operations, which are unnecessary for the very > frequent is-complete checks (due to the nice ordering constraints of x86). So let's look at the fastpaths: - fence already signaled: Just a test_bit which amounts to a normal read on x86. - fence not yet signaled: Just a call into the driver backend. Obviously we want to keep the lazy coherency trick for i915, but struct fence already has the concept of lazy coherency: By default they only signal "somewhen", you need to call ->enable_signalling for tight guarantees. We can just reuse the book-keeping bit struct fence already has to compute lazy_coherency. So no additional atomics compared to our current fastpath. - signalling a fence: This one indeed did grow a test_and_set_bit, so a regression in the fastpath (locks will only be taking when the fence is already in the accurate slowpath mode). But this is only once per fence so hopefully doesn't hurt. If it does we can fix it by trading speed for memory and putting the signalling bit into it's own word, resulting in a simple store. So looks all good or fixable. The only real issue is that the slowpath cost is beared by all threads if just one thread goes into a slowpath. So a notch less fair for this case. But in the normal case where everyone just checks business to manage caches this shouldn't happen. Or did I miss something? In that case we should really try to fix fences first, since they're meant to be fast in the fastpaths at least and used universally. > You have me confused. You keep applying patches without solid data to > back up your claims. I guess you mean the ggtt vma first, which imo really just reverts premature optimization. But the point of the separate patch (and me merging it quickly) is to catch regressions and issues. So if we really need this I'll happily revert the patch (so that we can add the perf data the first attempt lacked and store it for perpetuity). -Daniel -- Daniel Vetter Software Engineer, Intel Corporation +41 (0) 79 365 57 48 - http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2015-01-14 16:14 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-12-03 19:49 [RFC] Updated Android sync patches Jesse Barnes 2014-12-03 19:49 ` [PATCH 1/3] android: add sync_fence_create_dma Jesse Barnes 2015-01-14 14:09 ` Tvrtko Ursulin 2015-01-14 16:04 ` Maarten Lankhorst 2014-12-03 19:49 ` [PATCH 2/3] drm/i915: Android sync points for i915 v3 Jesse Barnes 2014-12-03 21:29 ` Maarten Lankhorst 2014-12-04 8:10 ` Chris Wilson 2014-12-04 9:03 ` Chris Wilson 2014-12-04 11:21 ` Daniel Vetter 2014-12-04 14:57 ` Jesse Barnes 2014-12-03 19:49 ` [PATCH 3/3] drm/i915: add fences to the request struct Jesse Barnes 2014-12-04 9:13 ` Chris Wilson 2014-12-04 11:05 ` Daniel Vetter 2014-12-04 11:29 ` Chris Wilson 2014-12-04 12:58 ` Daniel Vetter
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox