From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.5 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,HK_RANDOM_FROM,INCLUDES_PATCH, MAILING_LIST_MULTI,NICE_REPLY_A,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A256C433E0 for ; Wed, 5 Aug 2020 15:04:28 +0000 (UTC) Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id E5FD023332 for ; Wed, 5 Aug 2020 15:04:27 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org E5FD023332 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.intel.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=intel-gfx-bounces@lists.freedesktop.org Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 998486E804; Wed, 5 Aug 2020 15:04:27 +0000 (UTC) Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6D13F6E7EC for ; Wed, 5 Aug 2020 15:04:25 +0000 (UTC) IronPort-SDR: bnVA2NR7mmkzuj0CONz06W68zhc9F4b+rstWcyFp4CRrGAcEMWh0Dkdith4zdVPhcdFmtLa+QO 5VjpG1FSldwg== X-IronPort-AV: E=McAfee;i="6000,8403,9703"; a="170630562" X-IronPort-AV: E=Sophos;i="5.75,438,1589266800"; d="scan'208";a="170630562" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2020 08:04:03 -0700 IronPort-SDR: T/vvr5U1/HIkRhO183QCIq0LJvDpaIC/MFKKGv3ImxVx/etsRSDzA5viae67SPEOf4O1AgI+kL R8NHNFpRq63A== X-IronPort-AV: E=Sophos;i="5.75,438,1589266800"; d="scan'208";a="467494745" Received: from tlevis-mobl1.ger.corp.intel.com (HELO [10.251.167.151]) ([10.251.167.151]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Aug 2020 08:04:00 -0700 To: Chris Wilson , intel-gfx@lists.freedesktop.org References: <20200805122231.23313-1-chris@chris-wilson.co.uk> <20200805122231.23313-3-chris@chris-wilson.co.uk> From: Tvrtko Ursulin Organization: Intel Corporation UK Plc Message-ID: <77d41536-bcfc-dd05-b57e-d254f603a434@linux.intel.com> Date: Wed, 5 Aug 2020 16:03:58 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0 MIME-Version: 1.0 In-Reply-To: <20200805122231.23313-3-chris@chris-wilson.co.uk> Content-Language: en-US Subject: Re: [Intel-gfx] [PATCH 02/37] drm/i915/gt: Protect context lifetime with RCU X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" On 05/08/2020 13:21, Chris Wilson wrote: > Allow a brief period for continued access to a dead intel_context by > deferring the release of the struct until after an RCU grace period. > As we are using a dedicated slab cache for the contexts, we can defer > the release of the slab pages via RCU, with the caveat that individual > structs may be reused from the freelist within an RCU grace period. To > handle that, we have to avoid clearing members of the zombie struct. Is this related to debugfs race, optimising the driver latencies or both? Need to hack up mutex_reinit bothers me, on top of general desire to avoid even more rcu complexity. Regards, Tvrtko > Signed-off-by: Chris Wilson > --- > drivers/gpu/drm/i915/gt/intel_context.c | 330 +++++++++++++----------- > drivers/gpu/drm/i915/i915_active.c | 10 + > drivers/gpu/drm/i915/i915_active.h | 2 + > drivers/gpu/drm/i915/i915_utils.h | 7 + > 4 files changed, 202 insertions(+), 147 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c > index 52db2bde44a3..4e7924640ffa 100644 > --- a/drivers/gpu/drm/i915/gt/intel_context.c > +++ b/drivers/gpu/drm/i915/gt/intel_context.c > @@ -22,7 +22,7 @@ static struct i915_global_context { > > static struct intel_context *intel_context_alloc(void) > { > - return kmem_cache_zalloc(global.slab_ce, GFP_KERNEL); > + return kmem_cache_alloc(global.slab_ce, GFP_KERNEL); > } > > void intel_context_free(struct intel_context *ce) > @@ -30,6 +30,177 @@ void intel_context_free(struct intel_context *ce) > kmem_cache_free(global.slab_ce, ce); > } > > +static int __context_pin_state(struct i915_vma *vma) > +{ > + unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; > + int err; > + > + err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH); > + if (err) > + return err; > + > + err = i915_active_acquire(&vma->active); > + if (err) > + goto err_unpin; > + > + /* > + * And mark it as a globally pinned object to let the shrinker know > + * it cannot reclaim the object until we release it. > + */ > + i915_vma_make_unshrinkable(vma); > + vma->obj->mm.dirty = true; > + > + return 0; > + > +err_unpin: > + i915_vma_unpin(vma); > + return err; > +} > + > +static void __context_unpin_state(struct i915_vma *vma) > +{ > + i915_vma_make_shrinkable(vma); > + i915_active_release(&vma->active); > + __i915_vma_unpin(vma); > +} > + > +static int __ring_active(struct intel_ring *ring) > +{ > + int err; > + > + err = intel_ring_pin(ring); > + if (err) > + return err; > + > + err = i915_active_acquire(&ring->vma->active); > + if (err) > + goto err_pin; > + > + return 0; > + > +err_pin: > + intel_ring_unpin(ring); > + return err; > +} > + > +static void __ring_retire(struct intel_ring *ring) > +{ > + i915_active_release(&ring->vma->active); > + intel_ring_unpin(ring); > +} > + > +__i915_active_call > +static void __intel_context_retire(struct i915_active *active) > +{ > + struct intel_context *ce = container_of(active, typeof(*ce), active); > + > + CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n", > + intel_context_get_total_runtime_ns(ce), > + intel_context_get_avg_runtime_ns(ce)); > + > + set_bit(CONTEXT_VALID_BIT, &ce->flags); > + if (ce->state) > + __context_unpin_state(ce->state); > + > + intel_timeline_unpin(ce->timeline); > + __ring_retire(ce->ring); > + > + intel_context_put(ce); > +} > + > +static int __intel_context_active(struct i915_active *active) > +{ > + struct intel_context *ce = container_of(active, typeof(*ce), active); > + int err; > + > + CE_TRACE(ce, "active\n"); > + > + intel_context_get(ce); > + > + err = __ring_active(ce->ring); > + if (err) > + goto err_put; > + > + err = intel_timeline_pin(ce->timeline); > + if (err) > + goto err_ring; > + > + if (!ce->state) > + return 0; > + > + err = __context_pin_state(ce->state); > + if (err) > + goto err_timeline; > + > + return 0; > + > +err_timeline: > + intel_timeline_unpin(ce->timeline); > +err_ring: > + __ring_retire(ce->ring); > +err_put: > + intel_context_put(ce); > + return err; > +} > + > +static void __intel_context_ctor(void *arg) > +{ > + struct intel_context *ce = arg; > + > + INIT_LIST_HEAD(&ce->signal_link); > + INIT_LIST_HEAD(&ce->signals); > + > + atomic_set(&ce->pin_count, 0); > + mutex_init(&ce->pin_mutex); > + > + ce->active_count = 0; > + i915_active_init(&ce->active, > + __intel_context_active, __intel_context_retire); > + > + ce->inflight = NULL; > + ce->lrc_reg_state = NULL; > + ce->lrc.desc = 0; > +} > + > +static void > +__intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) > +{ > + GEM_BUG_ON(!engine->cops); > + GEM_BUG_ON(!engine->gt->vm); > + > + kref_init(&ce->ref); > + i915_active_reinit(&ce->active); > + mutex_reinit(&ce->pin_mutex); > + > + ce->engine = engine; > + ce->ops = engine->cops; > + ce->sseu = engine->sseu; > + > + ce->wa_bb_page = 0; > + ce->flags = 0; > + ce->tag = 0; > + > + memset(&ce->runtime, 0, sizeof(ce->runtime)); > + > + ce->vm = i915_vm_get(engine->gt->vm); > + ce->gem_context = NULL; > + > + ce->ring = __intel_context_ring_size(SZ_4K); > + ce->timeline = NULL; > + ce->state = NULL; > + > + GEM_BUG_ON(atomic_read(&ce->pin_count)); > + GEM_BUG_ON(ce->active_count); > + GEM_BUG_ON(ce->inflight); > +} > + > +void > +intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine) > +{ > + __intel_context_ctor(ce); > + __intel_context_init(ce, engine); > +} > + > struct intel_context * > intel_context_create(struct intel_engine_cs *engine) > { > @@ -39,7 +210,7 @@ intel_context_create(struct intel_engine_cs *engine) > if (!ce) > return ERR_PTR(-ENOMEM); > > - intel_context_init(ce, engine); > + __intel_context_init(ce, engine); > return ce; > } > > @@ -158,154 +329,13 @@ void intel_context_unpin(struct intel_context *ce) > /* > * Once released, we may asynchronously drop the active reference. > * As that may be the only reference keeping the context alive, > - * take an extra now so that it is not freed before we finish > + * hold onto RCU so that it is not freed before we finish > * dereferencing it. > */ > - intel_context_get(ce); > + rcu_read_lock(); > intel_context_active_release(ce); > - intel_context_put(ce); > -} > - > -static int __context_pin_state(struct i915_vma *vma) > -{ > - unsigned int bias = i915_ggtt_pin_bias(vma) | PIN_OFFSET_BIAS; > - int err; > - > - err = i915_ggtt_pin(vma, 0, bias | PIN_HIGH); > - if (err) > - return err; > - > - err = i915_active_acquire(&vma->active); > - if (err) > - goto err_unpin; > - > - /* > - * And mark it as a globally pinned object to let the shrinker know > - * it cannot reclaim the object until we release it. > - */ > - i915_vma_make_unshrinkable(vma); > - vma->obj->mm.dirty = true; > - > - return 0; > - > -err_unpin: > - i915_vma_unpin(vma); > - return err; > -} > - > -static void __context_unpin_state(struct i915_vma *vma) > -{ > - i915_vma_make_shrinkable(vma); > - i915_active_release(&vma->active); > - __i915_vma_unpin(vma); > -} > - > -static int __ring_active(struct intel_ring *ring) > -{ > - int err; > - > - err = intel_ring_pin(ring); > - if (err) > - return err; > - > - err = i915_active_acquire(&ring->vma->active); > - if (err) > - goto err_pin; > - > - return 0; > - > -err_pin: > - intel_ring_unpin(ring); > - return err; > -} > - > -static void __ring_retire(struct intel_ring *ring) > -{ > - i915_active_release(&ring->vma->active); > - intel_ring_unpin(ring); > + rcu_read_unlock(); > } > - > -__i915_active_call > -static void __intel_context_retire(struct i915_active *active) > -{ > - struct intel_context *ce = container_of(active, typeof(*ce), active); > - > - CE_TRACE(ce, "retire runtime: { total:%lluns, avg:%lluns }\n", > - intel_context_get_total_runtime_ns(ce), > - intel_context_get_avg_runtime_ns(ce)); > - > - set_bit(CONTEXT_VALID_BIT, &ce->flags); > - if (ce->state) > - __context_unpin_state(ce->state); > - > - intel_timeline_unpin(ce->timeline); > - __ring_retire(ce->ring); > - > - intel_context_put(ce); > -} > - > -static int __intel_context_active(struct i915_active *active) > -{ > - struct intel_context *ce = container_of(active, typeof(*ce), active); > - int err; > - > - CE_TRACE(ce, "active\n"); > - > - intel_context_get(ce); > - > - err = __ring_active(ce->ring); > - if (err) > - goto err_put; > - > - err = intel_timeline_pin(ce->timeline); > - if (err) > - goto err_ring; > - > - if (!ce->state) > - return 0; > - > - err = __context_pin_state(ce->state); > - if (err) > - goto err_timeline; > - > - return 0; > - > -err_timeline: > - intel_timeline_unpin(ce->timeline); > -err_ring: > - __ring_retire(ce->ring); > -err_put: > - intel_context_put(ce); > - return err; > -} > - > -void > -intel_context_init(struct intel_context *ce, > - struct intel_engine_cs *engine) > -{ > - GEM_BUG_ON(!engine->cops); > - GEM_BUG_ON(!engine->gt->vm); > - > - kref_init(&ce->ref); > - > - ce->engine = engine; > - ce->ops = engine->cops; > - ce->sseu = engine->sseu; > - ce->ring = __intel_context_ring_size(SZ_4K); > - > - ewma_runtime_init(&ce->runtime.avg); > - > - ce->vm = i915_vm_get(engine->gt->vm); > - > - INIT_LIST_HEAD(&ce->signal_link); > - INIT_LIST_HEAD(&ce->signals); > - > - mutex_init(&ce->pin_mutex); > - > - i915_active_init(&ce->active, > - __intel_context_active, __intel_context_retire); > -} > - > void intel_context_fini(struct intel_context *ce) > { > if (ce->timeline) > @@ -333,7 +363,13 @@ static struct i915_global_context global = { { > > int __init i915_global_context_init(void) > { > - global.slab_ce = KMEM_CACHE(intel_context, SLAB_HWCACHE_ALIGN); > + global.slab_ce = > + kmem_cache_create("intel_context", > + sizeof(struct intel_context), > + __alignof__(struct intel_context), > + SLAB_HWCACHE_ALIGN | > + SLAB_TYPESAFE_BY_RCU, > + __intel_context_ctor); > if (!global.slab_ce) > return -ENOMEM; > > diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c > index b0a6522be3d1..6a3bb10e3bb2 100644 > --- a/drivers/gpu/drm/i915/i915_active.c > +++ b/drivers/gpu/drm/i915/i915_active.c > @@ -795,6 +795,16 @@ void i915_active_fini(struct i915_active *ref) > kmem_cache_free(global.slab_cache, ref->cache); > } > > +void i915_active_reinit(struct i915_active *ref) > +{ > + GEM_BUG_ON(!i915_active_is_idle(ref)); > + debug_active_init(ref); > + mutex_reinit(&ref->mutex); > + > + ref->cache = NULL; > + ref->tree = RB_ROOT; > +} > + > static inline bool is_idle_barrier(struct active_node *node, u64 idx) > { > return node->timeline == idx && !i915_active_fence_isset(&node->base); > diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h > index fb165d3f01cf..6df7e721616d 100644 > --- a/drivers/gpu/drm/i915/i915_active.h > +++ b/drivers/gpu/drm/i915/i915_active.h > @@ -219,6 +219,8 @@ i915_active_is_idle(const struct i915_active *ref) > > void i915_active_fini(struct i915_active *ref); > > +void i915_active_reinit(struct i915_active *ref); > + > int i915_active_acquire_preallocate_barrier(struct i915_active *ref, > struct intel_engine_cs *engine); > void i915_active_acquire_barrier(struct i915_active *ref); > diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h > index 54773371e6bd..ef8db3aa75c7 100644 > --- a/drivers/gpu/drm/i915/i915_utils.h > +++ b/drivers/gpu/drm/i915/i915_utils.h > @@ -443,6 +443,13 @@ static inline bool timer_expired(const struct timer_list *t) > return READ_ONCE(t->expires) && !timer_pending(t); > } > > +static inline void mutex_reinit(struct mutex *lock) > +{ > +#if IS_ENABLED(CONFIG_DEBUG_MUTEXES) > + lock->magic = lock; > +#endif > +} > + > /* > * This is a lookalike for IS_ENABLED() that takes a kconfig value, > * e.g. CONFIG_DRM_I915_SPIN_REQUEST, and evaluates whether it is non-zero > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx