From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?windows-1252?Q?Christian_K=F6nig?= Subject: Re: [PULL REQUEST] ttm fence conversion Date: Tue, 02 Sep 2014 11:52:46 +0200 Message-ID: <5405936E.2030301@vodafone.de> References: <540459E3.9060406@canonical.com> <54046722.1000207@vodafone.de> <540475A6.2020603@canonical.com> <54049D19.30802@vodafone.de> <5404BE44.8030407@canonical.com> <5405850F.8010901@vodafone.de> <54058A07.2090101@canonical.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252"; Format="flowed" Content-Transfer-Encoding: quoted-printable Return-path: Received: from pegasos-out.vodafone.de (pegasos-out.vodafone.de [80.84.1.38]) by gabe.freedesktop.org (Postfix) with ESMTP id C224889954 for ; Tue, 2 Sep 2014 02:53:22 -0700 (PDT) In-Reply-To: <54058A07.2090101@canonical.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Maarten Lankhorst , Dave Airlie Cc: "dri-devel@lists.freedesktop.org" List-Id: dri-devel@lists.freedesktop.org Am 02.09.2014 um 11:12 schrieb Maarten Lankhorst: > Op 02-09-14 om 10:51 schreef Christian K=F6nig: >> Am 01.09.2014 um 20:43 schrieb Maarten Lankhorst: >>> Hey, >>> >>> On 01-09-14 18:21, Christian K=F6nig wrote: >>>> Am 01.09.2014 um 15:33 schrieb Maarten Lankhorst: >>>>> Hey, >>>>> >>>>> Op 01-09-14 om 14:31 schreef Christian K=F6nig: >>>>>> Please wait a second with that. >>>>>> >>>>>> I didn't had a chance to test this yet and nobody has yet given it's= rb on at least the radeon changes in this branch. >>>>> Ok, my fault. I thought it was implicitly acked. I haven't made any f= unctional changes to these patches, >>>>> just some small fixups and a fix to make it apply after the upstream = removal of RADEON_FENCE_SIGNALED_SEQ. >>>> Yeah, but the resulting patch looks to complex for my taste and should= be simplified a bit more. Here is a more detailed review: >>>> >>>>> + wait_queue_t fence_wake; >>>> Only a nitpick, but please fix the indention and maybe add a comment. >>>> >>>>> + struct work_struct delayed_irq_work; >>>> Just drop that, the new fall back work item should take care of this w= hen the unfortunate case happens that somebody tries to enable_signaling in= the middle of a GPU reset. >>> I can only drop it if radeon_gpu_reset will always call radeon_irq_set = after downgrading to read mode, even if no work needs to be done. :-) >>> >>> Then again, should be possible. >> The fall back handler should take care of the rare condition that we can= 't activate the IRQ because the driver is in a lockup handler. >> >> The issue is that the delayed_irq_work handler needs to take the exclusi= ve lock once more and so would block an innocent process for the duration o= f the GPU lockup. >> >> Either reschedule as delayed work item if we can't take the lock immedia= tely or just live with the delay of the fall back handler. Since IRQs usual= ly don't work correctly immediately after an GPU reset I'm pretty sure that= the fallback handler will be needed anyway. > Ok, rescheduling would be fine. Or could I go with the alternative, remov= e the delayed_irq_work and always set irqs after downgrading the write lock? Always setting the IRQ's after downgrading the write lock would work for = me as well. > >>>>> /* >>>>> - * Cast helper >>>>> - */ >>>>> -#define to_radeon_fence(p) ((struct radeon_fence *)(p)) >>>>> - >>>>> -/* >>>> Please define the new cast helper in radeon.h as well. >>> The ops are only defined in radeon_fence.c, and nothing outside of rade= on_fence.c should care about the internals. >> Then define this as a function instead, I need a checked cast from a fen= ce to a radeon_fence outside of the fence code as well. > Ok. > >>>>> if (!rdev->needs_reset) { >>>>> - up_write(&rdev->exclusive_lock); >>>>> + downgrade_write(&rdev->exclusive_lock); >>>>> + wake_up_all(&rdev->fence_queue); >>>>> + up_read(&rdev->exclusive_lock); >>>>> return 0; >>>>> } >>>> Just drop that as well, no need to wake up anybody here. >>> Maybe not, but if I have to remove delayed_irq_work I do need to add a = radeon_irq_set here. >>> >>>>> downgrade_write(&rdev->exclusive_lock); >>>>> + wake_up_all(&rdev->fence_queue); >>>> Same here, the IB test will wake up all fences for recheck anyway. >>> Same as previous comment. :-) >>> >>>>> + * radeon_fence_read_seq - Returns the current fence value without u= pdating >>>>> + * >>>>> + * @rdev: radeon_device pointer >>>>> + * @ring: ring index to return the seqno of >>>>> + */ >>>>> +static uint64_t radeon_fence_read_seq(struct radeon_device *rdev, in= t ring) >>>>> +{ >>>>> + uint64_t last_seq =3D atomic64_read(&rdev->fence_drv[ring].last_= seq); >>>>> + uint64_t last_emitted =3D rdev->fence_drv[ring].sync_seq[ring]; >>>>> + uint64_t seq =3D radeon_fence_read(rdev, ring); >>>>> + >>>>> + seq =3D radeon_fence_read(rdev, ring); >>>>> + seq |=3D last_seq & 0xffffffff00000000LL; >>>>> + if (seq < last_seq) { >>>>> + seq &=3D 0xffffffff; >>>>> + seq |=3D last_emitted & 0xffffffff00000000LL; >>>>> + } >>>>> + return seq; >>>>> +} >>>> Completely drop that and just check the last_seq signaled as set by ra= deon_fence_activity. >>> Do you mean call radeon_fence_activity in radeon_fence_signaled? Or sho= uld I just use the cached value in radeon_fence_check_signaled. >> Just check the cached value, it should be updated by radeon_fence_activi= ty immediately before calling this anyway. > Ok. I think I wrote this as a workaround for unreliable interrupts. :-) > >>> I can't call fence_activity in check_signaled, because it would cause r= e-entrancy in fence_queue. >>> >>>>> + if (!ret) >>>>> + FENCE_TRACE(&fence->base, "signaled from irq context\n"); >>>>> + else >>>>> + FENCE_TRACE(&fence->base, "was already signaled\n"); >>>> Is all that text tracing necessary? Probably better define a trace poi= nt here. >>> It gets optimized out normally. There's already a tracepoint called in = fence_signal. >>> = >>>>> + if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >=3D f= ence->seq || >>>>> + !rdev->ddev->irq_enabled) >>>>> + return false; >>>> Checking irq_enabled here might not be such a good idea if the fence c= ode don't has a fall back on it's own. What exactly happens if enable_signa= ling returns false? >>> I thought irq_enabled couldn't happen under normal circumstances? >> Not 100% sure, but I think it is temporary turned off during reset. >> >>> Anyway the fence gets treated as signaled if it returns false, and fenc= e_signal will get called. >> Thought so, well that's rather bad if we failed to install the IRQ handl= e that we just treat all fences as signaled isn't it? > I wrote this code before the delayed work was added, I guess the check fo= r !irq_enabled can be removed now. :-) > >>>>> +static signed long radeon_fence_default_wait(struct fence *f, bool i= ntr, >>>>> + signed long timeout) >>>>> +{ >>>>> + struct radeon_fence *fence =3D to_radeon_fence(f); >>>>> + struct radeon_device *rdev =3D fence->rdev; >>>>> + bool signaled; >>>>> + >>>>> + fence_enable_sw_signaling(&fence->base); >>>>> + >>>>> + /* >>>>> + * This function has to return -EDEADLK, but cannot hold >>>>> + * exclusive_lock during the wait because some callers >>>>> + * may already hold it. This means checking needs_reset without >>>>> + * lock, and not fiddling with any gpu internals. >>>>> + * >>>>> + * The callback installed with fence_enable_sw_signaling will >>>>> + * run before our wait_event_*timeout call, so we will see >>>>> + * both the signaled fence and the changes to needs_reset. >>>>> + */ >>>>> + >>>>> + if (intr) >>>>> + timeout =3D wait_event_interruptible_timeout(rdev->fence_que= ue, >>>>> + ((signaled =3D (test_bit(FENCE_FLAG_S= IGNALED_BIT, &fence->base.flags))) || rdev->needs_reset), >>>>> + timeout); >>>>> + else >>>>> + timeout =3D wait_event_timeout(rdev->fence_queue, >>>>> + ((signaled =3D (test_bit(FENCE_FLAG_SIGNALE= D_BIT, &fence->base.flags))) || rdev->needs_reset), >>>>> + timeout); >>>>> + >>>>> + if (timeout > 0 && !signaled) >>>>> + return -EDEADLK; >>>>> + return timeout; >>>>> +} >>>> This at least needs to be properly formated, but I think since we now = don't need extra handling any more we don't need an extra wait function as = well. >>> I thought of removing the extra handling, but the -EDEADLK stuff is nee= ded because a deadlock could happen in ttm_bo_lock_delayed_workqueue otherw= ise if the gpu's really hung there would never be any progress forward. >> Hui what? ttm_bo_lock_delayed_workqueue shouldn't call any blocking wait. > Oops, you're right. ttm_bo_delayed_delete is called with remove_all false= , not true. > > Unfortunately ttm_bo_vm_fault does hold the exclusive_lock in read mode, = and other places that use eviction will use it too. > Without returning -EDEADLK this could mean that ttm_bo_move_accel_cleanup= would block forever, > so this function has to stay. Ok, fine with me. I'm not deep enough into the TTM code to really judge = this, but my understanding was that TTM still calls it's own wait callback. Regards, Christian. > > ~Maarten >