From mboxrd@z Thu Jan 1 00:00:00 1970 From: =?windows-1252?Q?Christian_K=F6nig?= Subject: Re: [PULL REQUEST] ttm fence conversion Date: Tue, 02 Sep 2014 10:51:27 +0200 Message-ID: <5405850F.8010901@vodafone.de> References: <540459E3.9060406@canonical.com> <54046722.1000207@vodafone.de> <540475A6.2020603@canonical.com> <54049D19.30802@vodafone.de> <5404BE44.8030407@canonical.com> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252"; Format="flowed" Content-Transfer-Encoding: quoted-printable Return-path: Received: from pegasos-out.vodafone.de (pegasos-out.vodafone.de [80.84.1.38]) by gabe.freedesktop.org (Postfix) with ESMTP id 2208289FC3 for ; Tue, 2 Sep 2014 01:52:03 -0700 (PDT) In-Reply-To: <5404BE44.8030407@canonical.com> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: Maarten Lankhorst , Dave Airlie Cc: "dri-devel@lists.freedesktop.org" List-Id: dri-devel@lists.freedesktop.org Am 01.09.2014 um 20:43 schrieb Maarten Lankhorst: > Hey, > > On 01-09-14 18:21, Christian K=F6nig wrote: >> Am 01.09.2014 um 15:33 schrieb Maarten Lankhorst: >>> Hey, >>> >>> Op 01-09-14 om 14:31 schreef Christian K=F6nig: >>>> Please wait a second with that. >>>> >>>> I didn't had a chance to test this yet and nobody has yet given it's r= b on at least the radeon changes in this branch. >>> Ok, my fault. I thought it was implicitly acked. I haven't made any fun= ctional changes to these patches, >>> just some small fixups and a fix to make it apply after the upstream re= moval of RADEON_FENCE_SIGNALED_SEQ. >> Yeah, but the resulting patch looks to complex for my taste and should b= e simplified a bit more. Here is a more detailed review: >> >>> + wait_queue_t fence_wake; >> Only a nitpick, but please fix the indention and maybe add a comment. >> >>> + struct work_struct delayed_irq_work; >> Just drop that, the new fall back work item should take care of this whe= n the unfortunate case happens that somebody tries to enable_signaling in t= he middle of a GPU reset. > I can only drop it if radeon_gpu_reset will always call radeon_irq_set af= ter downgrading to read mode, even if no work needs to be done. :-) > > Then again, should be possible. The fall back handler should take care of the rare condition that we = can't activate the IRQ because the driver is in a lockup handler. The issue is that the delayed_irq_work handler needs to take the = exclusive lock once more and so would block an innocent process for the = duration of the GPU lockup. Either reschedule as delayed work item if we can't take the lock = immediately or just live with the delay of the fall back handler. Since = IRQs usually don't work correctly immediately after an GPU reset I'm = pretty sure that the fallback handler will be needed anyway. >>> /* >>> - * Cast helper >>> - */ >>> -#define to_radeon_fence(p) ((struct radeon_fence *)(p)) >>> - >>> -/* >> Please define the new cast helper in radeon.h as well. > The ops are only defined in radeon_fence.c, and nothing outside of radeon= _fence.c should care about the internals. Then define this as a function instead, I need a checked cast from a = fence to a radeon_fence outside of the fence code as well. > >>> if (!rdev->needs_reset) { >>> - up_write(&rdev->exclusive_lock); >>> + downgrade_write(&rdev->exclusive_lock); >>> + wake_up_all(&rdev->fence_queue); >>> + up_read(&rdev->exclusive_lock); >>> return 0; >>> } >> Just drop that as well, no need to wake up anybody here. > Maybe not, but if I have to remove delayed_irq_work I do need to add a ra= deon_irq_set here. > >>> downgrade_write(&rdev->exclusive_lock); >>> + wake_up_all(&rdev->fence_queue); >> Same here, the IB test will wake up all fences for recheck anyway. > Same as previous comment. :-) > >>> + * radeon_fence_read_seq - Returns the current fence value without upd= ating >>> + * >>> + * @rdev: radeon_device pointer >>> + * @ring: ring index to return the seqno of >>> + */ >>> +static uint64_t radeon_fence_read_seq(struct radeon_device *rdev, int = ring) >>> +{ >>> + uint64_t last_seq =3D atomic64_read(&rdev->fence_drv[ring].last_se= q); >>> + uint64_t last_emitted =3D rdev->fence_drv[ring].sync_seq[ring]; >>> + uint64_t seq =3D radeon_fence_read(rdev, ring); >>> + >>> + seq =3D radeon_fence_read(rdev, ring); >>> + seq |=3D last_seq & 0xffffffff00000000LL; >>> + if (seq < last_seq) { >>> + seq &=3D 0xffffffff; >>> + seq |=3D last_emitted & 0xffffffff00000000LL; >>> + } >>> + return seq; >>> +} >> Completely drop that and just check the last_seq signaled as set by rade= on_fence_activity. > Do you mean call radeon_fence_activity in radeon_fence_signaled? Or shoul= d I just use the cached value in radeon_fence_check_signaled. Just check the cached value, it should be updated by = radeon_fence_activity immediately before calling this anyway. > > I can't call fence_activity in check_signaled, because it would cause re-= entrancy in fence_queue. > >>> + if (!ret) >>> + FENCE_TRACE(&fence->base, "signaled from irq context\n"); >>> + else >>> + FENCE_TRACE(&fence->base, "was already signaled\n"); >> Is all that text tracing necessary? Probably better define a trace point= here. > It gets optimized out normally. There's already a tracepoint called in fe= nce_signal. > = >>> + if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >=3D fen= ce->seq || >>> + !rdev->ddev->irq_enabled) >>> + return false; >> Checking irq_enabled here might not be such a good idea if the fence cod= e don't has a fall back on it's own. What exactly happens if enable_signali= ng returns false? > I thought irq_enabled couldn't happen under normal circumstances? Not 100% sure, but I think it is temporary turned off during reset. > Anyway the fence gets treated as signaled if it returns false, and fence_= signal will get called. Thought so, well that's rather bad if we failed to install the IRQ = handle that we just treat all fences as signaled isn't it? > >>> +static signed long radeon_fence_default_wait(struct fence *f, bool int= r, >>> + signed long timeout) >>> +{ >>> + struct radeon_fence *fence =3D to_radeon_fence(f); >>> + struct radeon_device *rdev =3D fence->rdev; >>> + bool signaled; >>> + >>> + fence_enable_sw_signaling(&fence->base); >>> + >>> + /* >>> + * This function has to return -EDEADLK, but cannot hold >>> + * exclusive_lock during the wait because some callers >>> + * may already hold it. This means checking needs_reset without >>> + * lock, and not fiddling with any gpu internals. >>> + * >>> + * The callback installed with fence_enable_sw_signaling will >>> + * run before our wait_event_*timeout call, so we will see >>> + * both the signaled fence and the changes to needs_reset. >>> + */ >>> + >>> + if (intr) >>> + timeout =3D wait_event_interruptible_timeout(rdev->fence_queue, >>> + ((signaled =3D (test_bit(FENCE_FLAG_SIG= NALED_BIT, &fence->base.flags))) || rdev->needs_reset), >>> + timeout); >>> + else >>> + timeout =3D wait_event_timeout(rdev->fence_queue, >>> + ((signaled =3D (test_bit(FENCE_FLAG_SIGNALED_= BIT, &fence->base.flags))) || rdev->needs_reset), >>> + timeout); >>> + >>> + if (timeout > 0 && !signaled) >>> + return -EDEADLK; >>> + return timeout; >>> +} >> This at least needs to be properly formated, but I think since we now do= n't need extra handling any more we don't need an extra wait function as we= ll. > I thought of removing the extra handling, but the -EDEADLK stuff is neede= d because a deadlock could happen in ttm_bo_lock_delayed_workqueue otherwis= e if the gpu's really hung there would never be any progress forward. Hui what? ttm_bo_lock_delayed_workqueue shouldn't call any blocking wait. Regards, Christian. > > ~Maarten