From mboxrd@z Thu Jan 1 00:00:00 1970 From: Maarten Lankhorst Subject: Re: [PULL REQUEST] ttm fence conversion Date: Mon, 01 Sep 2014 20:43:16 +0200 Message-ID: <5404BE44.8030407@canonical.com> References: <540459E3.9060406@canonical.com> <54046722.1000207@vodafone.de> <540475A6.2020603@canonical.com> <54049D19.30802@vodafone.de> Mime-Version: 1.0 Content-Type: text/plain; charset="windows-1252" Content-Transfer-Encoding: quoted-printable Return-path: Received: from mblankhorst.nl (mblankhorst.nl [141.105.120.124]) by gabe.freedesktop.org (Postfix) with ESMTP id 1CFEA6E091 for ; Mon, 1 Sep 2014 11:43:19 -0700 (PDT) In-Reply-To: <54049D19.30802@vodafone.de> List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dri-devel-bounces@lists.freedesktop.org Sender: "dri-devel" To: =?windows-1252?Q?Christian_K=F6nig?= , Dave Airlie Cc: "dri-devel@lists.freedesktop.org" List-Id: dri-devel@lists.freedesktop.org Hey, On 01-09-14 18:21, Christian K=F6nig wrote: > Am 01.09.2014 um 15:33 schrieb Maarten Lankhorst: >> Hey, >> >> Op 01-09-14 om 14:31 schreef Christian K=F6nig: >>> Please wait a second with that. >>> >>> I didn't had a chance to test this yet and nobody has yet given it's rb= on at least the radeon changes in this branch. >> Ok, my fault. I thought it was implicitly acked. I haven't made any func= tional changes to these patches, >> just some small fixups and a fix to make it apply after the upstream rem= oval of RADEON_FENCE_SIGNALED_SEQ. > = > Yeah, but the resulting patch looks to complex for my taste and should be= simplified a bit more. Here is a more detailed review: > = >> + wait_queue_t fence_wake; > Only a nitpick, but please fix the indention and maybe add a comment. > = >> + struct work_struct delayed_irq_work; > Just drop that, the new fall back work item should take care of this when= the unfortunate case happens that somebody tries to enable_signaling in th= e middle of a GPU reset. I can only drop it if radeon_gpu_reset will always call radeon_irq_set afte= r downgrading to read mode, even if no work needs to be done. :-) Then again, should be possible. >> /* >> - * Cast helper >> - */ >> -#define to_radeon_fence(p) ((struct radeon_fence *)(p)) >> - >> -/* > Please define the new cast helper in radeon.h as well. The ops are only defined in radeon_fence.c, and nothing outside of radeon_f= ence.c should care about the internals. >> if (!rdev->needs_reset) { >> - up_write(&rdev->exclusive_lock); >> + downgrade_write(&rdev->exclusive_lock); >> + wake_up_all(&rdev->fence_queue); >> + up_read(&rdev->exclusive_lock); >> return 0; >> } > Just drop that as well, no need to wake up anybody here. Maybe not, but if I have to remove delayed_irq_work I do need to add a rade= on_irq_set here. >> downgrade_write(&rdev->exclusive_lock); >> + wake_up_all(&rdev->fence_queue); > Same here, the IB test will wake up all fences for recheck anyway. Same as previous comment. :-) >> + * radeon_fence_read_seq - Returns the current fence value without upda= ting >> + * >> + * @rdev: radeon_device pointer >> + * @ring: ring index to return the seqno of >> + */ >> +static uint64_t radeon_fence_read_seq(struct radeon_device *rdev, int r= ing) >> +{ >> + uint64_t last_seq =3D atomic64_read(&rdev->fence_drv[ring].last_seq= ); >> + uint64_t last_emitted =3D rdev->fence_drv[ring].sync_seq[ring]; >> + uint64_t seq =3D radeon_fence_read(rdev, ring); >> + >> + seq =3D radeon_fence_read(rdev, ring); >> + seq |=3D last_seq & 0xffffffff00000000LL; >> + if (seq < last_seq) { >> + seq &=3D 0xffffffff; >> + seq |=3D last_emitted & 0xffffffff00000000LL; >> + } >> + return seq; >> +} > Completely drop that and just check the last_seq signaled as set by radeo= n_fence_activity. Do you mean call radeon_fence_activity in radeon_fence_signaled? Or should = I just use the cached value in radeon_fence_check_signaled. I can't call fence_activity in check_signaled, because it would cause re-en= trancy in fence_queue. >> + if (!ret) >> + FENCE_TRACE(&fence->base, "signaled from irq context\n"); >> + else >> + FENCE_TRACE(&fence->base, "was already signaled\n"); > Is all that text tracing necessary? Probably better define a trace point = here. It gets optimized out normally. There's already a tracepoint called in fenc= e_signal. = >> + if (atomic64_read(&rdev->fence_drv[fence->ring].last_seq) >=3D fenc= e->seq || >> + !rdev->ddev->irq_enabled) >> + return false; > Checking irq_enabled here might not be such a good idea if the fence code= don't has a fall back on it's own. What exactly happens if enable_signalin= g returns false? I thought irq_enabled couldn't happen under normal circumstances? Anyway the fence gets treated as signaled if it returns false, and fence_si= gnal will get called. >> +static signed long radeon_fence_default_wait(struct fence *f, bool intr, >> + signed long timeout) >> +{ >> + struct radeon_fence *fence =3D to_radeon_fence(f); >> + struct radeon_device *rdev =3D fence->rdev; >> + bool signaled; >> + >> + fence_enable_sw_signaling(&fence->base); >> + >> + /* >> + * This function has to return -EDEADLK, but cannot hold >> + * exclusive_lock during the wait because some callers >> + * may already hold it. This means checking needs_reset without >> + * lock, and not fiddling with any gpu internals. >> + * >> + * The callback installed with fence_enable_sw_signaling will >> + * run before our wait_event_*timeout call, so we will see >> + * both the signaled fence and the changes to needs_reset. >> + */ >> + >> + if (intr) >> + timeout =3D wait_event_interruptible_timeout(rdev->fence_queue, >> + ((signaled =3D (test_bit(FENCE_FLAG_SIGN= ALED_BIT, &fence->base.flags))) || rdev->needs_reset), >> + timeout); >> + else >> + timeout =3D wait_event_timeout(rdev->fence_queue, >> + ((signaled =3D (test_bit(FENCE_FLAG_SIGNALED_B= IT, &fence->base.flags))) || rdev->needs_reset), >> + timeout); >> + >> + if (timeout > 0 && !signaled) >> + return -EDEADLK; >> + return timeout; >> +} > This at least needs to be properly formated, but I think since we now don= 't need extra handling any more we don't need an extra wait function as wel= l. I thought of removing the extra handling, but the -EDEADLK stuff is needed = because a deadlock could happen in ttm_bo_lock_delayed_workqueue otherwise = if the gpu's really hung there would never be any progress forward. ~Maarten