From: Tomi Valkeinen
Subject: Re: [PATCH 03/33] HACK: drm/omap: fix memory barrier bug in DMM driver
Date: Wed, 24 Feb 2016 12:34:35 +0200
Message-ID: <56CD873B.4070601@ti.com>
References: <1455875288-4370-1-git-send-email-tomi.valkeinen@ti.com> <1455875288-4370-4-git-send-email-tomi.valkeinen@ti.com> <21417920.F8D2RjGPET@avalon>
In-Reply-To: <21417920.F8D2RjGPET@avalon>
To: Laurent Pinchart
Cc: dri-devel@lists.freedesktop.org

On 23/02/16 23:13, Laurent Pinchart wrote:
> Hi Tomi,
>
> Thank you for the patch.
>
> On Friday 19 February 2016 11:47:38 Tomi Valkeinen wrote:
>> A DMM timeout "timed out waiting for done" has been observed on DRA7
>> devices. The timeout happens rarely, and only when the system is under
>> heavy load.
>>
>> Debugging showed that the timeout can be made to happen much more
>> frequently by optimizing the DMM driver, so that there's almost no code
>> between writing the last DMM descriptors to RAM, and writing to DMM
>> register which starts the DMM transaction.
>>
>> The current theory is that a wmb() does not properly ensure that the
>> data written to RAM is observable by all the components in the system.
>>
>> This DMM timeout has caused interesting (and rare) bugs as the error
>> handling was not functioning properly (the error handling has been fixed
>> in previous commits):
>>
>> * If a DMM timeout happened when a GEM buffer was being pinned for
>>   display on the screen, a timeout error would be shown, but the driver
>>   would continue programming DSS HW with broken buffer, leading to
>>   SYNCLOST floods and possible crashes.
>>
>> * If a DMM timeout happened when other user (say, video decoder) was
>>   pinning a GEM buffer, a timeout would be shown but if the user
>>   handled the error properly, no other issues followed.
>>
>> * If a DMM timeout happened when a GEM buffer was being released, the
>>   driver does not even notice the error, leading to crashes or hang
>>   later.
>>
>> This patch adds wmb() and readl() calls after the last bit is written to
>> RAM, which should ensure that the execution proceeds only after the data
>> is actually in RAM, and thus observable by DMM.
>>
>> This patch is a HACK, as a read-back should not be needed. Further study
>> is required to understand if DMM is somehow special case and read-back
>> is ok, or if DRA7's memory barriers do not work correctly.
>
> CONFIG_SOC_DRA7XX selects OMAP_INTERCONNECT and OMAP_INTERCONNECT_BARRIER, but
> dra7xx_map_io() doesn't call omap_barriers_init(). Could that be the root
> cause of the issue ? I don't have access to a DRA7xx system, would you be able
> to test that ?

No idea, but I did dig up discussions about this in my mailbox, and it
seems there's been some work done after I wrote this patch, in "Fix
OMAP4 barrier support" series last summer. I'm not sure if that's only
for OMAP4, though.

I'll drop this patch too from the series, and spend a bit more time on it.
This is again something that's a bit tricky to reproduce and test.

 Tomi
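For illustration only, the ordering described in the quoted commit message (write the descriptors, barrier, read back, then write the register that starts the transaction) might be sketched in plain userspace C as below. This is not the patch itself. `struct fake_dmm`, `kick_reg`, `submit_descriptors()` and `desc_addr` are hypothetical names; `atomic_thread_fence()` and a volatile load are only stand-ins for the kernel's `wmb()` and `readl()`; and reading back the last descriptor word is an assumption, since the commit message does not say which location is read. The sketch shows the sequence of operations and cannot reproduce the DRA7 hardware behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DMM register that starts a transaction. */
struct fake_dmm {
	volatile uint32_t kick_reg;
};

/*
 * Copy n descriptor words into the buffer the engine would read, then
 * kick the engine only after a barrier and a read-back of the last word.
 * Returns the value read back from the last descriptor word.
 */
static uint32_t submit_descriptors(struct fake_dmm *dmm,
				   volatile uint32_t *desc_ram,
				   const uint32_t *src, size_t n,
				   uint32_t desc_addr)
{
	uint32_t readback;
	size_t i;

	assert(n > 0);

	/* Write the descriptors to (simulated) RAM. */
	for (i = 0; i < n; i++)
		desc_ram[i] = src[i];

	/* Stand-in for wmb(): order the stores above before what follows. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Stand-in for readl(): read back the last word that was written. */
	readback = desc_ram[n - 1];

	/* Only now write the register that starts the transaction. */
	dmm->kick_reg = desc_addr;

	return readback;
}
```

A caller would pass the descriptor buffer and its address, and could compare the returned value against the last source word as a sanity check before waiting for completion.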