From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thierry Reding
Subject: Re: [PATCH 2/6] drm/cache: Implement drm_clflush_*() for ARM
Date: Fri, 10 Apr 2015 15:05:59 +0200
Message-ID: <20150410130557.GB32473@ulmo.nvidia.com>
References: <1428590049-20357-1-git-send-email-thierry.reding@gmail.com>
 <1428590049-20357-2-git-send-email-thierry.reding@gmail.com>
 <20150410120312.GD12732@n2100.arm.linux.org.uk>
In-Reply-To: <20150410120312.GD12732@n2100.arm.linux.org.uk>
To: Russell King - ARM Linux
Cc: Arnd Bergmann, Catalin Marinas, intel-gfx@lists.freedesktop.org,
 Will Deacon, dri-devel@lists.freedesktop.org,
 linux-arm-kernel@lists.infradead.org
List-Id: intel-gfx@lists.freedesktop.org

On Fri, Apr 10, 2015 at 01:03:13PM +0100, Russell King - ARM Linux wrote:
> On Thu, Apr 09, 2015 at 04:34:05PM +0200, Thierry Reding wrote:
> > From: Thierry Reding
> >
> > Add implementations for drm_clflush_*() on ARM by borrowing code from
> > the DMA mapping API implementation. Unfortunately ARM doesn't export an
> > API to flush caches on a page-by-page basis, so this replicates most of
> > the code.
>
> I'm _really_ not happy with this, because it's poking about in ARM
> internal implementation details of the DMA API. It's also not going
> to work too well on aliasing caches either - especially when you
> consider that the userspace mapping of a page may have no relationship
> to the address you get from kmap.
>
> For an aliasing cache, the way things work with the DMA API, we ensure
> that the kernel alias is clean whenever pages are un-kmapped, which
> means that unmapped highmem pages never have L1 cache lines associated
> with the kernel alias of them. The user aliases are handled separately
> via the normal flush_dcache_page()/flush_anon_page() calls.
>
> None of this exists here...
>
> It gets even more hairy on older ARM CPUs - but I hope no one is
> expecting to support DRM's clflush there - we should make that explicit
> though, and ensure that clflush support returns an error there.
>
> That aside, we have most of this logic already inside
> dma_cache_maint_page(), and even if it was the right thing to be doing,
> we shouldn't be duplicating this architecture-specific code inside a
> driver.
>
> So you can take that as a NAK on this.

Perhaps I should take a step back and explain what I'm trying to solve;
maybe that'll allow us to come up with a more proper solution.

Both the MSM and Tegra drivers allocate framebuffers from shmem in the
presence of an IOMMU. The problem with allocating pages from shmem is
that they end up not being flushed, resulting in visual artifacts on
the screen (horizontal black lines) when the cachelines from earlier
allocations of these pages get flushed. The drivers also use the IOMMU
API directly to manage the IOVA space.

Currently both drivers work around this by faking up an sg_table and
calling dma_map_sg(), which ends up doing the right thing. Unfortunately
this only works if nothing backs the DMA API and we end up getting the
regular arm_dma_ops. If an IOMMU registers with the ARM/DMA glue, we'll
end up getting a different set of dma_ops and things break (buffers are
mapped through the IOMMU twice, ...). Still, this has worked fine on
Tegra (and I assume the same is true for MSM) so far because we don't
register an IOMMU with the ARM/DMA glue.
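
For reference, the workaround in question amounts to something like the
sketch below: wrap the shmem pages in a scatterlist and call
dma_map_sg()/dma_unmap_sg() purely for the cache maintenance side
effect. This is a simplified illustration with made-up names, not the
actual Tegra or MSM code:

    /*
     * Illustrative sketch only: flush the kernel mapping of a set of
     * shmem pages by abusing the streaming DMA API.
     */
    static int flush_pages_via_dma_api(struct device *dev,
                                       struct page **pages,
                                       unsigned int count)
    {
            struct sg_table sgt;
            int err;

            err = sg_alloc_table_from_pages(&sgt, pages, count, 0,
                                            (unsigned long)count << PAGE_SHIFT,
                                            GFP_KERNEL);
            if (err < 0)
                    return err;

            /*
             * Only does what we want if dev ends up with the plain
             * arm_dma_ops; with an IOMMU (or SWIOTLB) behind the DMA
             * API the buffer gets mapped a second time.
             */
            dma_map_sg(dev, sgt.sgl, sgt.nents, DMA_TO_DEVICE);
            dma_unmap_sg(dev, sgt.sgl, sgt.nents, DMA_TO_DEVICE);

            sg_free_table(&sgt);
            return 0;
    }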
However, while porting this to 64-bit ARM I started seeing failures to
map buffers because SWIOTLB is enabled by default and gets in the way.

With regards to code duplication, I suspect we could call something like
arm_dma_ops.sync_sg_for_device() directly, but that raises the question
of which device to pass in. It isn't clear at allocation time which
device will use the buffer. It may in fact be used by multiple devices
at once.

Do you have an alternative suggestion on how to fix this?

Thierry