Date: Tue, 6 Aug 2019 17:48:16 +0100
From: Russell King - ARM Linux admin
To: Will Deacon
Subject: Re: [PATCH] dma-mapping: fix page attributes for dma_mmap_*
Message-ID: <20190806164816.GE1330@shell.armlinux.org.uk>
References: <20190801142118.21225-1-hch@lst.de> <20190801142118.21225-2-hch@lst.de> <20190801162305.3m32chycsdjmdejk@willie-the-truck>
 <20190801163457.GB26588@lst.de> <20190801164411.kmsl4japtfkgvzxe@willie-the-truck> <20190802081441.GA9725@lst.de> <20190802103803.3qrbhqwxlasojsco@willie-the-truck> <20190803064812.GA29746@lst.de> <20190806160854.htk67msiyadlrl4m@willie-the-truck> <20190806164503.GD1330@shell.armlinux.org.uk>
In-Reply-To: <20190806164503.GD1330@shell.armlinux.org.uk>
List-Id: Linux on PowerPC Developers Mail List
Cc: Shawn Anastasio, linux-kernel@vger.kernel.org, iommu@lists.linux-foundation.org, Catalin Marinas, linuxppc-dev@lists.ozlabs.org, Christoph Hellwig, linux-arm-kernel@lists.infradead.org, Robin Murphy

On Tue, Aug 06, 2019 at 05:45:03PM +0100, Russell King - ARM Linux admin wrote:
> On Tue, Aug 06, 2019 at 05:08:54PM +0100, Will Deacon wrote:
> > On Sat, Aug 03, 2019 at 08:48:12AM +0200, Christoph Hellwig wrote:
> > > On Fri, Aug 02, 2019 at 11:38:03AM +0100, Will Deacon wrote:
> > > >
> > > > So this boils down to a terminology mismatch. The Arm architecture doesn't have
> > > > anything called "write combine", so in Linux we instead provide what the Arm
> > > > architecture calls "Normal non-cacheable" memory for pgprot_writecombine().
> > > > Amongst other things, this memory type permits speculation, unaligned accesses
> > > > and merging of writes. I found something in the architecture spec about
> > > > non-cacheable memory, but it's written in Armglish[1].
> > > >
> > > > pgprot_noncached(), on the other hand, provides what the architecture calls
> > > > Strongly Ordered or Device-nGnRnE memory. This is intended for mapping MMIO
> > > > (i.e. PCI config space) and therefore forbids speculation, preserves access
> > > > size, requires strict alignment and also forces write responses to come from
> > > > the endpoint.
> > > >
> > > > I think the naming mismatch is historical, but on arm64 we wanted to use the
> > > > same names as arm32 so that any drivers using these things directly would get
> > > > the same behaviour.
> > > >
> > > That all makes sense, but it totally needs a comment. I'll try to draft
> > > one based on this. I've also looked at the arm32 code a bit more, and
> > > it seems arm always (?) supported the Normal non-cacheable attribute, but
> > > Linux only optionally uses it for arm v6+ because of fears of drivers
> > > missing barriers.
> > >
> > I think it was also to do with aliasing, but I don't recall all of the
> > details.
> >
> ARMv6+ is where the architecture significantly changed to introduce
> the idea of [Normal, Device, Strongly Ordered] where Normal has the
> cache attributes.
>
> Before that, we had just "uncached/unbuffered, uncached/buffered,
> cached/unbuffered, cached/buffered" modes.
>
> The write buffer (enabled by buffered modes) has no architected
> guarantees about how long writes will sit in it, and there is only
> the "drain write buffer" instruction to push writes out.
>
> Up to and including ARMv5, we took the easy approach of just using
> the "uncached/unbuffered" mode since that is (a) the safest, and (b)
> avoids write buffers that alias when there are multiple different
> mappings.
>
> We could have used a different approach, making all IO writes contain
> a "drain write buffer" instruction, and map DMA memory as "buffered",
> but as there were no Linux barriers defined to order memory accesses
> to DMA memory (so, for example, ring buffers can be updated in the
> correct order) back in those days, using the uncached/unbuffered mode
> was the sanest and most reliable solution.
>
> >
> > > The other really weird thing is that in arm32
> > > pgprot_dmacoherent includes the L_PTE_XN bit, which from my understanding
> > > is the no-execute bit, but pgprot_writecombine does not. This does not
> > > seem very intentional. So minus that the whole DMA_ATTR_WRITE_COMBINE
> > > seems to be about flagging old arm specific drivers as having the proper
> > > barriers in places and otherwise is a no-op.
> > >
> > I think it only matters for Armv7 CPUs, but yes, we should probably be
> > setting L_PTE_XN for both of these memory types.
> >
> Conventionally, pgprot_writecombine() has only been used to change
> the memory type and not the permissions. Since writecombine memory
> is still capable of being executed, I don't see any reason to set XN
> for it.
>
> If the user wishes to mmap() using PROT_READ|PROT_EXEC, then is there
> really a reason for writecombine to set XN overriding the user?
>
> That said, pgprot_writecombine() is mostly used for framebuffers, which
> arguably shouldn't be executable anyway - but who'd want to mmap() the
> framebuffer with PROT_EXEC?
>
> >
> > > Here is my tentative plan:
> > >
> > >  - respin this patch with a small fix to handle the
> > >    DMA_ATTR_NON_CONSISTENT (as in ignore it unless actually supported),
> > >    but keep the name as-is to avoid churn. This should allow 5.3
> > >    inclusion and backports
> > >  - remove DMA_ATTR_WRITE_COMBINE support from mips, probably also 5.3
> > >    material.
> > >  - move all architectures but arm over to just define
> > >    pgprot_dmacoherent, including a comment with the above explanation
> > >    for arm64.
> > >
> > That would be great, thanks.
> >
> > >  - make DMA_ATTR_WRITE_COMBINE a no-op and schedule it for removal,
> > >    thus removing the last instances of arch_dma_mmap_pgprot
> > >
> > All sounds good to me, although I suppose 32-bit Arm platforms without
> > CONFIG_ARM_DMA_MEM_BUFFERABLE may run into issues if DMA_ATTR_WRITE_COMBINE
> > disappears. Only one way to find out...
>
> Looking at the results of grep, I think only OMAP2+ and Exynos may be
> affected.
>
> However, removing writecombine support from the DMA API is going to
> have a huge impact for framebuffers on earlier ARMs - that's where we
> do expect framebuffers to be mapped "uncached/buffered" for performance
> reasons and not "uncached/unbuffered". It's quite literally the
> difference between console scrolling being usable and totally unusable.
>
> Given what I've said above, switching to using buffered mode for normal
> DMA mappings is data-corrupting risky - as in your filesystem could get
> fried. I don't think we should play fast and loose with people's data
> by randomly changing that "because we'd like to", and I don't see that
> screwing the console is really an option either.

Sorry, I forgot to explain - the reason is that dma_alloc_writecombine()
internally uses DMA_ATTR_WRITE_COMBINE, which I'd forgotten about when
grepping - so there are potentially way more users than my greps above
found.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
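
The ring-buffer ordering concern raised in the thread (descriptor contents must be visible before the descriptor is handed over) can be sketched in userspace C. This is only an analogue under stated assumptions: the `struct desc` layout, `RING_SIZE` and the `ring_*` helpers are hypothetical names invented for illustration, and C11 release/acquire atomics stand in for the barrier a real driver would need (roughly the role dma_wmb() plays in the kernel). It does not model write buffers or device-visible memory.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 4u  /* hypothetical ring length */

/* Hypothetical descriptor: payload plus an ownership flag the consumer polls. */
struct desc {
    uint32_t addr;
    uint32_t len;
    atomic_uint owned_by_device;  /* 0 = producer owns, 1 = consumer owns */
};

struct ring {
    struct desc d[RING_SIZE];
    unsigned int head;            /* next slot the producer fills */
};

static void ring_init(struct ring *r)
{
    for (unsigned int i = 0; i < RING_SIZE; i++) {
        r->d[i].addr = 0;
        r->d[i].len = 0;
        atomic_init(&r->d[i].owned_by_device, 0u);
    }
    r->head = 0;
}

/* Fill the next descriptor, then hand it over. Returns 0, or -1 if the ring is full. */
static int ring_post(struct ring *r, uint32_t addr, uint32_t len)
{
    struct desc *d = &r->d[r->head % RING_SIZE];

    if (atomic_load_explicit(&d->owned_by_device, memory_order_acquire))
        return -1;  /* slot not yet returned by the consumer */

    d->addr = addr;
    d->len = len;

    /*
     * Release store: the payload stores above are ordered before the
     * ownership flag becomes visible. Without this ordering a consumer
     * could see the flag set while the payload is still stale.
     */
    atomic_store_explicit(&d->owned_by_device, 1u, memory_order_release);
    r->head++;
    return 0;
}

/* Consumer side: read a slot only once ownership has been handed over. */
static int ring_consume(struct ring *r, unsigned int slot,
                        uint32_t *addr, uint32_t *len)
{
    struct desc *d = &r->d[slot % RING_SIZE];

    if (!atomic_load_explicit(&d->owned_by_device, memory_order_acquire))
        return -1;  /* nothing posted in this slot */

    *addr = d->addr;
    *len = d->len;
    atomic_store_explicit(&d->owned_by_device, 0u, memory_order_release);
    return 0;
}
```

The sketch puts the ordering on the ownership flag rather than on every payload store, so the payload writes may be merged or reordered among themselves, but none of them can pass the hand-over.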