From: Baruch Siach <baruch@tkos.co.il>
To: Catalin Marinas, Petr Tesařík
Cc: Will Deacon, Christoph Hellwig, Marek Szyprowski, Robin Murphy,
 Ramon Fried, iommu@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] arm64: DMA zone above 4GB
Date: Sun, 12 Nov 2023 09:25:46 +0200
Message-ID: <877cmn7750.fsf@tarshish>
In-reply-to:
References: <9af8a19c3398e7dc09cfc1fbafed98d795d9f83e.1699464622.git.baruch@tkos.co.il>

Hi Catalin, Petr,

Thanks for your detailed response. See below a few comments and
questions.

On Thu, Nov 09 2023, Catalin Marinas wrote:
> On Wed, Nov 08, 2023 at 07:30:22PM +0200, Baruch Siach wrote:
>> Consider a bus with this 'dma-ranges' property:
>>
>>   #address-cells = <2>;
>>   #size-cells = <2>;
>>   dma-ranges = <0x00000000 0xc0000000 0x00000008 0x00000000 0x0 0x40000000>;
>>
>> Devices under this bus can see 1GB of DMA range between 3GB-4GB. This
>> range is mapped to CPU memory at 32GB-33GB.
>
> Is this on real hardware or just theoretical for now (the rest of your
> email implies it's real)? Normally I'd expect the first GB (or first
> two) of RAM from 32G to be aliased to the lower 32-bit range for the
> CPU view as well, not just for devices. You'd then get a ZONE_DMA
> without having to play with DMA offsets.

This hardware is currently in fabrication, past tapeout. Software tests
are running on FPGA models and software simulators.

The CPU view of the 3GB-4GB range is not linear with DMA addresses.
That is, for offset N where 0 <= N <= 1GB, the CPU address 3GB+N does
not map to the same physical location as DMA address 3GB+N. Hardware
engineers are not sure this is fixable. So, as is often the case, we
look to software to save us. After all, from a hardware perspective
this design "works".

>> Current zone_sizes_init() code considers 'dma-ranges' only when it maps
>> to RAM under 4GB, because zone_dma_bits is limited to 32. In this case
>> 'dma-ranges' is ignored in practice, since DMA/DMA32 zones are both
>> assumed to be located under 4GB. As a result, the GFP_DMA32 flag in
>> stmmac driver DMA buffer allocations has no effect, and the
>> allocations fail.
>>
>> The patch below is a crude workaround hack. It makes the DMA zone
>> cover the 1GB memory area that is visible to stmmac DMA as follows:
>>
>>   [    0.000000] Zone ranges:
>>   [    0.000000]   DMA      [mem 0x0000000800000000-0x000000083fffffff]
>>   [    0.000000]   DMA32    empty
>>   [    0.000000]   Normal   [mem 0x0000000840000000-0x0000000bffffffff]
>>   ...
>>   [    0.000000] software IO TLB: mapped [mem 0x000000083bfff000-0x000000083ffff000] (64MB)
>>
>> With this hack the stmmac driver works on my platform with no
>> modification.
>>
>> Clearly this can't be the right solution. zone_dma_bits is now wrong,
>> for one. It probably breaks other code as well.
>
> zone_dma_bits ends up as 36 if I counted correctly. So DMA_BIT_MASK(36)
> is 0xf_ffff_ffff and the phys_limit for your device is below this mask,
> so dma_direct_optimal_gfp_mask() does end up setting GFP_DMA. However,
> looking at how it sets GFP_DMA32, it is obvious that the code is not
> set up for such configurations. I'm also not a big fan of zone_dma_bits
> describing a mask that goes well above what the device can access.
>
> A workaround would be for zone_dma_bits to become a *_limit and sort
> out all places where we compare masks with masks derived from
> zone_dma_bits (e.g. cma_in_zone(), dma_direct_supported()).

I was also thinking along these lines. I wasn't sure I saw the entire
picture, so I hesitated to suggest a patch. Specifically, the assumption
that DMA range limits are powers of 2 looks deeply ingrained in the
code. Another assumption is that the DMA32 zone is in the low 4GB range.
I can work on an RFC implementation of this approach.

Petr suggested a more radical solution of per-bus DMA constraints to
replace the DMA/DMA32 zones. As Petr acknowledged, this does not look
like a near-future solution.
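To put numbers on the mask arithmetic above, here is a standalone sketch
(not kernel code; DMA_BIT_MASK mirrors the kernel macro, and dma_to_cpu
is a made-up helper for the window translation implied by the quoted
dma-ranges):

  /*
   * Standalone sketch of the dma-ranges translation and the mask
   * arithmetic discussed above. Build and run:
   *   cc -o dma-sketch dma-sketch.c && ./dma-sketch
   */
  #include <stdio.h>
  #include <stdint.h>

  /* From the quoted dma-ranges: <bus-addr cpu-addr size> */
  #define DMA_BASE  0x00000000c0000000ULL  /* device view: 3GB */
  #define CPU_BASE  0x0000000800000000ULL  /* CPU view: 32GB */
  #define WIN_SIZE  0x0000000040000000ULL  /* 1GB window */

  /* Mirrors the kernel's DMA_BIT_MASK() macro */
  #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

  /* Made-up helper; valid inside the 1GB window only */
  static uint64_t dma_to_cpu(uint64_t dma)
  {
          return dma - DMA_BASE + CPU_BASE;
  }

  int main(void)
  {
          uint64_t zone_last = CPU_BASE + WIN_SIZE - 1;  /* 0x83fffffff */

          printf("DMA %#llx -> CPU %#llx\n",
                 (unsigned long long)DMA_BASE,
                 (unsigned long long)dma_to_cpu(DMA_BASE));
          /* fls64(0x83fffffff) == 36, so zone_dma_bits becomes 36 ... */
          printf("zone ends at CPU %#llx; DMA_BIT_MASK(36) = %#llx\n",
                 (unsigned long long)zone_last,
                 (unsigned long long)DMA_BIT_MASK(36));
          /* ... a mask spanning 64GB, far more than the 1GB window. */
          return 0;
  }

Running it shows DMA_BIT_MASK(36) = 0xf_ffff_ffff covering 64GB of
address space while the device can only reach the 1GB window, which is
exactly the mask-versus-limit mismatch described above.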
> Alternatively, take the DMA offset into account when comparing the
> physical address corresponding to zone_dma_bits and keep zone_dma_bits
> small (phys offset subtracted, so in your case it would be 30 rather
> than 36).

I am not following here. Care to elaborate?

Thanks,
baruch

--
     ~. .~   Tk Open Systems
=}------------------------------------------------ooO--U--Ooo------------{=
   - baruch@tkos.co.il - tel: +972.52.368.4656, http://www.tkos.co.il -
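For reference, one possible reading of the "30 rather than 36" remark,
as a sketch (this is an interpretation, not confirmed in the thread;
bits_needed stands in for the kernel's fls64, and the constants come
from the dma-ranges quoted earlier):

  /* Sketch of the 30-vs-36 bit arithmetic; not kernel code. */
  #include <stdio.h>
  #include <stdint.h>

  #define CPU_BASE  0x0000000800000000ULL  /* phys offset: 32GB */
  #define WIN_SIZE  0x0000000040000000ULL  /* window size: 1GB */

  /* Stand-in for the kernel's fls64(): bits needed to hold 'last' */
  static unsigned int bits_needed(uint64_t last)
  {
          unsigned int r = 0;

          while (last) {
                  r++;
                  last >>= 1;
          }
          return r;
  }

  int main(void)
  {
          uint64_t last_abs = CPU_BASE + WIN_SIZE - 1;  /* 0x83fffffff */
          uint64_t last_rel = last_abs - CPU_BASE;      /* 0x3fffffff */

          /* Absolute CPU address: zone_dma_bits would have to be 36. */
          printf("absolute: %#llx -> %u bits\n",
                 (unsigned long long)last_abs, bits_needed(last_abs));
          /* Phys offset subtracted: 30 bits (2^30 = 1GB) suffice. */
          printf("relative: %#llx -> %u bits\n",
                 (unsigned long long)last_rel, bits_needed(last_rel));
          return 0;
  }

On this reading, zone_dma_bits would describe the zone's extent relative
to the start of RAM rather than an absolute address, letting it stay
small (30) even when the zone itself sits above 4GB.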