Date: Thu, 8 Aug 2024 14:46:54 +0100
From: Catalin Marinas
To: Petr Tesařík
Cc: Baruch Siach, Christoph Hellwig, Marek Szyprowski, Will Deacon,
 Robin Murphy, iommu@lists.linux.dev,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 Ramon Fried, Elad Nachman
Subject: Re: [PATCH v5 2/3] dma: replace zone_dma_bits by zone_dma_limit
References: <5821a1b2eb82847ccbac0945da040518d6f6f16b.1722578375.git.baruch@tkos.co.il>
 <20240807161938.5729b656@mordecai.tesarici.cz>
 <20240808113501.4fde4cb0@mordecai.tesarici.cz>
In-Reply-To: <20240808113501.4fde4cb0@mordecai.tesarici.cz>

On Thu, Aug 08, 2024 at 11:35:01AM +0200, Petr Tesařík wrote:
> On Wed, 7 Aug 2024 19:14:58 +0100
> Catalin Marinas wrote:
> > With ZONE_DMA32, since all the DMA code assumes that ZONE_DMA32 ends at
> > 4GB CPU address, it doesn't really work for such platforms. If there are
> > 32-bit devices with a corresponding CPU address offset, ZONE_DMA32
> > should end at 36GB on Baruch's platform. But to simplify things, we just
> > ignore this on arm64 and make ZONE_DMA32 empty.
>
> Ah. That makes sense. It also seems to support my theory that Linux
> memory zones are an obsolete concept and should be replaced by a
> different mechanism.

I agree, they are too coarse-grained.
From an API perspective, what we need is an alloc_pages() that takes a
DMA mask or phys address limit, maybe something similar to
memblock_alloc_range_nid(). OTOH, an advantage of the zones is that you
keep the lower memory free by using ZONE_NORMAL as the default, and you
have free lists per zone. Maybe with some alternative data structures we
could efficiently search free pages based on phys ranges or bitmasks and
get rid of the zones, but I haven't put any thought into it.

We'd still need some boundaries like *_dma_get_max_cpu_address() to at
least allocate an swiotlb buffer that's suitable for all devices.

> > In some cases where we have the device structure we could instead do a
> > dma_to_phys(DMA_BIT_MASK(32)) but not in the two cases above. I guess if
> > we really want to address this properly, we'd need to introduce a
> > zone_dma32_limit that's initialised by the arch code. For arm64, I'm
> > happy with just having an empty ZONE_DMA32 on such platforms.
>
> The obvious caveat is that zone boundaries are system-wide, but the
> mapping between bus addresses and CPU addresses depends on the device
> structure. After all, that's why dma_to_phys takes the device as a
> parameter... In fact, a system may have multiple busses behind
> different bridges with a different offset applied by each.

Indeed, and as Robin mentioned, the ACPI/DT code already handles this.

> FYI I want to make more people aware of these issues at this year's
> Plumbers, see https://lpc.events/event/18/contributions/1776/

Looking forward to this. I'll dial in, unfortunately can't make Plumbers
in person this year.

In the meantime, I think this series is a good compromise ;).

-- 
Catalin
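
For reference, the 36GB figure in the thread is plain offset arithmetic: a 32-bit DMA mask added to the CPU address at which the bus's address 0 appears. The sketch below is a userspace illustration only. `struct bus_map` and `phys_limit()` are hypothetical names, not kernel APIs, and the 32GB offset is an assumption inferred from the 36GB figure quoted above. Only `DMA_BIT_MASK()` mirrors the kernel macro of the same name.

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Hypothetical stand-in for a per-device bus-to-CPU translation, reduced
 * to a single constant offset. Real systems describe this per device
 * (for example through DT or ACPI), as discussed in the thread.
 */
struct bus_map {
	uint64_t cpu_offset;	/* CPU address that bus address 0 maps to */
};

/*
 * Highest CPU physical address a device can reach with a DMA mask of
 * 'bits' bits. Saturates at UINT64_MAX instead of wrapping around.
 */
static uint64_t phys_limit(const struct bus_map *map, unsigned int bits)
{
	uint64_t mask;

	assert(bits >= 1 && bits <= 64);
	mask = DMA_BIT_MASK(bits);

	if (mask > UINT64_MAX - map->cpu_offset)
		return UINT64_MAX;

	return map->cpu_offset + mask;
}
```

With `cpu_offset` set to 32GB, `phys_limit(&map, 32)` yields the last byte below 36GB, while a zero offset yields the last byte below 4GB. A single system-wide zone boundary cannot express both limits at once, which is the coarseness being discussed.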