Date: Wed, 12 Jul 2023 10:01:08 -0700
From: Jakub Kicinski
To: Jesper Dangaard Brouer
Cc: Yunsheng Lin, brouer@redhat.com, netdev@vger.kernel.org,
 almasrymina@google.com, hawk@kernel.org, ilias.apalodimas@linaro.org,
 edumazet@google.com, dsahern@gmail.com, michael.chan@broadcom.com,
 willemb@google.com
Subject: Re: [RFC 00/12] net: huge page backed page_pool
Message-ID: <20230712100108.00bee44f@kernel.org>
In-Reply-To: <8b50a49e-5df8-dccd-154e-4423f0e8eda5@redhat.com>
References: <20230707183935.997267-1-kuba@kernel.org>
 <1721282f-7ec8-68bd-6d52-b4ef209f047b@redhat.com>
 <20230711170838.08adef4c@kernel.org>
 <8b50a49e-5df8-dccd-154e-4423f0e8eda5@redhat.com>

On Wed, 12 Jul 2023 14:43:32 +0200 Jesper Dangaard Brouer wrote:
> On 12/07/2023 13.47, Yunsheng Lin wrote:
> > On 2023/7/12 8:08, Jakub Kicinski wrote:
> >> Oh, I split the page into individual 4k pages after DMA mapping.
> >> There's no need for the host memory to be a huge page. I mean,
> >> the actual kernel identity mapping is a huge page AFAIU, and the
> >> struct pages are allocated, anyway. We just need it to be a huge
> >> page at DMA mapping time.
> >>
> >> So the pages from the huge page provider only differ from normal
> >> alloc_page() pages by the fact that they are a part of a 1G DMA
> >> mapping.
>
> So, Jakub you are saying the PP refcnt's are still done "as usual" on
> individual pages.

Yes - other than coming from a specific 1G of physical memory the
resulting pages are really pretty ordinary 4k pages.

> > If it is about DMA mapping, is it possible to use dma_map_sg()
> > to enable a big continuous dma map for a lot of discontinuous
> > 4k pages to avoid allocating a big huge page?
> >
> > As the comment says:
> > "The scatter gather list elements are merged together (if possible)
> > and tagged with the appropriate dma address and length."
> >
> > https://elixir.free-electrons.com/linux/v4.16.18/source/arch/arm/mm/dma-mapping.c#L1805
>
> This is interesting for two reasons.
>
> (1) if this DMA merging helps IOTLB misses (?)

Maybe I misunderstand how IOMMU / virtual addressing works, but I don't
see how one can merge mappings from physically non-contiguous pages.
IOW we can't get 1G-worth of random 4k pages and hope that through some
magic they get strung together and share an IOTLB entry (if that's
where Yunsheng's suggestion was going..)
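
To make the "split after mapping" scheme concrete - a rough, untested
sketch of the idea (hp_chunk and the helpers are names made up for
illustration here, not what the RFC series actually uses):

#include <linux/dma-mapping.h>
#include <linux/mm.h>
#include <linux/sizes.h>

struct hp_chunk {
	struct page	*hp;		/* the 1G huge page */
	dma_addr_t	dma_base;	/* DMA address of the whole chunk */
};

/* One dma_map_page() call covers the entire 1G range, so an IOMMU
 * that supports 1G mappings can back it with a single IOTLB entry.
 */
static int hp_chunk_map(struct device *dev, struct hp_chunk *c)
{
	c->dma_base = dma_map_page(dev, c->hp, 0, SZ_1G, DMA_BIDIRECTIONAL);
	return dma_mapping_error(dev, c->dma_base) ? -ENOMEM : 0;
}

/* Hand out an ordinary 4k page; its DMA address is just an offset
 * into the already-mapped range, no per-page dma_map call needed.
 */
static struct page *hp_chunk_page(struct hp_chunk *c, unsigned int idx,
				  dma_addr_t *dma)
{
	*dma = c->dma_base + (dma_addr_t)idx * PAGE_SIZE;
	return c->hp + idx;	/* struct pages are contiguous in vmemmap */
}

The single 1G mapping is exactly what you can't construct out of
random 4k pages after the fact - the physical contiguity has to be
there before the map call.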
> (2) PP could use dma_map_sg() to amortize dma_map call cost.
>
> For case (2) __page_pool_alloc_pages_slow() already does bulk allocation
> of pages (alloc_pages_bulk_array_node()), and then loops over the pages
> to DMA map them individually. It seems like an obvious win to use
> dma_map_sg() here?

That could well be worth investigating!
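
FWIW a rough, untested sketch of what that could look like
(pp_map_bulk is a made-up helper, not page_pool code; a real version
would also have to keep the scatterlist around for dma_unmap_sg() at
teardown, which is skipped here):

#include <linux/dma-mapping.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

static int pp_map_bulk(struct device *dev, struct page **pages,
		       unsigned int nr, dma_addr_t *dma)
{
	struct scatterlist *sgl, *sg;
	unsigned int i, j = 0;
	int mapped;

	sgl = kmalloc_array(nr, sizeof(*sgl), GFP_KERNEL);
	if (!sgl)
		return -ENOMEM;

	sg_init_table(sgl, nr);
	for (i = 0; i < nr; i++)
		sg_set_page(&sgl[i], pages[i], PAGE_SIZE, 0);

	/* One call into the DMA API instead of nr dma_map_page() calls;
	 * DMA_FROM_DEVICE stands in for the pool's configured direction.
	 */
	mapped = dma_map_sg(dev, sgl, nr, DMA_FROM_DEVICE);
	if (!mapped) {
		kfree(sgl);
		return -EIO;
	}

	/* Entries may have been merged, so split them back out into the
	 * per-page DMA addresses that page_pool stores per page.
	 */
	for_each_sg(sgl, sg, mapped, i) {
		dma_addr_t addr = sg_dma_address(sg);
		unsigned int len = sg_dma_len(sg);

		for (; len >= PAGE_SIZE && j < nr; len -= PAGE_SIZE) {
			dma[j++] = addr;
			addr += PAGE_SIZE;
		}
	}
	return 0;
}

Whether one dma_map_sg() call actually beats nr dma_map_page() calls
presumably depends on the IOMMU driver - the per-entry IOMMU work
doesn't go away, only the call overhead.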