From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 19 Jun 2023 11:07:05 -0700
From: Jakub Kicinski
To: Jesper Dangaard Brouer
Cc: brouer@redhat.com, Alexander Duyck, Yunsheng Lin,
 davem@davemloft.net, pabeni@redhat.com, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, Lorenzo Bianconi, Yisen Zhuang,
 Salil Mehta, Eric Dumazet, Sunil Goutham, Geetha sowjanya,
 Subbaraya Sundeep, hariprasad, Saeed Mahameed, Leon Romanovsky,
 Felix Fietkau, Ryder Lee, Shayne Chen, Sean Wang, Kalle Valo,
 Matthias Brugger, AngeloGioacchino Del Regno, Jesper Dangaard Brouer,
 Ilias Apalodimas, linux-rdma@vger.kernel.org,
 linux-wireless@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-mediatek@lists.infradead.org, Jonathan Lemon
Subject: Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5]
 page_pool: remove PP_FLAG_PAGE_FRAG flag)
Message-ID: <20230619110705.106ec599@kernel.org>
References: <20230612130256.4572-1-linyunsheng@huawei.com>
 <20230612130256.4572-5-linyunsheng@huawei.com>
 <20230614101954.30112d6e@kernel.org>
 <8c544cd9-00a3-2f17-bd04-13ca99136750@huawei.com>
 <20230615095100.35c5eb10@kernel.org>
 <908b8b17-f942-f909-61e6-276df52a5ad5@huawei.com>
 <72ccf224-7b45-76c5-5ca9-83e25112c9c6@redhat.com>
 <20230616122140.6e889357@kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Mailing-List: linux-rdma@vger.kernel.org

On Fri, 16 Jun 2023 22:42:35 +0200 Jesper Dangaard Brouer wrote:
> > Former is better for huge pages, latter is better for IO mem
> > (peer-to-peer DMA). I wonder if you have different use case which
> > requires a different model :(
>
> I want for the network stack SKBs (and XDP) to support different memory
> types for the "head" frame and "data-frags". Eric have described this
> idea before, that hardware will do header-split, and we/he can get TCP
> data part is another page/frag, making it faster for TCP-streams, but
> this can be used for much more.
>
> My proposed use-cases involves more that TCP. We can easily imagine
> NVMe protocol header-split, and the data-frag could be a mem_type that
> actually belongs to the harddisk (maybe CPU cannot even read this). The
> same scenario goes for GPU memory, which is for the AI use-case. IIRC
> then Jonathan have previously send patches for the GPU use-case.
>
> I really hope we can work in this direction together,

Perfect, that's also the use case I had in mind. The huge page thing
was just a quick thing to implement as a PoC (although useful in its
own right, one day I'll find the time to finish it, sigh).

That said, I couldn't convince myself that for a peer-to-peer setup we
have enough space in struct page to store all the information we need.
Or that we'd get a struct page at all, and not just a region of memory
with no struct page * allocated :S

That'd require serious surgery on the page pool's fast paths to work
around.

I haven't dug into the details, though. If you think we can use page
pool as a frontend for io_uring and/or p2p memory that'd be awesome!

The workaround solution I had in mind would be to create a narrower API
for just data pages. Since we'd need to sprinkle ifs anyway, pull them
up close to the call site. That would allow swapping page pool for a
completely different implementation, like the one Jonathan coded up for
io_uring. Basically

$name_alloc_page(queue)
{
	if (queue->pp)
		return page_pool_dev_alloc_pages(queue->pp);
	else if (queue->iouring..)
		...
}
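To make the "narrower API for just data pages" idea concrete, here is a minimal userspace sketch of the dispatch described above. It is only an illustration under loud assumptions: `data_buf`, `pool_provider`, `alt_provider`, `rx_queue` and `rxq_alloc_data_buf()` are hypothetical names invented for this example, not kernel types or functions, and the two providers are toy stand-ins for the page pool and for an io_uring-style backend.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel types. */
enum buf_origin { BUF_FROM_POOL, BUF_FROM_ALT };

struct data_buf {
	enum buf_origin origin;
};

/* Toy model of the regular page pool that sits behind queue->pp. */
struct pool_provider {
	struct data_buf slot;	/* single buffer, enough for the sketch */
	int allocs;
};

/* Toy model of a completely different backend (e.g. io_uring-style). */
struct alt_provider {
	struct data_buf slot;
	int allocs;
};

/* A queue is wired to at most one provider; the other pointer stays NULL. */
struct rx_queue {
	struct pool_provider *pp;
	struct alt_provider *alt;
};

static struct data_buf *pool_alloc(struct pool_provider *pp)
{
	pp->allocs++;
	pp->slot.origin = BUF_FROM_POOL;
	return &pp->slot;
}

static struct data_buf *alt_alloc(struct alt_provider *alt)
{
	alt->allocs++;
	alt->slot.origin = BUF_FROM_ALT;
	return &alt->slot;
}

/*
 * The narrow data-buffer API: the provider check lives here, close to
 * the call site, so neither provider's fast path needs to know about
 * the other one.
 */
static struct data_buf *rxq_alloc_data_buf(struct rx_queue *q)
{
	assert(q != NULL);

	if (q->pp)
		return pool_alloc(q->pp);
	if (q->alt)
		return alt_alloc(q->alt);
	return NULL;	/* no provider configured for this queue */
}
```

A driver in this model would only ever call `rxq_alloc_data_buf()` for data buffers. Which provider backs a queue is decided at setup time by filling in one of the two pointers. Whether a real implementation would return a `struct page *` or some other handle is exactly the open question raised above, so the sketch deliberately uses its own opaque buffer type.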