From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0A655EB64DC for ; Tue, 11 Jul 2023 15:09:09 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id AA07382257; Tue, 11 Jul 2023 15:09:08 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org AA07382257 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=osuosl.org; s=default; t=1689088148; bh=I0qUOvE4UpyIeF1CStByDsbDBdWXMmycqW4pn56nqVI=; h=To:References:From:Date:In-Reply-To:Subject:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: Cc:From; b=1KjJTjh2ghVrOmndFsS41qL2p7VK+gbPGgV5U7QvjkRO7tqbokabD3J0RbObb3yYd DOk+1HFti2DnXXc8+uDRZjBnSvBq0Q8EXlnIvmqc4C9Us2QH42v4hyJMMGrK9cAPR0 yPWU3N6oKKGpFuwq+YH4FQZNj6q5o0XaWMar4UIKQbcERP7hPoROQp5nPcM5EZUFNN cm+mXqCZSXZr7xnOflV6bwOLlR29H+FoOeBSqTfR6AWTIMCt0z1XDtP3WjgS4xTSYN /uH4q+ncT3cm0uouhGiiFI8ijLYLCkukUYON/JS5Hp3daz1mnHrZs8XMEBirVG1Hfc gkbY+kaQa/hGw== X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 48Flytx_dMcc; Tue, 11 Jul 2023 15:09:06 +0000 (UTC) Received: from ash.osuosl.org (ash.osuosl.org [140.211.166.34]) by smtp1.osuosl.org (Postfix) with ESMTP id 64E69831D5; Tue, 11 Jul 2023 15:09:06 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 64E69831D5 Received: from smtp1.osuosl.org (smtp1.osuosl.org [140.211.166.138]) by ash.osuosl.org (Postfix) with ESMTP id 3A2EF1BF422 for ; Tue, 11 Jul 2023 11:39:38 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 0E62881E06 for ; Tue, 11 Jul 2023 11:39:38 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 0E62881E06 X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id aDfhFbPFwNYr for ; Tue, 11 Jul 2023 11:39:35 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 349808161D Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by smtp1.osuosl.org (Postfix) with ESMTPS id 349808161D for ; Tue, 11 Jul 2023 11:39:34 +0000 (UTC) Received: from dggpemm500005.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4R0f5q12x9zqW2M; Tue, 11 Jul 2023 19:38:55 +0800 (CST) Received: from [10.69.30.204] (10.69.30.204) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Tue, 11 Jul 2023 19:39:29 +0800 To: Alexander Lobakin , Yunsheng Lin References: <20230705155551.1317583-1-aleksander.lobakin@intel.com> <20230705155551.1317583-6-aleksander.lobakin@intel.com> <138b94a7-c186-bdd9-e073-2794760c9454@huawei.com> <09a9a9ef-cf77-3b60-2845-94595a42cf3e@intel.com> <71a8bab4-1a1d-cb1a-d75c-585a14c6fb2e@gmail.com> From: Yunsheng Lin Message-ID: <2ebd75df-51ff-4c62-2a68-d258dbf32b49@huawei.com> Date: Tue, 11 Jul 2023 19:39:28 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-Originating-IP: [10.69.30.204] X-ClientProxiedBy: dggems702-chm.china.huawei.com (10.3.19.179) To dggpemm500005.china.huawei.com (7.185.36.74) X-CFilter-Loop: Reflected X-Mailman-Approved-At: Tue, 11 Jul 2023 15:08:56 +0000 Subject: Re: [Intel-wired-lan] [PATCH RFC net-next v4 5/9] libie: add Rx buffer management (via Page Pool) X-BeenThere: intel-wired-lan@osuosl.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel Wired Ethernet Linux Kernel Driver Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Paul Menzel , Jesper Dangaard Brouer , Larysa Zaremba , netdev@vger.kernel.org, Alexander Duyck , Ilias Apalodimas , linux-kernel@vger.kernel.org, Eric Dumazet , Michal Kubiak , intel-wired-lan@lists.osuosl.org, David Christensen , Jakub Kicinski , Paolo Abeni , "David S. Miller" Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: intel-wired-lan-bounces@osuosl.org Sender: "Intel-wired-lan" On 2023/7/10 21:25, Alexander Lobakin wrote: > From: Yunsheng Lin > Date: Sun, 9 Jul 2023 13:16:33 +0800 > >> On 2023/7/7 0:28, Alexander Lobakin wrote: >>> From: Yunsheng Lin >>> Date: Thu, 6 Jul 2023 20:47:28 +0800 >>> >>>> On 2023/7/5 23:55, Alexander Lobakin wrote: >>>> >>>>> +/** >>>>> + * libie_rx_page_pool_create - create a PP with the default libie settings >>>>> + * @napi: &napi_struct covering this PP (no usage outside its poll loops) >>>>> + * @size: size of the PP, usually simply Rx queue len >>>>> + * >>>>> + * Returns &page_pool on success, casted -errno on failure. >>>>> + */ >>>>> +struct page_pool *libie_rx_page_pool_create(struct napi_struct *napi, >>>>> + u32 size) >>>>> +{ >>>>> + struct page_pool_params pp = { >>>>> + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV, >>>>> + .order = LIBIE_RX_PAGE_ORDER, >>>>> + .pool_size = size, >>>>> + .nid = NUMA_NO_NODE, >>>>> + .dev = napi->dev->dev.parent, >>>>> + .napi = napi, >>>>> + .dma_dir = DMA_FROM_DEVICE, >>>>> + .offset = LIBIE_SKB_HEADROOM, >>>> >>>> I think it worth mentioning that the '.offset' is not really accurate >>>> when the page is split, as we do not really know what is the offset of >>>> the frag of a page except for the first frag. >>> >>> Yeah, this is read as "offset from the start of the page or frag to the >>> actual frame start, i.e. its Ethernet header" or "this is just >>> xdp->data - xdp->data_hard_start". >> >> So the problem seems to be if most of drivers have a similar reading as >> libie does here, as .offset seems to have a clear semantics which is used >> to skip dma sync operation for buffer range that is not touched by the >> dma operation. Even if it happens to have the same value of "offset from >> the start of the page or frag to the actual frame start", I am not sure >> it is future-proofing to reuse it. > > Not sure I'm following :s It would be better to avoid accessing the internal data of the page pool directly as much as possible, as that may be changed to different meaning or removed if the implememtation is changed. If it is common enough that most drivers are using it the same way, adding a helper for that would be great. > >> >> When page frag is added, I didn't really give much thought about that as >> we use it in a cache coherent system. >> It seems we might need to extend or update that semantics if we really want >> to skip dma sync operation for all the buffer ranges that are not touched >> by the dma operation for page split case. >> Or Skipping dma sync operation for all untouched ranges might not be worth >> the effort, because it might need a per frag dma sync operation, which is >> more costly than a batched per page dma sync operation. If it is true, page >> pool already support that currently as my understanding, because the dma >> sync operation is only done when the last frag is released/freed. >> >>> >>>> >>>>> + }; >>>>> + size_t truesize; >>>>> + >>>>> + pp.max_len = libie_rx_sync_len(napi->dev, pp.offset); >> >> As mentioned above, if we depend on the last released/freed frag to do the >> dma sync, the pp.max_len might need to cover all the frag. > > ^^^^^^^^^^^^ > > You mean the whole page or...? If we don't care about the accurate dma syncing, "cover all the frag" means the whole page here, as page pool doesn't have enough info to do accurate dma sync for now. > I think it's not the driver's duty to track all this. We always set > .offset to `data - data_hard_start` and .max_len to the maximum > HW-writeable length for one frame. We don't know whether PP will give us > a whole page or just a piece. DMA sync for device is performed in the PP > core code as well. Driver just creates a PP and don't care about the > internals. There problem is that when page_pool_put_page() is called with a split page, the page pool does not know which frag is freeing too. setting 'maximum HW-writeable length for one frame' only sync the first frag of a page as below: https://elixir.free-electrons.com/linux/v6.4-rc6/source/net/core/page_pool.c#L325 _______________________________________________ Intel-wired-lan mailing list Intel-wired-lan@osuosl.org https://lists.osuosl.org/mailman/listinfo/intel-wired-lan