Date: Fri, 31 Mar 2023 15:17:12 -0700
From: Jakub Kicinski
To: Jesper Dangaard Brouer
Cc: davem@davemloft.net, brouer@redhat.com, netdev@vger.kernel.org,
 edumazet@google.com, pabeni@redhat.com, hawk@kernel.org,
 ilias.apalodimas@linaro.org
Subject: Re: [RFC net-next 1/2] page_pool: allow caching from safely localized NAPI
Message-ID: <20230331151712.2ccd8317@kernel.org>
In-Reply-To: <20230331120643.09ce0e59@kernel.org>
References: <20230331043906.3015706-1-kuba@kernel.org>
 <20230331120643.09ce0e59@kernel.org>

On Fri, 31 Mar 2023 12:06:43 -0700 Jakub Kicinski wrote:
> > Makes sense, but do read the comment above struct pp_alloc_cache.
> > The sizing of pp_alloc_cache is important for this trick/heuristic to
> > work, meaning the pp_alloc_cache has enough room.
> > It is definitely on purpose that pp_alloc_cache has 128 elements and is
> > only refilled with 64 elements, which leaves room for this kind of
> > trick. But if Eric's deferred skb freeing has more than 64 pages to
> > free, then we will likely fall back to ptr_ring recycling.
> >
> > Code-wise, I suggest that you/we change page_pool_put_page_bulk() to
> > have a variant that takes 'allow_direct' (looking at the code below,
> > you might already do this, as this patch over-steers 'allow_direct').
> > Using the bulk API would then bulk into the ptr_ring in the cases
> > where we cannot use the direct cache.
>
> Interesting point, let me re-run some tests with the statistics enabled.
> For a simple stream test I think it may just be too steady to trigger
> over/underflow. Each skb will carry at most 18 pages, and the driver
> should only produce 64 packets / consume 64 pages. Each NAPI cycle will
> start by flushing the deferred free. So unless there is a hiccup on
> either the app or the NAPI side, the flows of pages in each direction
> should be steady enough to do well with just 128 cache entries. Let me
> get the data and report back.
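For the sake of discussion, the kind of variant suggested above could look
roughly like the sketch below. The name and the exact plumbing are made up
for illustration, not the in-tree API; the idea is just to keep the existing
ptr_ring bulking when direct recycling is not safe, and to fall back to
per-page recycling into the lockless cache when it is.

#include <linux/mm.h>           /* virt_to_head_page() */
#include <net/page_pool.h>

/* Hypothetical helper, not the in-tree API: bulk-free a batch of pages
 * while honouring an allow_direct hint from the caller. When direct
 * recycling is not safe, keep today's behaviour and bulk into the
 * ptr_ring; when it is safe, feed pages into the lockless pool->alloc
 * cache one by one, and page_pool itself falls back to the ring once
 * the 128-entry cache is full (the cache_full counter in the stats
 * below).
 */
static void page_pool_put_bulk_maybe_direct(struct page_pool *pool,
                                            void **data, int count,
                                            bool allow_direct)
{
        int i;

        if (!allow_direct) {
                /* Existing bulk path: one producer-lock of the ptr_ring
                 * for the whole batch.
                 */
                page_pool_put_page_bulk(pool, data, count);
                return;
        }

        /* Caller runs in the page pool's own NAPI context, so the
         * per-softirq direct cache can be used without locking.
         */
        for (i = 0; i < count; i++)
                page_pool_put_full_page(pool, virt_to_head_page(data[i]),
                                        true);
}

Call sites that know they are running in the pool's own NAPI context would
pass allow_direct = true; everything else keeps passing false and behaves
exactly as today.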
I patched the driver a bit to use the page pool for HW-GRO. Below are the
recycle stats with HW-GRO and with SW GRO + XDP_PASS (one packet per page).

                           HW-GRO                    page=page
                      before      after         before      after
recycle:
  cached:                  0  138669686              0  150197505
  cache_full:              0     223391              0      74582
  ring:            138551933    9997191      149299454          0
  ring_full:               0        488           3154     127590
  released_refcnt:         0          0              0          0

alloc:
  fast:            136491361  148615710      146969587  150322859
  slow:                 1772       1799            144        105
  slow_high_order:         0          0              0          0
  empty:                1772       1799            144        105
  refill:            2165245     156302        2332880       2128
  waive:                   0          0              0          0

Which seems to confirm that for this trivial test the cache sizing is good
enough, and I won't see any benefit from batching (the cache is only full
0.16% and 0.05% of the time, respectively).
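For reference, the 0.16% / 0.05% figures are cache_full as a fraction of
all recycle events that hit the cache path, i.e. cache_full /
(cached + cache_full) from the "after" columns. A trivial userspace check,
with the numbers copied from the table above:

#include <stdio.h>

/* Reproduce the cache-full ratios quoted above from the "after" columns
 * of the recycle stats: cache_full / (cached + cache_full).
 */
int main(void)
{
        double hw_gro = 223391.0 / (138669686.0 + 223391.0);
        double pg_pg  =  74582.0 / (150197505.0 +  74582.0);

        printf("HW-GRO:    cache full on %.2f%% of recycles\n", hw_gro * 100.0);
        printf("page=page: cache full on %.2f%% of recycles\n", pg_pg * 100.0);
        return 0;
}

Both ratios are well below the point where batching the ptr_ring spill
would make a measurable difference for this workload.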