From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BA86CC77B61 for ; Thu, 13 Apr 2023 04:45:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Message-ID:Date:Content-ID: Content-Type:MIME-Version:Subject:Cc:To:References:In-Reply-To:From:Reply-To: Content-Transfer-Encoding:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=DqjVx8DdVE7yyHJKc/GaG9WLw8shGI6NpWmfl69SUoE=; b=JzM2plWhItzWrq0KVaxCpMVRxu J8sJd1cV3hdv9WaPCPa97RE/YgCF64mtCQQOa/WJw3cAkzQFDoUSul3uwq61J8iOv15feuk3yvPap 3xoeJKO4J1qryUaEEaGH/r4gop/MrvviSKaPY8dWc2ihdp6Gd0cyIRO86Mp1kXkN60JXihQcC2wHq 3W6uLZIfQwNtrUAPjhP52JOuMTF1fNzNKkNfDM4Cu5I4V7CocxWUE7CZ0etnD0VZBs31PwwXGYWIk L/+m+liK42TTV2wodteodzlEraMcHYJydHpKzaL5Xbnr72pfBb3/oMv1yjiLqBMqe0pixjArQS9SX s2i9BdGQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1pmopz-0051FA-2x; Thu, 13 Apr 2023 04:45:35 +0000 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1pmje4-004bhb-1P for linux-nvme@lists.infradead.org; Wed, 12 Apr 2023 23:12:59 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1681341171; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=DqjVx8DdVE7yyHJKc/GaG9WLw8shGI6NpWmfl69SUoE=; b=bOyUPGtujdSlzeygurislRRvLpTQ85GAd4UMhkuI8lKbgCkz8XcxkgER+OEFor7Hl1WTOR bKSGT/QXqa/DR1Dfu+z+2H+kf2QV7g0apQGxL2b0iYEoZ+ITLRzzkTteE4zG07bYx5tmnE 2zA32RAXw35Q1iCH1Y2+0FKeUZEtPDM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-578-RzSsyzYVPU2T4nYnBVXo7Q-1; Wed, 12 Apr 2023 19:12:47 -0400 X-MC-Unique: RzSsyzYVPU2T4nYnBVXo7Q-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.rdu2.redhat.com [10.11.54.6]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D72A7101A550; Wed, 12 Apr 2023 23:12:45 +0000 (UTC) Received: from warthog.procyon.org.uk (unknown [10.33.36.177]) by smtp.corp.redhat.com (Postfix) with ESMTP id 78B9A2166B26; Wed, 12 Apr 2023 23:12:41 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 From: David Howells In-Reply-To: References: <20230411160902.4134381-1-dhowells@redhat.com> <20230411160902.4134381-5-dhowells@redhat.com> To: Christoph Hellwig Cc: dhowells@redhat.com, "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Willem de Bruijn , David Ahern , Matthew Wilcox , Al Viro , Jens Axboe , Jeff Layton , Christian Brauner , Chuck Lever III , Linus Torvalds , Jeroen de Borst , Catherine Sullivan , Shailend Chand , Felix Fietkau , John Crispin , Sean Wang , Mark Lee , Lorenzo Bianconi , Matthias Brugger , AngeloGioacchino Del Regno , Keith Busch , Jens Axboe , Christoph Hellwig , Sagi Grimberg , Chaitanya Kulkarni , Andrew Morton , netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-mediatek@lists.infradead.org, linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH net-next v6 04/18] mm: Make the page_frag_cache allocator use per-cpu MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <399349.1681341160.1@warthog.procyon.org.uk> Date: Thu, 13 Apr 2023 00:12:40 +0100 Message-ID: <399350.1681341160@warthog.procyon.org.uk> X-Scanned-By: MIMEDefang 3.1 on 10.11.54.6 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20230412_161256_557520_465A205F X-CRM114-Status: GOOD ( 34.93 ) X-Mailman-Approved-At: Wed, 12 Apr 2023 21:45:34 -0700 X-BeenThere: linux-nvme@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Linux-nvme" Errors-To: linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org Christoph Hellwig wrote: > On Tue, Apr 11, 2023 at 05:08:48PM +0100, David Howells wrote: > > Make the page_frag_cache allocator have a separate allocation bucket for > > each cpu to avoid racing. This means that no lock is required, other than > > preempt disablement, to allocate from it, though if a softirq wants to > > access it, then softirq disablement will need to be added. > ... > Let me ask a third time as I've not got an answer the last two times: Sorry about that. I think the problem is that the copy of the message from you directly to me arrives after the first copy that comes via a mailing list and google then deletes the direct one - as obviously no one could possibly want duplicates, right? :-/ - and so you usually get consigned to the linux-kernel or linux-fsdevel mailing list folder. > > Make the NVMe, mediatek and GVE drivers pass in NULL to page_frag_cache() > > and use the default allocation buckets rather than defining their own. > > why are these callers treated different from the others? There are only four users of struct page_frag_cache, the one these patches modify:: (1) GVE. (2) Mediatek. (3) NVMe. (4) skbuff. Note that things are slightly confused by there being three very similarly named frag allocators (page_frag and page_frag_1k in addition to page_frag_cache) and the __page_frag_cache_drain() function gets used for things other than just page_frag_cache. I've replaced the single allocation buckets with per-cpu allocation buckets for (1), (2) and (3) so that no locking[*] is required other than pinning it to the cpu temporarily - but I can't test them as I don't have hardware. [*] Note that what's upstream doesn't have locking, and I'm not sure all the users of it are SMP-safe. That leaves (4). Upstream, skbuff.c creates two separate per-cpu frag caches and I've elected to retain that, except that the per-cpu bits are now inside the frag allocator as I'm not entirely sure of the reason that there's a separate napi frag cache to the netdev_alloc_cache. The general page_frag_cache allocator is used by skb_splice_from_iter() if it encounters a page it can't take a ref on, so it has been tested through that using sunrpc, sunrpc+siw and cifs+siw. > Can you show any performance numbers? As far as I can tell, it doesn't make any obvious difference to directly pumping data through TCP or TLS over TCP or transferring data over a network filesystem such as sunrpc or cifs using siw/TCP. I've tested this between two machines over a 1G and a 10G link. I can generate some actual numbers tomorrow. Actually, I probably can drop these patches 2-4 from this patchset and just use the netdev_alloc_cache in skb_splice_from_iter() for now. Since that copies unspliceable data, I no longer need to allocate frags in the next layer up. David