Message-ID: <86c56c9a88d07efbfe1e85bec678e86704588a15.camel@gmail.com>
Subject: Re: [PATCH net-next v9 02/13] mm: move the page fragment allocator from page_alloc into its own file
From: Alexander H Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin, davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells, Andrew Morton, linux-mm@kvack.org
Date: Mon, 01 Jul 2024 16:10:19 -0700
In-Reply-To: <20240625135216.47007-3-linyunsheng@huawei.com>
References: <20240625135216.47007-1-linyunsheng@huawei.com> <20240625135216.47007-3-linyunsheng@huawei.com>
On Tue, 2024-06-25 at 21:52 +0800, Yunsheng Lin wrote:
> Inspired by [1], move the page fragment allocator from page_alloc
> into its own c file and header file, as we are about to make more
> change for it to replace another page_frag implementation in
> sock.c
>
> 1. https://lore.kernel.org/all/20230411160902.4134381-3-dhowells@redhat.com/
>
> CC: David Howells
> CC: Alexander Duyck
> Signed-off-by: Yunsheng Lin

So one thing that I think might have been overlooked in the previous
reviews is the fact that the headers weren't necessarily
self-sufficient. You were introducing dependencies that had to be
fulfilled by other headers. One thing you might try doing as part of
your testing would be to add a C file that includes only your header
and calls your functions, to verify that there aren't any missing
includes.

> ---
>  include/linux/gfp.h             |  22 -----
>  include/linux/mm_types.h        |  18 ----
>  include/linux/page_frag_cache.h |  47 +++++++++++
>  include/linux/skbuff.h          |   1 +
>  mm/Makefile                     |   1 +
>  mm/page_alloc.c                 | 136 ------------------------------
>  mm/page_frag_cache.c            | 144 ++++++++++++++++++++++++++++++++
>  mm/page_frag_test.c             |   1 +
>  8 files changed, 194 insertions(+), 176 deletions(-)
>  create mode 100644 include/linux/page_frag_cache.h
>  create mode 100644 mm/page_frag_cache.c
>

...

> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> new file mode 100644
> index 000000000000..3a44bfc99750
> --- /dev/null
> +++ b/include/linux/page_frag_cache.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _LINUX_PAGE_FRAG_CACHE_H
> +#define _LINUX_PAGE_FRAG_CACHE_H
> +
> +#include <linux/gfp_types.h>
> +

The gfp_types.h only really gives you the values you pass to the
gfp_mask. Did you mean to include linux/types.h to get the gfp_t
typedef?

> +#define PAGE_FRAG_CACHE_MAX_SIZE	__ALIGN_MASK(32768, ~PAGE_MASK)

You should probably include linux/align.h to pull in __ALIGN_MASK.

> +#define PAGE_FRAG_CACHE_MAX_ORDER	get_order(PAGE_FRAG_CACHE_MAX_SIZE)

I am pretty sure get_order is from asm/page.h as well.
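The kind of check I mean could be as small as one throwaway translation
unit that includes nothing but the new header before naming its types
and functions, so any dependency the header forgets to pull in fails the
build right there. A rough sketch (hypothetical file name, compile-only,
built against the functions this patch exports):

```
/* Hypothetical mm/page_frag_cache_check.c -- compile-only test.
 * The header under test is included first and alone, so a missing
 * transitive include (gfp_t, __u16, __ALIGN_MASK(), get_order(), ...)
 * shows up as a build error here instead of in some unlucky user.
 */
#include <linux/page_frag_cache.h>

static struct page_frag_cache check_nc;

/* Touch every exported symbol once; this is never meant to run. */
void page_frag_cache_header_check(void)
{
	void *va = page_frag_alloc(&check_nc, 64, GFP_ATOMIC);

	if (va)
		page_frag_free(va);
	page_frag_cache_drain(&check_nc);
}
```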
> +
> +struct page_frag_cache {
> +	void *va;
> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)

I am pretty sure PAGE_SIZE is included from asm/page.h

> +	__u16 offset;
> +	__u16 size;
> +#else
> +	__u32 offset;
> +#endif
> +	/* we maintain a pagecount bias, so that we dont dirty cache line
> +	 * containing page->_refcount every time we allocate a fragment.
> +	 */
> +	unsigned int pagecnt_bias;
> +	bool pfmemalloc;
> +};
> +
> +void page_frag_cache_drain(struct page_frag_cache *nc);
> +void __page_frag_cache_drain(struct page *page, unsigned int count);
> +void *__page_frag_alloc_align(struct page_frag_cache *nc, unsigned int fragsz,
> +			      gfp_t gfp_mask, unsigned int align_mask);
> +
> +static inline void *page_frag_alloc_align(struct page_frag_cache *nc,
> +					  unsigned int fragsz, gfp_t gfp_mask,
> +					  unsigned int align)
> +{
> +	WARN_ON_ONCE(!is_power_of_2(align));

To get is_power_of_2 you should be including linux/log2.h.

> +	return __page_frag_alloc_align(nc, fragsz, gfp_mask, -align);
> +}
> +
> +static inline void *page_frag_alloc(struct page_frag_cache *nc,
> +				    unsigned int fragsz, gfp_t gfp_mask)
> +{
> +	return __page_frag_alloc_align(nc, fragsz, gfp_mask, ~0u);
> +}
> +
> +void page_frag_free(void *addr);
> +
> +#endif
>

...

> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> new file mode 100644
> index 000000000000..88f567ef0e29
> --- /dev/null
> +++ b/mm/page_frag_cache.c
> @@ -0,0 +1,144 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/* Page fragment allocator
> + *
> + * Page Fragment:
> + *  An arbitrary-length arbitrary-offset area of memory which resides within a
> + *  0 or higher order page.  Multiple fragments within that page are
> + *  individually refcounted, in the page's reference counter.
> + *
> + * The page_frag functions provide a simple allocation framework for page
> + * fragments.  This is used by the network stack and network device drivers to
> + * provide a backing region of memory for use as either an sk_buff->head, or to
> + * be used in the "frags" portion of skb_shared_info.
> + */
> +
> +#include
> +#include
> +#include
> +#include
> +#include "internal.h"

You could probably include gfp_types.h here since this is where you
are using the GFP_XXX values.

> +
> +static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> +					     gfp_t gfp_mask)
> +{
> +	struct page *page = NULL;
> +	gfp_t gfp = gfp_mask;
> +
> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> +	gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
> +		   __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> +	page = alloc_pages_node(NUMA_NO_NODE, gfp_mask,
> +				PAGE_FRAG_CACHE_MAX_ORDER);
> +	nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE;
> +#endif
> +	if (unlikely(!page))
> +		page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
> +
> +	nc->va = page ? page_address(page) : NULL;
> +
> +	return page;
> +}
> +
> +void page_frag_cache_drain(struct page_frag_cache *nc)
> +{
> +	if (!nc->va)
> +		return;
> +
> +	__page_frag_cache_drain(virt_to_head_page(nc->va), nc->pagecnt_bias);
> +	nc->va = NULL;
> +}
> +EXPORT_SYMBOL(page_frag_cache_drain);
> +
> +void __page_frag_cache_drain(struct page *page, unsigned int count)
> +{
> +	VM_BUG_ON_PAGE(page_ref_count(page) == 0, page);
> +
> +	if (page_ref_sub_and_test(page, count))
> +		free_unref_page(page, compound_order(page));
> +}
> +EXPORT_SYMBOL(__page_frag_cache_drain);
> +
> +void *__page_frag_alloc_align(struct page_frag_cache *nc,
> +			      unsigned int fragsz, gfp_t gfp_mask,
> +			      unsigned int align_mask)
> +{
> +	unsigned int size = PAGE_SIZE;
> +	struct page *page;
> +	int offset;
> +
> +	if (unlikely(!nc->va)) {
> +refill:
> +		page = __page_frag_cache_refill(nc, gfp_mask);
> +		if (!page)
> +			return NULL;
> +
> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> +		/* if size can vary use size else just use PAGE_SIZE */
> +		size = nc->size;
> +#endif
> +		/* Even if we own the page, we do not use atomic_set().
> +		 * This would break get_page_unless_zero() users.
> +		 */
> +		page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> +
> +		/* reset page count bias and offset to start of new frag */
> +		nc->pfmemalloc = page_is_pfmemalloc(page);
> +		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> +		nc->offset = size;
> +	}
> +
> +	offset = nc->offset - fragsz;
> +	if (unlikely(offset < 0)) {
> +		page = virt_to_page(nc->va);
> +
> +		if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
> +			goto refill;
> +
> +		if (unlikely(nc->pfmemalloc)) {
> +			free_unref_page(page, compound_order(page));
> +			goto refill;
> +		}
> +
> +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> +		/* if size can vary use size else just use PAGE_SIZE */
> +		size = nc->size;
> +#endif
> +		/* OK, page count is 0, we can safely set it */
> +		set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
> +
> +		/* reset page count bias and offset to start of new frag */
> +		nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> +		offset = size - fragsz;
> +		if (unlikely(offset < 0)) {
> +			/*
> +			 * The caller is trying to allocate a fragment
> +			 * with fragsz > PAGE_SIZE but the cache isn't big
> +			 * enough to satisfy the request, this may
> +			 * happen in low memory conditions.
> +			 * We don't release the cache page because
> +			 * it could make memory pressure worse
> +			 * so we simply return NULL here.
> +			 */
> +			return NULL;
> +		}
> +	}
> +
> +	nc->pagecnt_bias--;
> +	offset &= align_mask;
> +	nc->offset = offset;
> +
> +	return nc->va + offset;
> +}
> +EXPORT_SYMBOL(__page_frag_alloc_align);
> +
> +/*
> + * Frees a page fragment allocated out of either a compound or order 0 page.
> + */
> +void page_frag_free(void *addr)
> +{
> +	struct page *page = virt_to_head_page(addr);
> +
> +	if (unlikely(put_page_testzero(page)))
> +		free_unref_page(page, compound_order(page));
> +}
> +EXPORT_SYMBOL(page_frag_free);
> diff --git a/mm/page_frag_test.c b/mm/page_frag_test.c
> index 5ee3f33b756d..07748ee0a21f 100644
> --- a/mm/page_frag_test.c
> +++ b/mm/page_frag_test.c
> @@ -16,6 +16,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #define OBJPOOL_NR_OBJECT_MAX	BIT(24)
>