From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: [RFC v11 08/14] mm: page_frag: some minor refactoring before adding new API
From: Alexander H Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin <linyunsheng@huawei.com>, davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Andrew Morton, linux-mm@kvack.org
Date: Sun, 21 Jul 2024 16:40:28 -0700
In-Reply-To: <20240719093338.55117-9-linyunsheng@huawei.com>
References: <20240719093338.55117-1-linyunsheng@huawei.com>
 <20240719093338.55117-9-linyunsheng@huawei.com>
On Fri, 2024-07-19 at 17:33 +0800, Yunsheng Lin wrote:
> Refactor common codes from __page_frag_alloc_va_align()
> to __page_frag_cache_refill(), so that the new API can
> make use of them.
>
> CC: Alexander Duyck <alexander.duyck@gmail.com>
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> ---
>  include/linux/page_frag_cache.h |  2 +-
>  mm/page_frag_cache.c            | 93 +++++++++++++++++----------
>  2 files changed, 49 insertions(+), 46 deletions(-)
>
> diff --git a/include/linux/page_frag_cache.h b/include/linux/page_frag_cache.h
> index 12a16f8e8ad0..5aa45de7a9a5 100644
> --- a/include/linux/page_frag_cache.h
> +++ b/include/linux/page_frag_cache.h
> @@ -50,7 +50,7 @@ static inline void *encoded_page_address(unsigned long encoded_va)
>
>  static inline void page_frag_cache_init(struct page_frag_cache *nc)
>  {
> -        nc->encoded_va = 0;
> +        memset(nc, 0, sizeof(*nc));
>  }
>

I do not like requiring the entire structure to be reset as a part of
init. If encoded_va is 0 then we have reset the page and the flags.
There shouldn't be anything else we need to reset, as remaining and
bias will be reset when we reallocate.

>  static inline bool page_frag_cache_is_pfmemalloc(struct page_frag_cache *nc)
> diff --git a/mm/page_frag_cache.c b/mm/page_frag_cache.c
> index 7928e5d50711..d9c9cad17af7 100644
> --- a/mm/page_frag_cache.c
> +++ b/mm/page_frag_cache.c
> @@ -19,6 +19,28 @@
>  #include <linux/page_frag_cache.h>
>  #include "internal.h"
>
> +static struct page *__page_frag_cache_recharge(struct page_frag_cache *nc)
> +{
> +        unsigned long encoded_va = nc->encoded_va;
> +        struct page *page;
> +
> +        page = virt_to_page((void *)encoded_va);
> +        if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
> +                return NULL;
> +
> +        if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
> +                VM_BUG_ON(compound_order(page) !=
> +                          encoded_page_order(encoded_va));
> +                free_unref_page(page, encoded_page_order(encoded_va));
> +                return NULL;
> +        }
> +
> +        /* OK, page count is 0, we can safely set it */
> +        set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
> +
> +        return page;
> +}
> +
>  static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>                                               gfp_t gfp_mask)
>  {
> @@ -26,6 +48,14 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>          struct page *page = NULL;
>          gfp_t gfp = gfp_mask;
>
> +        if (likely(nc->encoded_va)) {
> +                page = __page_frag_cache_recharge(nc);
> +                if (page) {
> +                        order = encoded_page_order(nc->encoded_va);
> +                        goto out;
> +                }
> +        }
> +

This code has no business here. This is refill; you just dropped
recharge into it, which will make a complete mess of the ordering and
be confusing to say the least. The expectation was that if we are
calling this function it is going to zero out the virtual address on
failure, so we discard the old page if there is one present. This
changes that behaviour. What you effectively did is make
__page_frag_cache_refill into the recharge function.
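If the goal is for the new API to be able to reuse both paths, I would
much rather see that done in a separate wrapper so refill keeps its
current contract. Roughly something like the sketch below, reusing the
helper names from your patch (__page_frag_cache_reload is a made-up
name, and none of this is compile-tested):

static struct page *__page_frag_cache_reload(struct page_frag_cache *nc,
                                             gfp_t gfp_mask)
{
        struct page *page;

        if (likely(nc->encoded_va)) {
                /* try to reclaim sole ownership of the current page */
                page = __page_frag_cache_recharge(nc);
                if (page) {
                        /* same page, same order: just reset the counts */
                        nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
                        nc->remaining = page_frag_cache_page_size(nc->encoded_va);
                        return page;
                }
        }

        /*
         * The old page is gone or was never there; refill keeps its
         * existing behaviour of clearing the cache on failure.
         */
        return __page_frag_cache_refill(nc, gfp_mask);
}

That way recharge and refill each keep doing one thing, and the
ordering between them is decided in exactly one place.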
>  #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
>          gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
>                     __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> @@ -35,7 +65,7 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>          if (unlikely(!page)) {
>                  page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
>                  if (unlikely(!page)) {
> -                        nc->encoded_va = 0;
> +                        memset(nc, 0, sizeof(*nc));
>                          return NULL;
>                  }
>

The memset will take a few more instructions than the existing code
did. I would prefer to keep this as is if at all possible.

> @@ -45,6 +75,16 @@ static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
>          nc->encoded_va = encode_aligned_va(page_address(page), order,
>                                             page_is_pfmemalloc(page));
>
> +        /* Even if we own the page, we do not use atomic_set().
> +         * This would break get_page_unless_zero() users.
> +         */
> +        page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> +
> +out:
> +        /* reset page count bias and remaining to start of new frag */
> +        nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> +        nc->remaining = PAGE_SIZE << order;
> +
>          return page;
>  }
>

Why bother returning a page at all? It doesn't seem like you use it
anymore. The use cases you have for it in patches 11 and 12 all appear
to be broken from what I can tell: you are adding page as a variable
when we don't need to be passing internal details to the callers of
the function, and a simple error return code would do.

> @@ -55,7 +95,7 @@ void page_frag_cache_drain(struct page_frag_cache *nc)
>
>          __page_frag_cache_drain(virt_to_head_page((void *)nc->encoded_va),
>                                  nc->pagecnt_bias);
> -        nc->encoded_va = 0;
> +        memset(nc, 0, sizeof(*nc));
>  }
>  EXPORT_SYMBOL(page_frag_cache_drain);
>
> @@ -72,31 +112,9 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
>                                   unsigned int fragsz, gfp_t gfp_mask,
>                                   unsigned int align_mask)
>  {
> -        unsigned long encoded_va = nc->encoded_va;
> -        unsigned int size, remaining;
> -        struct page *page;
> -
> -        if (unlikely(!encoded_va)) {
> -refill:
> -                page = __page_frag_cache_refill(nc, gfp_mask);
> -                if (!page)
> -                        return NULL;
> -
> -                encoded_va = nc->encoded_va;
> -                size = page_frag_cache_page_size(encoded_va);
> +        unsigned int size = page_frag_cache_page_size(nc->encoded_va);
> +        unsigned int remaining = nc->remaining & align_mask;
>
> -                /* Even if we own the page, we do not use atomic_set().
> -                 * This would break get_page_unless_zero() users.
> -                 */
> -                page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
> -
> -                /* reset page count bias and remaining to start of new frag */
> -                nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> -                nc->remaining = size;
> -        }
> -
> -        size = page_frag_cache_page_size(encoded_va);
> -        remaining = nc->remaining & align_mask;
>          if (unlikely(remaining < fragsz)) {

I am not a fan of adding a dependency on remaining being set *before*
encoded_va. The fact is, remaining relies on size for its value. In
addition, this creates a big blob of code that the conditional paths
have to jump over. I think it is much better to first validate
encoded_va and then validate remaining. Otherwise, just checking
remaining seems problematic and like a recipe for NULL pointer
accesses.
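To make the ordering I am asking for concrete, here is a rough sketch
of the allocator shape I would expect, with encoded_va validated
before size or remaining are derived from it. It leans on the
__page_frag_cache_reload() helper sketched earlier and, again, is not
compile-tested:

void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
                                 unsigned int fragsz, gfp_t gfp_mask,
                                 unsigned int align_mask)
{
        unsigned int size, remaining;

        /* validate the page before trusting anything derived from it */
        if (unlikely(!nc->encoded_va) &&
            unlikely(!__page_frag_cache_refill(nc, gfp_mask)))
                return NULL;

        size = page_frag_cache_page_size(nc->encoded_va);
        remaining = nc->remaining & align_mask;
        if (unlikely(remaining < fragsz)) {
                if (unlikely(fragsz > PAGE_SIZE))
                        return NULL;

                /* recharge or refill, whichever applies */
                if (unlikely(!__page_frag_cache_reload(nc, gfp_mask)))
                        return NULL;

                /* a fresh page may have a different order */
                size = page_frag_cache_page_size(nc->encoded_va);
                remaining = size;
        }

        nc->pagecnt_bias--;
        nc->remaining = remaining - fragsz;

        return encoded_page_address(nc->encoded_va) + (size - remaining);
}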
>                  if (unlikely(fragsz > PAGE_SIZE)) {
>                          /*
> @@ -111,32 +129,17 @@ void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
>                          return NULL;
>                  }
>
> -                page = virt_to_page((void *)encoded_va);
> -
> -                if (!page_ref_sub_and_test(page, nc->pagecnt_bias))
> -                        goto refill;
> -
> -                if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
> -                        VM_BUG_ON(compound_order(page) !=
> -                                  encoded_page_order(encoded_va));
> -                        free_unref_page(page, encoded_page_order(encoded_va));
> -                        goto refill;
> -                }
> -
> -                /* OK, page count is 0, we can safely set it */
> -                set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
> -
> -                /* reset page count bias and remaining to start of new frag */
> -                nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> -                nc->remaining = size;
> +                if (unlikely(!__page_frag_cache_refill(nc, gfp_mask)))
> +                        return NULL;
>
> +                size = page_frag_cache_page_size(nc->encoded_va);

So this is adding yet another setting/reading of size to the recharge
path now. Previously the recharge path could just reuse the existing
size. A sketch of what I mean follows at the end of this reply.

>                  remaining = size;
>          }
>
>          nc->pagecnt_bias--;
>          nc->remaining = remaining - fragsz;
>
> -        return encoded_page_address(encoded_va) + (size - remaining);
> +        return encoded_page_address(nc->encoded_va) + (size - remaining);
>  }
>  EXPORT_SYMBOL(__page_frag_alloc_va_align);
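To close the loop on the size point: the reload() sketch earlier still
re-reads size after every trip through the slow path. If we want the
recharge case to keep reusing the existing size, the slow path can
open-code the distinction between the two cases. A rough sketch of
that hunk, with the same caveats as before (hypothetical helper split,
not compile-tested):

        if (unlikely(remaining < fragsz)) {
                if (unlikely(fragsz > PAGE_SIZE))
                        return NULL;

                if (nc->encoded_va && __page_frag_cache_recharge(nc)) {
                        /*
                         * Recharge reclaimed the same page, so the
                         * size computed on entry is still valid.
                         */
                        nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
                } else {
                        if (unlikely(!__page_frag_cache_refill(nc, gfp_mask)))
                                return NULL;

                        /* a brand-new page may have a different order */
                        size = page_frag_cache_page_size(nc->encoded_va);
                }

                remaining = size;
        }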