From: Alexander H Duyck <alexander.duyck@gmail.com>
To: Yunsheng Lin <linyunsheng@huawei.com>
Cc: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org
Subject: Re: [RFC v11 08/14] mm: page_frag: some minor refactoring before adding new API
Date: Tue, 30 Jul 2024 08:12:42 -0700 [thread overview]
Message-ID: <ad691cb4a744cbdc7da283c5c068331801482b36.camel@gmail.com> (raw)
In-Reply-To: <af06fc13-ae3f-41ca-9723-af1c8d9d051d@huawei.com>
On Tue, 2024-07-30 at 21:20 +0800, Yunsheng Lin wrote:
> On 2024/7/23 21:19, Yunsheng Lin wrote:
> > >
...
> > > The only piece that is really reused here is the pagecnt_bias
> > > assignment. What is obfuscated away is that the order is gotten
> > > through one of two paths. Really order isn't order here it is size.
> > > Which should have been fetched already. What you end up doing with
> > > this change is duplicating a bunch of code throughout the function.
> > > You end up having to fetch size multiple times multiple ways. here you
> > > are generating it with order. Then you have to turn around and get it
> > > again at the start of the function, and again after calling this
> > > function in order to pull it back out.
> >
> > I am assuming you would like to reserve old behavior as below?
> >
> > if(!encoded_va) {
> > refill:
> > __page_frag_cache_refill()
> > }
> >
> >
> > if(remaining < fragsz) {
> > if(!__page_frag_cache_recharge())
> > goto refill;
> > }
> >
> > As we are adding new APIs, are we expecting new APIs also duplicate
> > the above pattern?
> >
> > >
>
> How about something like below? __page_frag_cache_refill() and
> __page_frag_cache_reuse() does what their function name suggests
> as much as possible, __page_frag_cache_reload() is added to avoid
> new APIs duplicating similar pattern as much as possible, also
> avoid fetching size multiple times multiple ways as much as possible.
This is better. Still though with the code getting so spread out we
probably need to start adding more comments to explain things.
> static struct page *__page_frag_cache_reuse(unsigned long encoded_va,
> unsigned int pagecnt_bias)
> {
> struct page *page;
>
> page = virt_to_page((void *)encoded_va);
> if (!page_ref_sub_and_test(page, pagecnt_bias))
> return NULL;
>
> if (unlikely(encoded_page_pfmemalloc(encoded_va))) {
> VM_BUG_ON(compound_order(page) !=
> encoded_page_order(encoded_va));
This VM_BUG_ON here makes no sense. If we are going to have this
anywhere it might make more sense in the cache_refill case below to
verify we are setting the order to match when we are generating the
encoded_va.
> free_unref_page(page, encoded_page_order(encoded_va));
> return NULL;
> }
>
> /* OK, page count is 0, we can safely set it */
> set_page_count(page, PAGE_FRAG_CACHE_MAX_SIZE + 1);
> return page;
Why are you returning page here? It isn't used by any of the callers.
We are refilling the page here anyway so any caller should already have
access to the page since it wasn't changed.
> }
>
> static struct page *__page_frag_cache_refill(struct page_frag_cache *nc,
> gfp_t gfp_mask)
> {
> unsigned long order = PAGE_FRAG_CACHE_MAX_ORDER;
> struct page *page = NULL;
> gfp_t gfp = gfp_mask;
>
> #if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE)
> gfp_mask = (gfp_mask & ~__GFP_DIRECT_RECLAIM) | __GFP_COMP |
> __GFP_NOWARN | __GFP_NORETRY | __GFP_NOMEMALLOC;
> page = alloc_pages_node(NUMA_NO_NODE, gfp_mask,
> PAGE_FRAG_CACHE_MAX_ORDER);
I suspect the compliler is probably already doing this, but we should
probably not be updating gfp_mask but instead gfp since that is our
local variable.
> #endif
> if (unlikely(!page)) {
> page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
> if (unlikely(!page)) {
> memset(nc, 0, sizeof(*nc));
> return NULL;
> }
>
> order = 0;
> }
>
> nc->encoded_va = encode_aligned_va(page_address(page), order,
> page_is_pfmemalloc(page));
>
> /* Even if we own the page, we do not use atomic_set().
> * This would break get_page_unless_zero() users.
> */
> page_ref_add(page, PAGE_FRAG_CACHE_MAX_SIZE);
>
> return page;
Again, returning page here doesn't make much sense. You are better off
not exposing internals as you have essentially locked the page down for
use by the frag API so you shouldn't be handing out the page directly
to callers.
> }
>
> static struct page *__page_frag_cache_reload(struct page_frag_cache *nc,
> gfp_t gfp_mask)
> {
> struct page *page;
>
> if (likely(nc->encoded_va)) {
> page = __page_frag_cache_reuse(nc->encoded_va, nc->pagecnt_bias);
> if (page)
> goto out;
> }
>
> page = __page_frag_cache_refill(nc, gfp_mask);
> if (unlikely(!page))
> return NULL;
>
> out:
> nc->remaining = page_frag_cache_page_size(nc->encoded_va);
> nc->pagecnt_bias = PAGE_FRAG_CACHE_MAX_SIZE + 1;
> return page;
Your one current caller doesn't use the page value provided here. I
would recommend just not bothering with the page variable until you
actually need it.
> }
>
> void *__page_frag_alloc_va_align(struct page_frag_cache *nc,
> unsigned int fragsz, gfp_t gfp_mask,
> unsigned int align_mask)
> {
> unsigned long encoded_va = nc->encoded_va;
> unsigned int remaining;
>
> remaining = nc->remaining & align_mask;
> if (unlikely(remaining < fragsz)) {
You might as well swap the code paths. It would likely be much easier
to read the case where you are handling remaining >= fragsz in here
rather than having more if statements buried within the if statement.
With that you will have more room for the comment and such below.
> if (unlikely(fragsz > PAGE_SIZE)) {
> /*
> * The caller is trying to allocate a fragment
> * with fragsz > PAGE_SIZE but the cache isn't big
> * enough to satisfy the request, this may
> * happen in low memory conditions.
> * We don't release the cache page because
> * it could make memory pressure worse
> * so we simply return NULL here.
> */
> return NULL;
> }
>
> if (unlikely(!__page_frag_cache_reload(nc, gfp_mask)))
> return NULL;
This is what I am talking about in the earlier comments. You go to the
trouble of returning page through all the callers just to not use it
here. So there isn't any point in passing it through the functions.
>
> nc->pagecnt_bias--;
> nc->remaining -= fragsz;
>
> return encoded_page_address(nc->encoded_va);
> }
>
> nc->pagecnt_bias--;
> nc->remaining = remaining - fragsz;
>
> return encoded_page_address(encoded_va) +
> (page_frag_cache_page_size(encoded_va) - remaining);
Parenthesis here shouldn't be needed, addition and subtractions
operations can happen in any order with the result coming out the same.
next prev parent reply other threads:[~2024-07-30 15:12 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20240719093338.55117-1-linyunsheng@huawei.com>
2024-07-19 9:33 ` [RFC v11 01/14] mm: page_frag: add a test module for page_frag Yunsheng Lin
2024-07-21 17:34 ` Alexander Duyck
2024-07-23 13:19 ` Yunsheng Lin
2024-07-19 9:33 ` [RFC v11 02/14] mm: move the page fragment allocator from page_alloc into its own file Yunsheng Lin
2024-07-21 17:58 ` Alexander Duyck
2024-07-27 15:04 ` Yunsheng Lin
2024-07-19 9:33 ` [RFC v11 03/14] mm: page_frag: use initial zero offset for page_frag_alloc_align() Yunsheng Lin
2024-07-21 18:34 ` Alexander Duyck
2024-07-19 9:33 ` [RFC v11 04/14] mm: page_frag: add '_va' suffix to page_frag API Yunsheng Lin
[not found] ` <CAKgT0UcqELiXntRA_uD8eJGjt-OCLO64ax=YFXrCHNnaj9kD8g@mail.gmail.com>
2024-07-25 12:21 ` Yunsheng Lin
2024-07-19 9:33 ` [RFC v11 05/14] mm: page_frag: avoid caller accessing 'page_frag_cache' directly Yunsheng Lin
2024-07-21 23:01 ` Alexander H Duyck
2024-07-19 9:33 ` [RFC v11 07/14] mm: page_frag: reuse existing space for 'size' and 'pfmemalloc' Yunsheng Lin
2024-07-21 22:59 ` Alexander H Duyck
2024-07-19 9:33 ` [RFC v11 08/14] mm: page_frag: some minor refactoring before adding new API Yunsheng Lin
2024-07-21 23:40 ` Alexander H Duyck
2024-07-22 12:55 ` Yunsheng Lin
2024-07-22 15:32 ` Alexander Duyck
2024-07-23 13:19 ` Yunsheng Lin
2024-07-30 13:20 ` Yunsheng Lin
2024-07-30 15:12 ` Alexander H Duyck [this message]
2024-07-31 12:35 ` Yunsheng Lin
2024-07-31 17:02 ` Alexander H Duyck
2024-08-01 12:53 ` Yunsheng Lin
2024-07-19 9:33 ` [RFC v11 09/14] mm: page_frag: use __alloc_pages() to replace alloc_pages_node() Yunsheng Lin
2024-07-21 21:41 ` Alexander H Duyck
2024-07-24 12:54 ` Yunsheng Lin
2024-07-24 15:03 ` Alexander Duyck
2024-07-25 12:19 ` Yunsheng Lin
2024-08-14 18:34 ` Alexander H Duyck
2024-07-19 9:33 ` [RFC v11 11/14] mm: page_frag: introduce prepare/probe/commit API Yunsheng Lin
2024-07-19 9:33 ` [RFC v11 13/14] mm: page_frag: update documentation for page_frag Yunsheng Lin
[not found] ` <CAKgT0UcGvrS7=r0OCGZipzBv8RuwYtRwb2QDXqiF4qW5CNws4g@mail.gmail.com>
[not found] ` <b2001dba-a2d2-4b49-bc9f-59e175e7bba1@huawei.com>
2024-07-22 15:21 ` [RFC v11 00/14] Replace page_frag with page_frag_cache for sk_page_frag() Alexander Duyck
2024-07-23 13:17 ` Yunsheng Lin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ad691cb4a744cbdc7da283c5c068331801482b36.camel@gmail.com \
--to=alexander.duyck@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=davem@davemloft.net \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linyunsheng@huawei.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).