From: Vlastimil Babka <vbabka@suse.cz>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: bpf <bpf@vger.kernel.org>, Andrii Nakryiko <andrii@kernel.org>,
Kumar Kartikeya Dwivedi <memxor@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Peter Zijlstra <peterz@infradead.org>,
Sebastian Sewior <bigeasy@linutronix.de>,
Steven Rostedt <rostedt@goodmis.org>,
Hou Tao <houtao1@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Shakeel Butt <shakeel.butt@linux.dev>,
Michal Hocko <mhocko@suse.com>,
Matthew Wilcox <willy@infradead.org>,
Thomas Gleixner <tglx@linutronix.de>,
Jann Horn <jannh@google.com>, Tejun Heo <tj@kernel.org>,
linux-mm <linux-mm@kvack.org>, Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH bpf-next v5 2/7] mm, bpf: Introduce free_pages_nolock()
Date: Thu, 16 Jan 2025 09:31:10 +0100
Message-ID: <478b027a-ef5b-4ed7-9fe3-ad2b627111ef@suse.cz>
In-Reply-To: <CAADnVQLU5JDUq70nE4+1wGf9Uh27oPahaVXxKHPKLAm5=ptiYQ@mail.gmail.com>
On 1/16/25 00:15, Alexei Starovoitov wrote:
> On Wed, Jan 15, 2025 at 3:47 AM Vlastimil Babka <vbabka@suse.cz> wrote:
>>
>> On 1/15/25 03:17, Alexei Starovoitov wrote:
>> > From: Alexei Starovoitov <ast@kernel.org>
>> >
>> > Introduce free_pages_nolock() that can free pages without taking locks.
>> > It relies on trylock and can be called from any context.
>> > Since spin_trylock() cannot be used in RT from hard IRQ or NMI
>> > it uses lockless link list to stash the pages which will be freed
>> > by subsequent free_pages() from good context.
>> >
>> > Do not use llist unconditionally. BPF maps continuously
>> > allocate/free, so we cannot unconditionally delay the freeing to
>> > llist. When the memory becomes free make it available to the
>> > kernel and BPF users right away if possible, and fallback to
>> > llist as the last resort.
>> >
>> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
>>
>> Acked-by: Vlastimil Babka <vbabka@suse.cz>
>>
>> With:
>>
>> > @@ -4853,6 +4905,17 @@ void __free_pages(struct page *page, unsigned int order)
>> > }
>> > EXPORT_SYMBOL(__free_pages);
>> >
>> > +/*
>> > + * Can be called while holding raw_spin_lock or from IRQ and NMI,
>> > + * but only for pages that came from try_alloc_pages():
>> > + * order <= 3, !folio, etc
>>
>> I think order > 3 is fine, as !pcp_allowed_order() case is handled too?
>
> try_alloc_page() has:
>         if (!pcp_allowed_order(order))
>                 return NULL;

Ah, OK. I missed that the comment describes what pages try_alloc_pages()
produces, not what is accepted here.

> to make sure it tries pcp first.
> bpf has no use for order > 1. Even 3 is overkill,
> but it's kinda free to support order <= 3, so why not.
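
For readers following along, the order gate quoted above can be modeled in
userspace roughly as below. This is only a sketch under stated assumptions:
the model_* names are made up for illustration, the constant 3 stands in for
PAGE_ALLOC_COSTLY_ORDER, and the transparent-hugepage special case that the
real pcp_allowed_order() may also accept is left out.

```c
#include <stdbool.h>

/* Assumption: stands in for PAGE_ALLOC_COSTLY_ORDER (3 in mainline). */
#define MODEL_COSTLY_ORDER 3u

/*
 * Hypothetical userspace stand-in for pcp_allowed_order(). Simplified:
 * it ignores the THP order that the kernel helper may accept as well.
 */
static bool model_pcp_allowed_order(unsigned int order)
{
    return order <= MODEL_COSTLY_ORDER;
}

/*
 * Returns false in the cases where the quoted check would make the
 * allocator return NULL before trying anything else.
 */
static bool model_try_alloc_accepts(unsigned int order)
{
    return model_pcp_allowed_order(order);
}
```

Under this model, orders 0 through 3 pass the gate and order 4 is rejected,
which matches the "order <= 3" wording in the comment under discussion.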
>
>> And
>> what does "!folio" mean?
>
> That's what we discussed last year.
> __free_pages() has all the extra stuff if (!head) and
> support for dropping ref on the middle page.
> !folio captures this more broadly.

Aha! But in that case I realize we actually have it wrong. If the page is
order > 0, it needs to be a folio (compound page) in order to drop that
tricky !head freeing code. It is the order > 0 pages that are not compound
that are problematic and should eventually be removed from the kernel.

The solution is to add __GFP_COMP in try_alloc_pages_noprof(). This will
have no effect on the order-0 pages that BPF uses.

Instead of "!folio", the comment could then say "compound".
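
To make the !head problem concrete, here is a simplified userspace model of
the shape of that freeing logic. It is a sketch, not kernel code: struct
model_page, model_free_block() and model_free_pages() are hypothetical names,
and the refcount and compound flag are reduced to plain fields.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_page {
    int refcount;   /* reference count kept on the first page of the block */
    bool compound;  /* true if the block was allocated as a compound page */
};

/* Number of base pages handed back to the imaginary allocator so far. */
static unsigned long model_freed_pages;

/* Return one naturally aligned block of 2^order pages to the allocator. */
static void model_free_block(unsigned int order)
{
    model_freed_pages += 1UL << order;
}

/*
 * Drop one reference. If it was the last one, free the whole block.
 * Otherwise, for a non-compound block, free everything except the first
 * page piece by piece: the remaining reference holder only knows about
 * that first page and will later free it as order 0.
 */
static void model_free_pages(struct model_page *page, unsigned int order)
{
    bool head = page->compound;  /* sample before dropping the reference */

    assert(page->refcount > 0);
    if (--page->refcount == 0)
        model_free_block(order);
    else if (!head)
        while (order-- > 0)
            model_free_block(order);
}
```

In this model, a non-compound order-2 block with one extra reference gives
back 3 pages on the first call and the last page only when the other holder
drops its reference at order 0. A compound block never enters the else
branch, so a freeing routine that only ever sees compound (or order-0) pages
would not need to replicate it.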