From: Jakub Kicinski <kuba@kernel.org>
To: Jesper Dangaard Brouer <jbrouer@redhat.com>
Cc: "Toke Høiland-Jørgensen" <toke@redhat.com>,
brouer@redhat.com,
"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
netdev@vger.kernel.org, "Eric Dumazet" <eric.dumazet@gmail.com>,
linux-mm@kvack.org, "Mel Gorman" <mgorman@techsingularity.net>,
lorenzo@kernel.org, linyunsheng@huawei.com, bpf@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
"Paolo Abeni" <pabeni@redhat.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
willy@infradead.org
Subject: Re: [PATCH RFC net-next/mm V3 1/2] page_pool: Remove workqueue in new shutdown scheme
Date: Wed, 3 May 2023 18:47:02 -0700 [thread overview]
Message-ID: <20230503184702.65dceb90@kernel.org> (raw)
In-Reply-To: <3a5a28c4-01a3-793c-6969-475aba3ff3b5@redhat.com>
On Wed, 3 May 2023 17:49:34 +0200 Jesper Dangaard Brouer wrote:
> On 03/05/2023 13.18, Toke Høiland-Jørgensen wrote:
> > Jakub Kicinski <kuba@kernel.org> writes:
> >> We can remove the warning without removing the entire delayed freeing
> >> scheme. I definitely like the SHUTDOWN flag and patch 2 but I'm a bit
> >> less clear on why the complexity of datapath freeing is justified.
> >> Can you explain?
> >
> > You mean just let the workqueue keep rescheduling itself every minute
> > for the (potentially) hours that skbs will stick around? Seems a bit
> > wasteful, doesn't it? :)
>
> I agree that this workqueue that keeps rescheduling is wasteful.
> It actually reschedules every second, even more wasteful.
> NIC drivers will have many HW RX-queues, with separate PP instances,
> that each can start a workqueue that resched every sec.
There's a lot of work items flying around on a working system.
I don't think the rare (and very cheap) PP check work is going
to move the needle. You should see how many work items DIM schedules :(
I'd think that potentially having the extra memory pinned is much more
of an issue than a periodic check, and that does not go away by
changing the checking mechanism.
> Eric have convinced me that SKBs can "stick around" for longer than the
> assumptions in PP. The old PP assumptions came from XDP-return path.
> It is time to cleanup.
I see. Regardless - should we have some count/statistic for the number
of "zombie" page pools sitting around in the system? Could be useful
for debug.
> > We did see an issue where creating and tearing down lots of page pools
> > in a short period of time caused significant slowdowns due to the
> > workqueue mechanism. Lots being "thousands per second". This is possible
> > using the live packet mode of bpf_prog_run() for XDP, which will setup
> > and destroy a page pool for each syscall...
>
> Yes, the XDP live packet mode of bpf_prog_run is IMHO abusing the
> page_pool API. We should fix that somehow, at least the case where live
> packet mode is only injecting a single packet, but still creates a PP
> instance. The PP in live packet mode IMHO only makes sense when
> repeatedly sending packets that gets recycles and are pre-inited by PP.
+1, FWIW, I was surprised that we have a init_callback which sits in
the fastpath exists purely for testing.
> This use of PP does exemplify why is it problematic to keep the workqueue.
>
> I have considered (and could be convinced) delaying the free via
> call_rcu, but it also create an unfortunate backlog of work in the case
> of live packet mode of bpf_prog_run.
Maybe let the pp used by BPF testing be reusable? No real driver will
create thousands of PPs a seconds, that's not sane.
Anyway, you'll choose what you'll choose. I just wanted to cast my vote
for the work queue rather than the tricky lockless release code.
next prev parent reply other threads:[~2023-05-04 1:47 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-28 16:16 [PATCH RFC net-next/mm V3 0/2] page_pool: new approach for leak detection and shutdown phase Jesper Dangaard Brouer
2023-04-28 16:16 ` [PATCH RFC net-next/mm V3 1/2] page_pool: Remove workqueue in new shutdown scheme Jesper Dangaard Brouer
2023-04-28 21:38 ` Toke Høiland-Jørgensen
2023-05-03 15:21 ` Jesper Dangaard Brouer
2023-05-03 2:33 ` Jakub Kicinski
2023-05-03 11:18 ` Toke Høiland-Jørgensen
2023-05-03 15:49 ` Jesper Dangaard Brouer
2023-05-04 1:47 ` Jakub Kicinski [this message]
2023-05-04 2:42 ` Yunsheng Lin
2023-05-04 13:48 ` Jesper Dangaard Brouer
2023-05-05 0:54 ` Yunsheng Lin
2023-05-06 13:11 ` Yunsheng Lin
2023-04-28 16:16 ` [PATCH RFC net-next/mm V3 2/2] mm/page_pool: catch page_pool memory leaks Jesper Dangaard Brouer
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230503184702.65dceb90@kernel.org \
--to=kuba@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=bpf@vger.kernel.org \
--cc=brouer@redhat.com \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=ilias.apalodimas@linaro.org \
--cc=jbrouer@redhat.com \
--cc=linux-mm@kvack.org \
--cc=linyunsheng@huawei.com \
--cc=lorenzo@kernel.org \
--cc=mgorman@techsingularity.net \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=toke@redhat.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.