* [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
@ 2025-02-10 13:09 Jason Xing
2025-02-12 2:37 ` Mina Almasry
0 siblings, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-10 13:09 UTC (permalink / raw)
To: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms
Cc: netdev, Jason Xing
If the buggy driver causes the inflight less than 0 [1] and warns
us in page_pool_inflight(), it means we should not expect the
whole page_pool to get back to work normally.
We noticed the kworker is waken up repeatedly and infinitely[1]
in production. If the page pool detect the error happening,
probably letting it go is a better way and do not flood the
var log messages. This patch mitigates the adverse effect.
[1]
[Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
[Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
...
[Mon Feb 10 20:36:11 2025] Call Trace:
[Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
[Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
[Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
[Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
[Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
[Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
[Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
[Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
---
net/core/page_pool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 1c6fec08bc43..8e9f5801aabb 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
page_pool_disable_direct_recycling(pool);
page_pool_free_frag(pool);
- if (!page_pool_release(pool))
+ if (page_pool_release(pool) <= 0)
return;
page_pool_detached(pool);
--
2.43.5
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-10 13:09 [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker Jason Xing
@ 2025-02-12 2:37 ` Mina Almasry
2025-02-12 2:46 ` Jakub Kicinski
2025-02-12 3:14 ` Jason Xing
0 siblings, 2 replies; 15+ messages in thread
From: Mina Almasry @ 2025-02-12 2:37 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> If the buggy driver causes the inflight less than 0 [1] and warns
How does a buggy driver trigger this?
> us in page_pool_inflight(), it means we should not expect the
> whole page_pool to get back to work normally.
>
> We noticed the kworker is waken up repeatedly and infinitely[1]
> in production. If the page pool detect the error happening,
> probably letting it go is a better way and do not flood the
> var log messages. This patch mitigates the adverse effect.
>
> [1]
> [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> ...
> [Mon Feb 10 20:36:11 2025] Call Trace:
> [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
>
> Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> ---
> net/core/page_pool.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 1c6fec08bc43..8e9f5801aabb 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> page_pool_disable_direct_recycling(pool);
> page_pool_free_frag(pool);
>
> - if (!page_pool_release(pool))
> + if (page_pool_release(pool) <= 0)
> return;
Isn't it the condition in page_pool_release_retry() that you want. to
modify? That is the one that handles whether the worker keeps spinning
no?
I also wonder also whether if the check in page_pool_release() itself
needs to be:
if (inflight < 0)
__page_pool_destroy();
otherwise the pool will never be destroyed no?
--
Thanks,
Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 2:37 ` Mina Almasry
@ 2025-02-12 2:46 ` Jakub Kicinski
2025-02-12 3:20 ` Jason Xing
2025-02-12 3:14 ` Jason Xing
1 sibling, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-12 2:46 UTC (permalink / raw)
To: Mina Almasry
Cc: Jason Xing, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Tue, 11 Feb 2025 18:37:22 -0800 Mina Almasry wrote:
> Isn't it the condition in page_pool_release_retry() that you want. to
> modify? That is the one that handles whether the worker keeps spinning
> no?
+1
A code comment may be useful BTW.
> I also wonder also whether if the check in page_pool_release() itself
> needs to be:
>
> if (inflight < 0)
> __page_pool_destroy();
>
> otherwise the pool will never be destroyed no?
It's probably safer to leak the memory than risk a crash if
we undercounted and some page will try to return itself later?
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 2:37 ` Mina Almasry
2025-02-12 2:46 ` Jakub Kicinski
@ 2025-02-12 3:14 ` Jason Xing
2025-02-12 19:24 ` Mina Almasry
1 sibling, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-12 3:14 UTC (permalink / raw)
To: Mina Almasry
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Wed, Feb 12, 2025 at 10:37 AM Mina Almasry <almasrymina@google.com> wrote:
>
> On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > If the buggy driver causes the inflight less than 0 [1] and warns
>
> How does a buggy driver trigger this?
We're still reproducing and investigating. With a certain version of
driver + XDP installed, we have a very slight chance to see this
happening.
>
> > us in page_pool_inflight(), it means we should not expect the
> > whole page_pool to get back to work normally.
> >
> > We noticed the kworker is waken up repeatedly and infinitely[1]
> > in production. If the page pool detect the error happening,
> > probably letting it go is a better way and do not flood the
> > var log messages. This patch mitigates the adverse effect.
> >
> > [1]
> > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > ...
> > [Mon Feb 10 20:36:11 2025] Call Trace:
> > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> >
> > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > ---
> > net/core/page_pool.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 1c6fec08bc43..8e9f5801aabb 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> > page_pool_disable_direct_recycling(pool);
> > page_pool_free_frag(pool);
> >
> > - if (!page_pool_release(pool))
> > + if (page_pool_release(pool) <= 0)
> > return;
>
> Isn't it the condition in page_pool_release_retry() that you want. to
> modify? That is the one that handles whether the worker keeps spinning
> no?
Right, do you mean this patch?
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 8e9f5801aabb..7dde3bd5f275 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -1112,7 +1112,7 @@ static void page_pool_release_retry(struct
work_struct *wq)
int inflight;
inflight = page_pool_release(pool);
- if (!inflight)
+ if (inflight < 0)
return;
It has the same behaviour as the current patch does. I thought we
could stop it earlier.
>
> I also wonder also whether if the check in page_pool_release() itself
> needs to be:
>
> if (inflight < 0)
> __page_pool_destroy();
>
> otherwise the pool will never be destroyed no?
I'm worried this would have a more severe impact because it's
uncertain if in this case the page pool can be released? :(
Thanks,
Jason
>
> --
> Thanks,
> Mina
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 2:46 ` Jakub Kicinski
@ 2025-02-12 3:20 ` Jason Xing
2025-02-12 3:43 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-12 3:20 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Mina Almasry, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Wed, Feb 12, 2025 at 10:46 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 11 Feb 2025 18:37:22 -0800 Mina Almasry wrote:
> > Isn't it the condition in page_pool_release_retry() that you want. to
> > modify? That is the one that handles whether the worker keeps spinning
> > no?
>
> +1
>
> A code comment may be useful BTW.
I will add it in the next version. Yes, my intention is to avoid
initializing the delayed work since we don't expect the worker in
page_pool_release_retry() to try over and over again.
>
> > I also wonder also whether if the check in page_pool_release() itself
> > needs to be:
> >
> > if (inflight < 0)
> > __page_pool_destroy();
> >
> > otherwise the pool will never be destroyed no?
>
> It's probably safer to leak the memory than risk a crash if
> we undercounted and some page will try to return itself later?
It's also what I'm concerned about.
Thanks,
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 3:20 ` Jason Xing
@ 2025-02-12 3:43 ` Jakub Kicinski
2025-02-12 4:38 ` Jason Xing
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-12 3:43 UTC (permalink / raw)
To: Jason Xing
Cc: Mina Almasry, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Wed, 12 Feb 2025 11:20:16 +0800 Jason Xing wrote:
> On Wed, Feb 12, 2025 at 10:46 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 11 Feb 2025 18:37:22 -0800 Mina Almasry wrote:
> > > Isn't it the condition in page_pool_release_retry() that you want. to
> > > modify? That is the one that handles whether the worker keeps spinning
> > > no?
> >
> > +1
> >
> > A code comment may be useful BTW.
>
> I will add it in the next version. Yes, my intention is to avoid
> initializing the delayed work since we don't expect the worker in
> page_pool_release_retry() to try over and over again.
Initializing a work isn't much cost, is it?
Just to state the obvious the current patch will not catch the
situation when there is traffic outstanding (inflight is positive)
at the time of detach from the driver. But then the inflight goes
negative before the work / time kicks in.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 3:43 ` Jakub Kicinski
@ 2025-02-12 4:38 ` Jason Xing
2025-02-12 18:53 ` Jakub Kicinski
0 siblings, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-12 4:38 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Mina Almasry, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Wed, Feb 12, 2025 at 11:43 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 12 Feb 2025 11:20:16 +0800 Jason Xing wrote:
> > On Wed, Feb 12, 2025 at 10:46 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Tue, 11 Feb 2025 18:37:22 -0800 Mina Almasry wrote:
> > > > Isn't it the condition in page_pool_release_retry() that you want. to
> > > > modify? That is the one that handles whether the worker keeps spinning
> > > > no?
> > >
> > > +1
> > >
> > > A code comment may be useful BTW.
> >
> > I will add it in the next version. Yes, my intention is to avoid
> > initializing the delayed work since we don't expect the worker in
> > page_pool_release_retry() to try over and over again.
>
> Initializing a work isn't much cost, is it?
Not that much, but it's pointless to start a kworker under this
circumstance, right? And it will flood the dmesg.
>
> Just to state the obvious the current patch will not catch the
> situation when there is traffic outstanding (inflight is positive)
> at the time of detach from the driver. But then the inflight goes
> negative before the work / time kicks in.
Right, only mitigating the side effect. I will add this statement as
well while keeping the code itself as-is.
Thanks,
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 4:38 ` Jason Xing
@ 2025-02-12 18:53 ` Jakub Kicinski
2025-02-12 23:37 ` Jason Xing
0 siblings, 1 reply; 15+ messages in thread
From: Jakub Kicinski @ 2025-02-12 18:53 UTC (permalink / raw)
To: Jason Xing
Cc: Mina Almasry, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Wed, 12 Feb 2025 12:38:28 +0800 Jason Xing wrote:
> > Initializing a work isn't much cost, is it?
>
> Not that much, but it's pointless to start a kworker under this
> circumstance, right? And it will flood the dmesg.
There's a seriously buggy driver potentially corrupting memory,
who cares if we start a kworker. Please don't complicate the
code for extremely rare scenarios.
> > Just to state the obvious the current patch will not catch the
> > situation when there is traffic outstanding (inflight is positive)
> > at the time of detach from the driver. But then the inflight goes
> > negative before the work / time kicks in.
>
> Right, only mitigating the side effect. I will add this statement as
> well while keeping the code itself as-is.
What do you mean by that?! We're telling you your code is wrong.
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 3:14 ` Jason Xing
@ 2025-02-12 19:24 ` Mina Almasry
2025-02-12 23:38 ` Jason Xing
0 siblings, 1 reply; 15+ messages in thread
From: Mina Almasry @ 2025-02-12 19:24 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Tue, Feb 11, 2025 at 7:14 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Wed, Feb 12, 2025 at 10:37 AM Mina Almasry <almasrymina@google.com> wrote:
> >
> > On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > If the buggy driver causes the inflight less than 0 [1] and warns
> >
> > How does a buggy driver trigger this?
>
> We're still reproducing and investigating. With a certain version of
> driver + XDP installed, we have a very slight chance to see this
> happening.
>
> >
> > > us in page_pool_inflight(), it means we should not expect the
> > > whole page_pool to get back to work normally.
> > >
> > > We noticed the kworker is waken up repeatedly and infinitely[1]
> > > in production. If the page pool detect the error happening,
> > > probably letting it go is a better way and do not flood the
> > > var log messages. This patch mitigates the adverse effect.
> > >
> > > [1]
> > > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > > ...
> > > [Mon Feb 10 20:36:11 2025] Call Trace:
> > > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> > >
> > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > ---
> > > net/core/page_pool.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 1c6fec08bc43..8e9f5801aabb 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> > > page_pool_disable_direct_recycling(pool);
> > > page_pool_free_frag(pool);
> > >
> > > - if (!page_pool_release(pool))
> > > + if (page_pool_release(pool) <= 0)
> > > return;
> >
> > Isn't it the condition in page_pool_release_retry() that you want. to
> > modify? That is the one that handles whether the worker keeps spinning
> > no?
>
> Right, do you mean this patch?
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 8e9f5801aabb..7dde3bd5f275 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1112,7 +1112,7 @@ static void page_pool_release_retry(struct
> work_struct *wq)
> int inflight;
>
> inflight = page_pool_release(pool);
> - if (!inflight)
> + if (inflight < 0)
> return;
>
> It has the same behaviour as the current patch does. I thought we
> could stop it earlier.
>
Yes I mean this.
> >
> > I also wonder also whether if the check in page_pool_release() itself
> > needs to be:
> >
> > if (inflight < 0)
> > __page_pool_destroy();
> >
> > otherwise the pool will never be destroyed no?
>
> I'm worried this would have a more severe impact because it's
> uncertain if in this case the page pool can be released? :(
>
Makes sense indeed. We can't be sure if the page_pool has already been
destroyed if inflight < 0. Ignore the earlier suggestion from me,
thanks.
--
Thanks,
Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 18:53 ` Jakub Kicinski
@ 2025-02-12 23:37 ` Jason Xing
0 siblings, 0 replies; 15+ messages in thread
From: Jason Xing @ 2025-02-12 23:37 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Mina Almasry, davem, edumazet, pabeni, hawk, ilias.apalodimas,
horms, netdev
On Thu, Feb 13, 2025 at 2:53 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 12 Feb 2025 12:38:28 +0800 Jason Xing wrote:
> > > Initializing a work isn't much cost, is it?
> >
> > Not that much, but it's pointless to start a kworker under this
> > circumstance, right? And it will flood the dmesg.
>
> There's a seriously buggy driver potentially corrupting memory,
> who cares if we start a kworker. Please don't complicate the
> code for extremely rare scenarios.
My points are:
1) stop the kworker because it's useless.
2) avoid flooding so many warnings and calltraces in the dmesg
The modified code is not complicated.
>
> > > Just to state the obvious the current patch will not catch the
> > > situation when there is traffic outstanding (inflight is positive)
> > > at the time of detach from the driver. But then the inflight goes
> > > negative before the work / time kicks in.
> >
> > Right, only mitigating the side effect. I will add this statement as
> > well while keeping the code itself as-is.
>
> What do you mean by that?! We're telling you your code is wrong.
Sorry, I misunderstood your suggestion. So the patch I replied
yesterday to Mina seems acceptable?
Thanks,
Jason
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 19:24 ` Mina Almasry
@ 2025-02-12 23:38 ` Jason Xing
2025-02-13 0:38 ` Mina Almasry
0 siblings, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-12 23:38 UTC (permalink / raw)
To: Mina Almasry
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Thu, Feb 13, 2025 at 3:24 AM Mina Almasry <almasrymina@google.com> wrote:
>
> On Tue, Feb 11, 2025 at 7:14 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Wed, Feb 12, 2025 at 10:37 AM Mina Almasry <almasrymina@google.com> wrote:
> > >
> > > On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > >
> > > > If the buggy driver causes the inflight less than 0 [1] and warns
> > >
> > > How does a buggy driver trigger this?
> >
> > We're still reproducing and investigating. With a certain version of
> > driver + XDP installed, we have a very slight chance to see this
> > happening.
> >
> > >
> > > > us in page_pool_inflight(), it means we should not expect the
> > > > whole page_pool to get back to work normally.
> > > >
> > > > We noticed the kworker is waken up repeatedly and infinitely[1]
> > > > in production. If the page pool detect the error happening,
> > > > probably letting it go is a better way and do not flood the
> > > > var log messages. This patch mitigates the adverse effect.
> > > >
> > > > [1]
> > > > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > > > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > > > ...
> > > > [Mon Feb 10 20:36:11 2025] Call Trace:
> > > > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > > > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > > > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > > > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > > > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > > > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > > > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > > > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> > > >
> > > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > > ---
> > > > net/core/page_pool.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > index 1c6fec08bc43..8e9f5801aabb 100644
> > > > --- a/net/core/page_pool.c
> > > > +++ b/net/core/page_pool.c
> > > > @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> > > > page_pool_disable_direct_recycling(pool);
> > > > page_pool_free_frag(pool);
> > > >
> > > > - if (!page_pool_release(pool))
> > > > + if (page_pool_release(pool) <= 0)
> > > > return;
> > >
> > > Isn't it the condition in page_pool_release_retry() that you want. to
> > > modify? That is the one that handles whether the worker keeps spinning
> > > no?
> >
> > Right, do you mean this patch?
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 8e9f5801aabb..7dde3bd5f275 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -1112,7 +1112,7 @@ static void page_pool_release_retry(struct
> > work_struct *wq)
> > int inflight;
> >
> > inflight = page_pool_release(pool);
> > - if (!inflight)
> > + if (inflight < 0)
> > return;
> >
> > It has the same behaviour as the current patch does. I thought we
> > could stop it earlier.
> >
>
> Yes I mean this.
I'm going to post the above diff patch in V2. Am I understanding right?
>
> > >
> > > I also wonder also whether if the check in page_pool_release() itself
> > > needs to be:
> > >
> > > if (inflight < 0)
> > > __page_pool_destroy();
> > >
> > > otherwise the pool will never be destroyed no?
> >
> > I'm worried this would have a more severe impact because it's
> > uncertain if in this case the page pool can be released? :(
> >
>
> Makes sense indeed. We can't be sure if the page_pool has already been
> destroyed if inflight < 0. Ignore the earlier suggestion from me,
> thanks.
Thanks for the review.
Thanks,
Jason
>
> --
> Thanks,
> Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-12 23:38 ` Jason Xing
@ 2025-02-13 0:38 ` Mina Almasry
2025-02-13 0:43 ` Jason Xing
0 siblings, 1 reply; 15+ messages in thread
From: Mina Almasry @ 2025-02-13 0:38 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Wed, Feb 12, 2025 at 3:39 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Thu, Feb 13, 2025 at 3:24 AM Mina Almasry <almasrymina@google.com> wrote:
> >
> > On Tue, Feb 11, 2025 at 7:14 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > On Wed, Feb 12, 2025 at 10:37 AM Mina Almasry <almasrymina@google.com> wrote:
> > > >
> > > > On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > > >
> > > > > If the buggy driver causes the inflight less than 0 [1] and warns
> > > >
> > > > How does a buggy driver trigger this?
> > >
> > > We're still reproducing and investigating. With a certain version of
> > > driver + XDP installed, we have a very slight chance to see this
> > > happening.
> > >
> > > >
> > > > > us in page_pool_inflight(), it means we should not expect the
> > > > > whole page_pool to get back to work normally.
> > > > >
> > > > > We noticed the kworker is waken up repeatedly and infinitely[1]
> > > > > in production. If the page pool detect the error happening,
> > > > > probably letting it go is a better way and do not flood the
> > > > > var log messages. This patch mitigates the adverse effect.
> > > > >
> > > > > [1]
> > > > > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > > > > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > > > > ...
> > > > > [Mon Feb 10 20:36:11 2025] Call Trace:
> > > > > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > > > > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > > > > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > > > > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > > > > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > > > > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > > > > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > > > > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> > > > >
> > > > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > > > ---
> > > > > net/core/page_pool.c | 2 +-
> > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > index 1c6fec08bc43..8e9f5801aabb 100644
> > > > > --- a/net/core/page_pool.c
> > > > > +++ b/net/core/page_pool.c
> > > > > @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> > > > > page_pool_disable_direct_recycling(pool);
> > > > > page_pool_free_frag(pool);
> > > > >
> > > > > - if (!page_pool_release(pool))
> > > > > + if (page_pool_release(pool) <= 0)
> > > > > return;
> > > >
> > > > Isn't it the condition in page_pool_release_retry() that you want. to
> > > > modify? That is the one that handles whether the worker keeps spinning
> > > > no?
> > >
> > > Right, do you mean this patch?
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 8e9f5801aabb..7dde3bd5f275 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -1112,7 +1112,7 @@ static void page_pool_release_retry(struct
> > > work_struct *wq)
> > > int inflight;
> > >
> > > inflight = page_pool_release(pool);
> > > - if (!inflight)
> > > + if (inflight < 0)
> > > return;
> > >
> > > It has the same behaviour as the current patch does. I thought we
> > > could stop it earlier.
> > >
> >
> > Yes I mean this.
>
> I'm going to post the above diff patch in V2. Am I understanding right?
>
Please also add Jakub's request, i.e. a code comment indicating why we
return early.
Also, now that I look more closely, we want to make sure we get at
least one warning when inflight goes negative, so, maybe something
like (untested, may need some iteration):
```
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 2ea8041aba7e..6d62ea45571b 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -1113,13 +1113,12 @@ static void page_pool_release_retry(struct
work_struct *wq)
int inflight;
inflight = page_pool_release(pool);
- if (!inflight)
- return;
/* Periodic warning for page pools the user can't see */
netdev = READ_ONCE(pool->slow.netdev);
if (time_after_eq(jiffies, pool->defer_warn) &&
- (!netdev || netdev == NET_PTR_POISON)) {
+ (!netdev || netdev == NET_PTR_POISON) &&
+ inflight != 0) {
int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ;
pr_warn("%s() stalled pool shutdown: id %u, %d
inflight %d sec devmem=%d\n",
@@ -1128,7 +1127,15 @@ static void page_pool_release_retry(struct
work_struct *wq)
pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
}
- /* Still not ready to be disconnected, retry later */
+ /* In rare cases, a driver bug may cause inflight to go negative. Don't
+ * reschedule release if inflight is 0 or negative.
+ * - If 0, the page_pool has been destroyed
+ * - if negative, we will never recover
+ * in both cases no reschedule necessary.
+ */
+ if (inflight < 1)
+ return;
+
schedule_delayed_work(&pool->release_dw, DEFER_TIME);
}
```
--
Thanks,
Mina
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-13 0:38 ` Mina Almasry
@ 2025-02-13 0:43 ` Jason Xing
2025-02-13 0:51 ` Mina Almasry
0 siblings, 1 reply; 15+ messages in thread
From: Jason Xing @ 2025-02-13 0:43 UTC (permalink / raw)
To: Mina Almasry
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Thu, Feb 13, 2025 at 8:38 AM Mina Almasry <almasrymina@google.com> wrote:
>
> On Wed, Feb 12, 2025 at 3:39 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > On Thu, Feb 13, 2025 at 3:24 AM Mina Almasry <almasrymina@google.com> wrote:
> > >
> > > On Tue, Feb 11, 2025 at 7:14 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > >
> > > > On Wed, Feb 12, 2025 at 10:37 AM Mina Almasry <almasrymina@google.com> wrote:
> > > > >
> > > > > On Mon, Feb 10, 2025 at 5:10 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > > > > >
> > > > > > If the buggy driver causes the inflight less than 0 [1] and warns
> > > > >
> > > > > How does a buggy driver trigger this?
> > > >
> > > > We're still reproducing and investigating. With a certain version of
> > > > driver + XDP installed, we have a very slight chance to see this
> > > > happening.
> > > >
> > > > >
> > > > > > us in page_pool_inflight(), it means we should not expect the
> > > > > > whole page_pool to get back to work normally.
> > > > > >
> > > > > > We noticed the kworker is waken up repeatedly and infinitely[1]
> > > > > > in production. If the page pool detect the error happening,
> > > > > > probably letting it go is a better way and do not flood the
> > > > > > var log messages. This patch mitigates the adverse effect.
> > > > > >
> > > > > > [1]
> > > > > > [Mon Feb 10 20:36:11 2025] ------------[ cut here ]------------
> > > > > > [Mon Feb 10 20:36:11 2025] Negative(-51446) inflight packet-pages
> > > > > > ...
> > > > > > [Mon Feb 10 20:36:11 2025] Call Trace:
> > > > > > [Mon Feb 10 20:36:11 2025] page_pool_release_retry+0x23/0x70
> > > > > > [Mon Feb 10 20:36:11 2025] process_one_work+0x1b1/0x370
> > > > > > [Mon Feb 10 20:36:11 2025] worker_thread+0x37/0x3a0
> > > > > > [Mon Feb 10 20:36:11 2025] kthread+0x11a/0x140
> > > > > > [Mon Feb 10 20:36:11 2025] ? process_one_work+0x370/0x370
> > > > > > [Mon Feb 10 20:36:11 2025] ? __kthread_cancel_work+0x40/0x40
> > > > > > [Mon Feb 10 20:36:11 2025] ret_from_fork+0x35/0x40
> > > > > > [Mon Feb 10 20:36:11 2025] ---[ end trace ebffe800f33e7e34 ]---
> > > > > >
> > > > > > Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
> > > > > > ---
> > > > > > net/core/page_pool.c | 2 +-
> > > > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > > > >
> > > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > > index 1c6fec08bc43..8e9f5801aabb 100644
> > > > > > --- a/net/core/page_pool.c
> > > > > > +++ b/net/core/page_pool.c
> > > > > > @@ -1167,7 +1167,7 @@ void page_pool_destroy(struct page_pool *pool)
> > > > > > page_pool_disable_direct_recycling(pool);
> > > > > > page_pool_free_frag(pool);
> > > > > >
> > > > > > - if (!page_pool_release(pool))
> > > > > > + if (page_pool_release(pool) <= 0)
> > > > > > return;
> > > > >
> > > > > Isn't it the condition in page_pool_release_retry() that you want. to
> > > > > modify? That is the one that handles whether the worker keeps spinning
> > > > > no?
> > > >
> > > > Right, do you mean this patch?
> > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > index 8e9f5801aabb..7dde3bd5f275 100644
> > > > --- a/net/core/page_pool.c
> > > > +++ b/net/core/page_pool.c
> > > > @@ -1112,7 +1112,7 @@ static void page_pool_release_retry(struct
> > > > work_struct *wq)
> > > > int inflight;
> > > >
> > > > inflight = page_pool_release(pool);
> > > > - if (!inflight)
> > > > + if (inflight < 0)
> > > > return;
> > > >
> > > > It has the same behaviour as the current patch does. I thought we
> > > > could stop it earlier.
> > > >
> > >
> > > Yes I mean this.
> >
> > I'm going to post the above diff patch in V2. Am I understanding right?
> >
>
> Please also add Jakub's request, i.e. a code comment indicating why we
> return early.
Got it.
>
> Also, now that I look more closely, we want to make sure we get at
> least one warning when inflight goes negative, so, maybe something
> like (untested, may need some iteration):
Good suggestion.
>
> ```
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 2ea8041aba7e..6d62ea45571b 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -1113,13 +1113,12 @@ static void page_pool_release_retry(struct
> work_struct *wq)
> int inflight;
>
> inflight = page_pool_release(pool);
> - if (!inflight)
> - return;
>
> /* Periodic warning for page pools the user can't see */
> netdev = READ_ONCE(pool->slow.netdev);
> if (time_after_eq(jiffies, pool->defer_warn) &&
> - (!netdev || netdev == NET_PTR_POISON)) {
> + (!netdev || netdev == NET_PTR_POISON) &&
> + inflight != 0) {
> int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ;
>
> pr_warn("%s() stalled pool shutdown: id %u, %d
> inflight %d sec devmem=%d\n",
> @@ -1128,7 +1127,15 @@ static void page_pool_release_retry(struct
> work_struct *wq)
> pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
> }
>
> - /* Still not ready to be disconnected, retry later */
> + /* In rare cases, a driver bug may cause inflight to go negative. Don't
> + * reschedule release if inflight is 0 or negative.
> + * - If 0, the page_pool has been destroyed
> + * - if negative, we will never recover
> + * in both cases no reschedule necessary.
> + */
> + if (inflight < 1)
Maybe I would change the above to 'inflight <= 0' which looks more
obvious at the first glance? :)
> + return;
> +
> schedule_delayed_work(&pool->release_dw, DEFER_TIME);
> }
> ```
I will test it. Thanks, Mina.
Thanks,
Jason
>
> --
> Thanks,
> Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-13 0:43 ` Jason Xing
@ 2025-02-13 0:51 ` Mina Almasry
2025-02-13 1:01 ` Jason Xing
0 siblings, 1 reply; 15+ messages in thread
From: Mina Almasry @ 2025-02-13 0:51 UTC (permalink / raw)
To: Jason Xing
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Wed, Feb 12, 2025 at 4:44 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > ```
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 2ea8041aba7e..6d62ea45571b 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -1113,13 +1113,12 @@ static void page_pool_release_retry(struct
> > work_struct *wq)
> > int inflight;
> >
> > inflight = page_pool_release(pool);
> > - if (!inflight)
> > - return;
> >
> > /* Periodic warning for page pools the user can't see */
> > netdev = READ_ONCE(pool->slow.netdev);
> > if (time_after_eq(jiffies, pool->defer_warn) &&
> > - (!netdev || netdev == NET_PTR_POISON)) {
> > + (!netdev || netdev == NET_PTR_POISON) &&
> > + inflight != 0) {
> > int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ;
> >
> > pr_warn("%s() stalled pool shutdown: id %u, %d
> > inflight %d sec devmem=%d\n",
> > @@ -1128,7 +1127,15 @@ static void page_pool_release_retry(struct
> > work_struct *wq)
> > pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
> > }
> >
> > - /* Still not ready to be disconnected, retry later */
> > + /* In rare cases, a driver bug may cause inflight to go negative. Don't
> > + * reschedule release if inflight is 0 or negative.
> > + * - If 0, the page_pool has been destroyed
> > + * - if negative, we will never recover
> > + * in both cases no reschedule necessary.
> > + */
> > + if (inflight < 1)
>
> Maybe I would change the above to 'inflight <= 0' which looks more
> obvious at the first glance? :)
>
Good catch. Yes.
Also, write "no reschedule is necessary" on the last line in the
comment (add "is")
--
Thanks,
Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker
2025-02-13 0:51 ` Mina Almasry
@ 2025-02-13 1:01 ` Jason Xing
0 siblings, 0 replies; 15+ messages in thread
From: Jason Xing @ 2025-02-13 1:01 UTC (permalink / raw)
To: Mina Almasry
Cc: davem, edumazet, kuba, pabeni, hawk, ilias.apalodimas, horms,
netdev
On Thu, Feb 13, 2025 at 8:51 AM Mina Almasry <almasrymina@google.com> wrote:
>
> On Wed, Feb 12, 2025 at 4:44 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
> > >
> > > ```
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 2ea8041aba7e..6d62ea45571b 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -1113,13 +1113,12 @@ static void page_pool_release_retry(struct
> > > work_struct *wq)
> > > int inflight;
> > >
> > > inflight = page_pool_release(pool);
> > > - if (!inflight)
> > > - return;
> > >
> > > /* Periodic warning for page pools the user can't see */
> > > netdev = READ_ONCE(pool->slow.netdev);
> > > if (time_after_eq(jiffies, pool->defer_warn) &&
> > > - (!netdev || netdev == NET_PTR_POISON)) {
> > > + (!netdev || netdev == NET_PTR_POISON) &&
> > > + inflight != 0) {
> > > int sec = (s32)((u32)jiffies - (u32)pool->defer_start) / HZ;
> > >
> > > pr_warn("%s() stalled pool shutdown: id %u, %d
> > > inflight %d sec devmem=%d\n",
> > > @@ -1128,7 +1127,15 @@ static void page_pool_release_retry(struct
> > > work_struct *wq)
> > > pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
> > > }
> > >
> > > - /* Still not ready to be disconnected, retry later */
> > > + /* In rare cases, a driver bug may cause inflight to go negative. Don't
> > > + * reschedule release if inflight is 0 or negative.
> > > + * - If 0, the page_pool has been destroyed
> > > + * - if negative, we will never recover
> > > + * in both cases no reschedule necessary.
> > > + */
> > > + if (inflight < 1)
> >
> > Maybe I would change the above to 'inflight <= 0' which looks more
> > obvious at the first glance? :)
> >
>
> Good catch. Yes.
>
> Also, write "no reschedule is necessary" on the last line in the
> comment (add "is")
Will adjust it. Thanks!
>
> --
> Thanks,
> Mina
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-02-13 1:01 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-02-10 13:09 [PATCH net-next v1] page_pool: avoid infinite loop to schedule delayed worker Jason Xing
2025-02-12 2:37 ` Mina Almasry
2025-02-12 2:46 ` Jakub Kicinski
2025-02-12 3:20 ` Jason Xing
2025-02-12 3:43 ` Jakub Kicinski
2025-02-12 4:38 ` Jason Xing
2025-02-12 18:53 ` Jakub Kicinski
2025-02-12 23:37 ` Jason Xing
2025-02-12 3:14 ` Jason Xing
2025-02-12 19:24 ` Mina Almasry
2025-02-12 23:38 ` Jason Xing
2025-02-13 0:38 ` Mina Almasry
2025-02-13 0:43 ` Jason Xing
2025-02-13 0:51 ` Mina Almasry
2025-02-13 1:01 ` Jason Xing
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).