* Report on abnormal behavior of "page_pool" API
@ 2024-01-31  7:31 Justin Lai
  2024-01-31  8:56 ` Yunsheng Lin
  2024-01-31 19:15 ` Gerhard Engleder
  0 siblings, 2 replies; 5+ messages in thread
From: Justin Lai @ 2024-01-31  7:31 UTC (permalink / raw)
  To: netdev@vger.kernel.org

To whom it may concern,

I hope this email finds you well. I am writing to report a behavior
which seems to be abnormal.

When I remove the module, I call page_pool_destroy() to release the
page_pool, but this message appears: "page_pool_release_retry() stalled
pool shutdown 1024 inflight 120 sec". I then tried to return the pages to
the page_pool before calling page_pool_destroy() by calling
page_pool_put_full_page() first, but after doing so this message was
printed: "page_pool_empty_ring() page_pool refcnt 0 violation", and the
computer crashed.

I would like to ask what could be causing this and how I should fix it.

My working environment is Ubuntu 23.10 with Linux kernels 6.4, 6.5,
and 6.6.

Thank you for your time and efforts, I am looking forward to your reply.

Best regards,
Justin


* Re: Report on abnormal behavior of "page_pool" API
  2024-01-31  7:31 Report on abnormal behavior of "page_pool" API Justin Lai
@ 2024-01-31  8:56 ` Yunsheng Lin
  2024-02-01  2:37   ` Justin Lai
  2024-01-31 19:15 ` Gerhard Engleder
  1 sibling, 1 reply; 5+ messages in thread
From: Yunsheng Lin @ 2024-01-31  8:56 UTC (permalink / raw)
  To: Justin Lai, netdev@vger.kernel.org

On 2024/1/31 15:31, Justin Lai wrote:
> To whom it may concern,
> 
> I hope this email finds you well. I am writing to report a behavior
> which seems to be abnormal.
> 
> When I remove the module, I call page_pool_destroy() to release the

Which module?

> page_pool, but this message appears, page_pool_release_retry() stalled 
> pool shutdown 1024 inflight 120 sec. Then I tried to return the page to
> page_pool before calling page_pool_destroy(), so I called
> page_pool_put_full_page() first, but after doing so, this message was
> printed, page_pool_empty_ring() page_pool refcnt 0 violation, and the

Since we check "page_ref_count(page) == 1" before allowing a page to be
recycled into pool->ring:
https://elixir.bootlin.com/linux/v6.8-rc2/source/net/core/page_pool.c#L654

it seems somebody is still using the page and manipulating _refcount while
the page is sitting in the pool->ring?
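
To make the point above concrete, here is a small userspace toy model of
the "refcount must be 1" rule. It is only a sketch and not kernel code:
the toy_* names, the ring size and the violation counter are all
hypothetical, and the real logic lives in net/core/page_pool.c. The model
assumes that a page is only cached in the ring when the pool would be its
sole owner, and that emptying the ring complains about any cached page
whose refcount is no longer 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All toy_* names are hypothetical. */
#define TOY_RING_SIZE 4

struct toy_page {
	int refcount;			/* stands in for the page's _refcount */
};

struct toy_pool {
	struct toy_page *ring[TOY_RING_SIZE];
	size_t ring_len;
};

/*
 * Return a page to the pool. It is cached in the ring only if the caller
 * holds the last reference (refcount == 1) and the ring has room.
 * Otherwise the caller's reference is simply dropped.
 */
static bool toy_pool_put(struct toy_pool *pool, struct toy_page *page)
{
	assert(pool != NULL && page != NULL);

	if (page->refcount != 1 || pool->ring_len == TOY_RING_SIZE) {
		page->refcount--;
		return false;
	}
	pool->ring[pool->ring_len++] = page;
	return true;
}

/*
 * Drain the ring at shutdown. Every cached page is expected to still have
 * refcount == 1; anything else means someone touched the page after it
 * was handed back. Returns the number of such violations.
 */
static int toy_pool_empty_ring(struct toy_pool *pool)
{
	int violations = 0;

	while (pool->ring_len > 0) {
		struct toy_page *page = pool->ring[--pool->ring_len];

		if (page->refcount != 1)
			violations++;
	}
	return violations;
}
```

In this model, putting a page with refcount 1 into the ring and then
dropping one more reference on it (for example a second put from a stale
pointer in the driver) leaves a cached page with refcount 0, which is the
kind of state the "refcnt 0 violation" message points at.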


> computer crashed.
> 
> I would like to ask what could be causing this and how I should fix it.

Not sure whether you have read the page_pool doc below. Understanding the
internal details and the API usage may help you debug the problem:
Documentation/networking/page_pool.rst

> 
> The information on my working environment is: Ubuntu23.10,
> linux kernel 6.4, 6.5, 6.6
> 
> Thank you for your time and efforts, I am looking forward to your reply.
> 
> Best regards,
> Justin
> 
> .
> 


* Re: Report on abnormal behavior of "page_pool" API
  2024-01-31  7:31 Report on abnormal behavior of "page_pool" API Justin Lai
  2024-01-31  8:56 ` Yunsheng Lin
@ 2024-01-31 19:15 ` Gerhard Engleder
  2024-02-01  2:42   ` Justin Lai
  1 sibling, 1 reply; 5+ messages in thread
From: Gerhard Engleder @ 2024-01-31 19:15 UTC (permalink / raw)
  To: Justin Lai, netdev@vger.kernel.org

On 31.01.24 08:31, Justin Lai wrote:
> To whom it may concern,
> 
> I hope this email finds you well. I am writing to report a behavior
> which seems to be abnormal.
> 
> When I remove the module, I call page_pool_destroy() to release the
> page_pool, but this message appears, page_pool_release_retry() stalled
> pool shutdown 1024 inflight 120 sec.

I had a problem with the same message:
https://lore.kernel.org/netdev/20230311213709.42625-1-gerhard@engleder-embedded.com/

Could it be the same problem?

Gerhard


* RE: Report on abnormal behavior of "page_pool" API
  2024-01-31  8:56 ` Yunsheng Lin
@ 2024-02-01  2:37   ` Justin Lai
  0 siblings, 0 replies; 5+ messages in thread
From: Justin Lai @ 2024-02-01  2:37 UTC (permalink / raw)
  To: Yunsheng Lin, netdev@vger.kernel.org

> On 2024/1/31 15:31, Justin Lai wrote:
> > To whom it may concern,
> >
> > I hope this email finds you well. I am writing to report a behavior
> > which seems to be abnormal.
> >
> > When I remove the module, I call page_pool_destroy() to release the
> 
> Which module?
The PCIe driver
> 
> > page_pool, but this message appears, page_pool_release_retry() stalled
> > pool shutdown 1024 inflight 120 sec. Then I tried to return the page
> > to page_pool before calling page_pool_destroy(), so I called
> > page_pool_put_full_page() first, but after doing so, this message was
> > printed, page_pool_empty_ring() page_pool refcnt 0 violation, and the
> 
> As we have "page_ref_count(page) == 1" checking to allow recycling page in
> pool->ring:
> https://elixir.bootlin.com/linux/v6.8-rc2/source/net/core/page_pool.c#L654
> 
> It seems somebody is still using the page and manipulating _refcount while the
> page is sitting in the pool->ring?

I will confirm this part again, thank you for your reply.
> 
> 
> > computer crashed.
> >
> > I would like to ask what could be causing this and how I should fix it.
> 
> Not sure if you read the below doc for page_pool, understanding the internal
> detail and the API usages may help you debuging the problem:
> Documentation/networking/page_pool.rst

Thank you for your reply. I have read this document, but I will study it again.
> 
> >
> > The information on my working environment is: Ubuntu23.10, linux
> > kernel 6.4, 6.5, 6.6
> >
> > Thank you for your time and efforts, I am looking forward to your reply.
> >
> > Best regards,
> > Justin
> >
> > .
> >


* RE: Report on abnormal behavior of "page_pool" API
  2024-01-31 19:15 ` Gerhard Engleder
@ 2024-02-01  2:42   ` Justin Lai
  0 siblings, 0 replies; 5+ messages in thread
From: Justin Lai @ 2024-02-01  2:42 UTC (permalink / raw)
  To: Gerhard Engleder, netdev@vger.kernel.org

> On 31.01.24 08:31, Justin Lai wrote:
> > To whom it may concern,
> >
> > I hope this email finds you well. I am writing to report a behavior
> > which seems to be abnormal.
> >
> > When I remove the module, I call page_pool_destroy() to release the
> > page_pool, but this message appears, page_pool_release_retry() stalled
> > pool shutdown 1024 inflight 120 sec.
> 
> I had a problem with the same message:
> https://lore.kernel.org/netdev/20230311213709.42625-1-gerhard@engleder-embedded.com/
> 
> Could it be the same problem?

It doesn't seem to be the same. Although the error message is similar, I am not using the XDP API.
> 
> Gerhard


