* [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
@ 2018-12-19 20:06 Jonathan Lemon
2018-12-20 13:03 ` Jesper Dangaard Brouer
0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Lemon @ 2018-12-19 20:06 UTC (permalink / raw)
To: netdev; +Cc: brouer, Jonathan Lemon
Return pfmemalloc pages back to the page allocator, instead of holding them
in the page pool. While here, also use the __page_pool_return_page() API.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
---
net/core/page_pool.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 43a932cb609b..091007ff14a3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
*
* refcnt == 1 means page_pool owns page, and can recycle it.
*/
- if (likely(page_ref_count(page) == 1)) {
+ if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
/* Read barrier done in page_ref_count / READ_ONCE */
if (allow_direct && in_serving_softirq())
@@ -259,8 +259,7 @@ void __page_pool_put_page(struct page_pool *pool,
* doing refcnt based recycle tricks, meaning another process
* will be invoking put_page.
*/
- __page_pool_clean_page(pool, page);
- put_page(page);
+ __page_pool_return_page(pool, page);
}
EXPORT_SYMBOL(__page_pool_put_page);
--
2.17.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-19 20:06 [PATCH net-next] net: Don't return pfmemalloc pages to the page pool Jonathan Lemon
@ 2018-12-20 13:03 ` Jesper Dangaard Brouer
2018-12-20 22:11 ` Jonathan Lemon
0 siblings, 1 reply; 8+ messages in thread
From: Jesper Dangaard Brouer @ 2018-12-20 13:03 UTC (permalink / raw)
To: Jonathan Lemon; +Cc: netdev, brouer
On Wed, 19 Dec 2018 12:06:51 -0800
Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> Return pfmemalloc pages back to the page allocator, instead of
> holding them in the page pool.
Have you experienced this issue in practice, or is it theory?
> While here, also use the __page_pool_return_page() API.
Don't combine several unrelated changes in one patch.
> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> ---
> net/core/page_pool.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 43a932cb609b..091007ff14a3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
> *
> * refcnt == 1 means page_pool owns page, and can recycle it.
> */
> - if (likely(page_ref_count(page) == 1)) {
> + if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
I don't like adding this in the hot-path. Instead we could move this
to the page alloc slow-path, and reject allocating pages with
pfmemalloc in the first place.
> /* Read barrier done in page_ref_count / READ_ONCE */
>
> if (allow_direct && in_serving_softirq())
> @@ -259,8 +259,7 @@ void __page_pool_put_page(struct page_pool *pool,
> * doing refcnt based recycle tricks, meaning another process
> * will be invoking put_page.
> */
> - __page_pool_clean_page(pool, page);
> - put_page(page);
> + __page_pool_return_page(pool, page);
> }
> EXPORT_SYMBOL(__page_pool_put_page);
>
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-20 13:03 ` Jesper Dangaard Brouer
@ 2018-12-20 22:11 ` Jonathan Lemon
2018-12-20 23:41 ` Saeed Mahameed
2018-12-21 14:42 ` Jesper Dangaard Brouer
0 siblings, 2 replies; 8+ messages in thread
From: Jonathan Lemon @ 2018-12-20 22:11 UTC (permalink / raw)
To: Jesper Dangaard Brouer; +Cc: netdev
(Resending due to mailer issues)
On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
> On Wed, 19 Dec 2018 12:06:51 -0800
> Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>> Return pfmemalloc pages back to the page allocator, instead of
>> holding them in the page pool.
>
> Have you experience this issue in practice or is it theory?
We're seeing the mlx5 driver use pfmemalloc pages with 4.11, and then
return them back to the page allocator (it's triggering the
mlx5e_page_is_reserved() test). The page pool code isn't in production
use, but the code paths appear identical.
>
>> While here, also use the __page_pool_return_page() API.
>
> Don't combine several unrelated changed in one patch.
Okay - will send as two separate patches.
>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index 43a932cb609b..091007ff14a3 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
>> *
>> * refcnt == 1 means page_pool owns page, and can recycle it.
>> */
>> - if (likely(page_ref_count(page) == 1)) {
>> + if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page)))
>> {
>
> I don't like adding this in the hot-path. Instead we could move this
> to the page alloc slow-path, and reject allocating pages with
> pgmemalloc in the first place.
No real objection to that - but then why bother with pfmemalloc? If the
driver can't obtain pages for emergency use, then they might as well not
exist.
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-20 22:11 ` Jonathan Lemon
@ 2018-12-20 23:41 ` Saeed Mahameed
2018-12-21 0:56 ` Jonathan Lemon
2018-12-21 14:42 ` Jesper Dangaard Brouer
1 sibling, 1 reply; 8+ messages in thread
From: Saeed Mahameed @ 2018-12-20 23:41 UTC (permalink / raw)
To: brouer@redhat.com, jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org
On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
> (Resending due to mailer issues)
>
> On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
>
> > On Wed, 19 Dec 2018 12:06:51 -0800
> > Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> >
> > > Return pfmemalloc pages back to the page allocator, instead of
> > > holding them in the page pool.
> >
> > Have you experience this issue in practice or is it theory?
>
> We're seeing the mlx5 driver use pfmemalloc pages with 4.11, and
> then
> return them
> back to the page allocator. (it's triggering the
> mlx5e_page_is_reserved() test).
> The page pool code isn't in production use, but the code paths
> appear
> identical.
>
>
> > > While here, also use the __page_pool_return_page() API.
> >
> > Don't combine several unrelated changed in one patch.
>
> Okay - will send as 2 separate patches
>
>
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 43a932cb609b..091007ff14a3 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool
> > > *pool,
> > > *
> > > * refcnt == 1 means page_pool owns page, and can recycle it.
> > > */
> > > - if (likely(page_ref_count(page) == 1)) {
> > > + if (likely(page_ref_count(page) == 1 &&
> > > !page_is_pfmemalloc(page)))
> > > {
> >
I think this is wrong. If refcount is 1, then the page belongs to the
page pool, and you must enter this statement's true block and test
page_is_pfmemalloc inside it (marked unlikely). To return a pfmemalloc
page, you need to call __page_pool_return_page() so the page gets
dma_unmapped and any other required cleanups are done.
> > I don't like adding this in the hot-path. Instead we could move
> > this
> > to the page alloc slow-path, and reject allocating pages with
> > pgmemalloc in the first place.
>
I prefer to enhance the above solution; no need to risk page pool users
going out of memory during emergencies.
> No real objection to that - but then why bother with pfmemalloc? If
> the
> driver can't
> obtain pages for emergency use, then they might as well not exist.
>
> --
> Jonathan
>
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-20 23:41 ` Saeed Mahameed
@ 2018-12-21 0:56 ` Jonathan Lemon
2018-12-21 2:13 ` Saeed Mahameed
0 siblings, 1 reply; 8+ messages in thread
From: Jonathan Lemon @ 2018-12-21 0:56 UTC (permalink / raw)
To: Saeed Mahameed; +Cc: brouer, netdev
On 20 Dec 2018, at 15:41, Saeed Mahameed wrote:
> On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
>> (Resending due to mailer issues)
>>
>> On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
>>
>>> On Wed, 19 Dec 2018 12:06:51 -0800
>>> Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>>>
>>>> Return pfmemalloc pages back to the page allocator, instead of
>>>> holding them in the page pool.
>>>
>>> Have you experience this issue in practice or is it theory?
>>
>> We're seeing the mlx5 driver use pfmemalloc pages with 4.11, and
>> then
>> return them
>> back to the page allocator. (it's triggering the
>> mlx5e_page_is_reserved() test).
>> The page pool code isn't in production use, but the code paths
>> appear
>> identical.
>>
>>
>>>> While here, also use the __page_pool_return_page() API.
>>>
>>> Don't combine several unrelated changed in one patch.
>>
>> Okay - will send as 2 separate patches
>>
>>
>>>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>>>> index 43a932cb609b..091007ff14a3 100644
>>>> --- a/net/core/page_pool.c
>>>> +++ b/net/core/page_pool.c
>>>> @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool
>>>> *pool,
>>>> *
>>>> * refcnt == 1 means page_pool owns page, and can recycle it.
>>>> */
>>>> - if (likely(page_ref_count(page) == 1)) {
>>>> + if (likely(page_ref_count(page) == 1 &&
>>>> !page_is_pfmemalloc(page)))
>>>> {
>>>
>
> I think this is wrong, if refcount is 1, then this page belongs to
> pagepool and you must enter this statement's true block, and test
> page_is_pfmemalloc inside (mark it unlikely),
> to return a pfmemalloc page, you need to call __page_pool_return_page()
> to dma_unmap and other cleanups if required.
If the page belongs to the pool, but it isn't recyclable, the fallback
path is "__page_pool_return_page()". If it doesn't belong to the pool,
it goes to the non-XDP mode, which is also __page_pool_return_page().
Does your dislike stem from the fact that the "non-XDP" mode is taken
for the "refcount=1, pfmemalloc=T" case?
--
Jonathan
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-21 0:56 ` Jonathan Lemon
@ 2018-12-21 2:13 ` Saeed Mahameed
2018-12-21 2:26 ` Saeed Mahameed
0 siblings, 1 reply; 8+ messages in thread
From: Saeed Mahameed @ 2018-12-21 2:13 UTC (permalink / raw)
To: jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org, brouer@redhat.com
On Thu, 2018-12-20 at 16:56 -0800, Jonathan Lemon wrote:
> On 20 Dec 2018, at 15:41, Saeed Mahameed wrote:
>
> > On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
> > > (Resending due to mailer issues)
> > >
> > > On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
> > >
> > > > On Wed, 19 Dec 2018 12:06:51 -0800
> > > > Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > > >
> > > > > Return pfmemalloc pages back to the page allocator, instead
> > > > > of
> > > > > holding them in the page pool.
> > > >
> > > > Have you experience this issue in practice or is it theory?
> > >
> > > We're seeing the mlx5 driver use pfmemalloc pages with 4.11, and
> > > then
> > > return them
> > > back to the page allocator. (it's triggering the
> > > mlx5e_page_is_reserved() test).
> > > The page pool code isn't in production use, but the code paths
> > > appear
> > > identical.
> > >
> > >
> > > > > While here, also use the __page_pool_return_page() API.
> > > >
> > > > Don't combine several unrelated changed in one patch.
> > >
> > > Okay - will send as 2 separate patches
> > >
> > >
> > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > index 43a932cb609b..091007ff14a3 100644
> > > > > --- a/net/core/page_pool.c
> > > > > +++ b/net/core/page_pool.c
> > > > > @@ -233,7 +233,7 @@ void __page_pool_put_page(struct
> > > > > page_pool
> > > > > *pool,
> > > > > *
> > > > > * refcnt == 1 means page_pool owns page, and can
> > > > > recycle it.
> > > > > */
> > > > > - if (likely(page_ref_count(page) == 1)) {
> > > > > + if (likely(page_ref_count(page) == 1 &&
> > > > > !page_is_pfmemalloc(page)))
> > > > > {
> >
> > I think this is wrong, if refcount is 1, then this page belongs to
> > pagepool and you must enter this statement's true block, and test
> > page_is_pfmemalloc inside (mark it unlikely),
> > to return a pfmemalloc page, you need to call
> > __page_pool_return_page()
> > to dma_unmap and other cleanups if required.
>
> If the page belongs to the pool, but it isn't recyclable, the
> fallback
> path is "__page_pool_return_page()". If it doesn't belong to the
> pool,
> it goes to the non-XDP mode, which is also __page_pool_return_page().
>
Non-XDP mode doesn't explicitly call __page_pool_return_page(), but it
does exactly what __page_pool_return_page() is doing now, which is:
	__page_pool_clean_page(pool, page);
	put_page(page);
Your code is logically correct for now, but semantically speaking,
__page_pool_return_page() might change in the future to have special
handling of pages that actually belong only to the page pool
(refcount == 1), so better be safe and handle such pages in the page
pool path and not in the non-XDP path.
All you need is:
	if (unlikely(page_is_pfmemalloc(page)))
		__page_pool_return_page(pool, page);
early inside the if (likely(page_ref_count(page) == 1)) recycling block;
I don't think this will affect performance.
> Does your dislike stem from the fact that the "non-XDP" mode is taken
> for the "refcount=1, pfmemalloc=T" case?
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-21 2:13 ` Saeed Mahameed
@ 2018-12-21 2:26 ` Saeed Mahameed
0 siblings, 0 replies; 8+ messages in thread
From: Saeed Mahameed @ 2018-12-21 2:26 UTC (permalink / raw)
To: jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org, brouer@redhat.com
On Fri, 2018-12-21 at 02:13 +0000, Saeed Mahameed wrote:
> On Thu, 2018-12-20 at 16:56 -0800, Jonathan Lemon wrote:
> > On 20 Dec 2018, at 15:41, Saeed Mahameed wrote:
> >
> > > On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
> > > > (Resending due to mailer issues)
> > > >
> > > > On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
> > > >
> > > > > On Wed, 19 Dec 2018 12:06:51 -0800
> > > > > Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
> > > > >
> > > > > > Return pfmemalloc pages back to the page allocator, instead
> > > > > > of
> > > > > > holding them in the page pool.
> > > > >
> > > > > Have you experience this issue in practice or is it theory?
> > > >
> > > > We're seeing the mlx5 driver use pfmemalloc pages with 4.11,
> > > > and
> > > > then
> > > > return them
> > > > back to the page allocator. (it's triggering the
> > > > mlx5e_page_is_reserved() test).
> > > > The page pool code isn't in production use, but the code paths
> > > > appear
> > > > identical.
> > > >
> > > >
> > > > > > While here, also use the __page_pool_return_page() API.
> > > > >
> > > > > Don't combine several unrelated changed in one patch.
> > > >
> > > > Okay - will send as 2 separate patches
> > > >
> > > >
> > > > > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > > > > index 43a932cb609b..091007ff14a3 100644
> > > > > > --- a/net/core/page_pool.c
> > > > > > +++ b/net/core/page_pool.c
> > > > > > @@ -233,7 +233,7 @@ void __page_pool_put_page(struct
> > > > > > page_pool
> > > > > > *pool,
> > > > > > *
> > > > > > * refcnt == 1 means page_pool owns page, and can
> > > > > > recycle it.
> > > > > > */
> > > > > > - if (likely(page_ref_count(page) == 1)) {
> > > > > > + if (likely(page_ref_count(page) == 1 &&
> > > > > > !page_is_pfmemalloc(page)))
> > > > > > {
> > >
> > > I think this is wrong, if refcount is 1, then this page belongs
> > > to
> > > pagepool and you must enter this statement's true block, and test
> > > page_is_pfmemalloc inside (mark it unlikely),
> > > to return a pfmemalloc page, you need to call
> > > __page_pool_return_page()
> > > to dma_unmap and other cleanups if required.
> >
> > If the page belongs to the pool, but it isn't recyclable, the
> > fallback
> > path is "__page_pool_return_page()". If it doesn't belong to the
> > pool,
> > it goes to the non-XDP mode, which is also
> > __page_pool_return_page().
> >
>
> non-XDP mode doesn't explicitly call __page_pool_return_page() but it
> does exactly what __page_pool_return_page() is doing now, which is:
> __page_pool_clean_page(pool, page);
> put_page(page);
>
> your code is logically correct for now, but semantically speaking
> __page_pool_return_page might change in the future to have special
> handling of pages that actually belong only to the pagepool
> (refcount==1), so better be safe and handle such pages in the
> pagepool
> path and not in the non-XDP path
>
> all you need is:
> if(unlikely(!page_is_pfmemalloc(page))
> __page_pool_return_page()
>
Maybe something like:
@@ -236,6 +236,12 @@ void __page_pool_put_page(struct page_pool *pool,
 	if (likely(page_ref_count(page) == 1)) {
 		/* Read barrier done in page_ref_count / READ_ONCE */
 
+		/* Don't recycle emergency pages */
+		if (unlikely(page_is_pfmemalloc(page))) {
+			__page_pool_return_page(pool, page);
+			return;
+		}
+
** Not tested :) **
> early inside if (likely(page_ref_count(page) == 1)) pagepool
> recycling
> block; I don't think this will affect performance.
>
> > Does your dislike stem from the fact that the "non-XDP" mode is
> > taken
> > for the "refcount=1, pfmemalloc=T" case?
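The decision flow Saeed sketches in the snippet above can be exercised as a small userspace simulation (stub types; names here are illustrative, not the kernel API):

```c
#include <assert.h>
#include <stdbool.h>

enum put_result { RECYCLED, RETURNED };

/*
 * Mirrors the proposed __page_pool_put_page() flow: only a pool-owned
 * page (refcount == 1) that is NOT a pfmemalloc emergency page gets
 * recycled; everything else goes back to the page allocator.
 */
static enum put_result pool_put_sim(int refcount, bool pfmemalloc)
{
	if (refcount == 1) {
		if (pfmemalloc)
			return RETURNED;	/* emergency page: never recycle */
		return RECYCLED;		/* fast path: back into the pool cache */
	}
	return RETURNED;			/* shared page: clean up and release */
}
```

The point of structuring it this way is that the pfmemalloc test lives inside the pool-owned branch, so any future special handling of pool-owned pages stays in one place.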
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
2018-12-20 22:11 ` Jonathan Lemon
2018-12-20 23:41 ` Saeed Mahameed
@ 2018-12-21 14:42 ` Jesper Dangaard Brouer
1 sibling, 0 replies; 8+ messages in thread
From: Jesper Dangaard Brouer @ 2018-12-21 14:42 UTC (permalink / raw)
To: Jonathan Lemon; +Cc: netdev, brouer
On Thu, 20 Dec 2018 14:11:35 -0800 "Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:
> On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
>
[...]
> > I don't like adding this in the hot-path. Instead we could move this
> > to the page alloc slow-path, and reject allocating pages with
> > pgmemalloc in the first place.
>
> No real objection to that - but then why bother with pfmemalloc? If the
> driver can't obtain pages for emergency use, then they might as well
> not exist.
I've changed my mind. There is an interesting opportunity in allowing
pfmemalloc pages to be used by the driver. (So, I'm saying I'm okay with
adding this to the hot-path, and this hopefully doesn't affect
performance (too much), as page_is_pfmemalloc() reads from the same
cache-line.)
The opportunity is that XDP can handle/operate at wirespeed. We could
allow XDP to get this info (simply via a helper call, so we don't affect
users not using it). When seeing PFMEMALLOC, which indicates a bad
situation is about to occur, we can react at an earlier stage (spending
fewer cycles on reacting).
One idea is to reduce the size of the XDP frame, and use XDP_TX to send
the frame back to the sender as a congestion/drop notification, which
informs the sender to slow down. If this is incast happening within the
same data-center, then the XDP_TX feedback can reach the sender really
fast.
One example of such an approach: https://youtu.be/BO0QhaxBRr0
This is one example of how XDP can allow us to do stuff that was not
possible before...
--
Best regards,
Jesper Dangaard Brouer
MSc.CS, Principal Kernel Engineer at Red Hat
LinkedIn: http://www.linkedin.com/in/brouer