* [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
@ 2018-12-19 20:06 Jonathan Lemon
  2018-12-20 13:03 ` Jesper Dangaard Brouer
  0 siblings, 1 reply; 8+ messages in thread

From: Jonathan Lemon @ 2018-12-19 20:06 UTC
To: netdev; +Cc: brouer, Jonathan Lemon

Return pfmemalloc pages back to the page allocator, instead of
holding them in the page pool.

While here, also use the __page_pool_return_page() API.

Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
---
 net/core/page_pool.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 43a932cb609b..091007ff14a3 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
 	 *
 	 * refcnt == 1 means page_pool owns page, and can recycle it.
 	 */
-	if (likely(page_ref_count(page) == 1)) {
+	if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
 		/* Read barrier done in page_ref_count / READ_ONCE */

 		if (allow_direct && in_serving_softirq())
@@ -259,8 +259,7 @@ void __page_pool_put_page(struct page_pool *pool,
 	 * doing refcnt based recycle tricks, meaning another process
 	 * will be invoking put_page.
 	 */
-	__page_pool_clean_page(pool, page);
-	put_page(page);
+	__page_pool_return_page(pool, page);
 }
 EXPORT_SYMBOL(__page_pool_put_page);
--
2.17.1
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Jesper Dangaard Brouer @ 2018-12-20 13:03 UTC
To: Jonathan Lemon; +Cc: netdev, brouer

On Wed, 19 Dec 2018 12:06:51 -0800
Jonathan Lemon <jonathan.lemon@gmail.com> wrote:

> Return pfmemalloc pages back to the page allocator, instead of
> holding them in the page pool.

Have you experienced this issue in practice, or is it theory?

> While here, also use the __page_pool_return_page() API.

Don't combine several unrelated changes in one patch.

> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
> ---
>  net/core/page_pool.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index 43a932cb609b..091007ff14a3 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
>  	 *
>  	 * refcnt == 1 means page_pool owns page, and can recycle it.
>  	 */
> -	if (likely(page_ref_count(page) == 1)) {
> +	if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {

I don't like adding this in the hot-path. Instead we could move this
to the page alloc slow-path, and reject allocating pages with
pfmemalloc in the first place.

>  	/* Read barrier done in page_ref_count / READ_ONCE */
>
>  	if (allow_direct && in_serving_softirq())
> @@ -259,8 +259,7 @@ void __page_pool_put_page(struct page_pool *pool,
>  	 * doing refcnt based recycle tricks, meaning another process
>  	 * will be invoking put_page.
>  	 */
> -	__page_pool_clean_page(pool, page);
> -	put_page(page);
> +	__page_pool_return_page(pool, page);
> }
> EXPORT_SYMBOL(__page_pool_put_page);
> --

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Jonathan Lemon @ 2018-12-20 22:11 UTC
To: Jesper Dangaard Brouer; +Cc: netdev

(Resending due to mailer issues)

On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:

> On Wed, 19 Dec 2018 12:06:51 -0800
> Jonathan Lemon <jonathan.lemon@gmail.com> wrote:
>
>> Return pfmemalloc pages back to the page allocator, instead of
>> holding them in the page pool.
>
> Have you experienced this issue in practice, or is it theory?

We're seeing the mlx5 driver use pfmemalloc pages with 4.11, and then
return them back to the page allocator (it's triggering the
mlx5e_page_is_reserved() test). The page pool code isn't in production
use, but the code paths appear identical.

>> While here, also use the __page_pool_return_page() API.
>
> Don't combine several unrelated changes in one patch.

Okay - will send as two separate patches.

>> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
>> index 43a932cb609b..091007ff14a3 100644
>> --- a/net/core/page_pool.c
>> +++ b/net/core/page_pool.c
>> @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
>>  	 *
>>  	 * refcnt == 1 means page_pool owns page, and can recycle it.
>>  	 */
>> -	if (likely(page_ref_count(page) == 1)) {
>> +	if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {
>
> I don't like adding this in the hot-path. Instead we could move this
> to the page alloc slow-path, and reject allocating pages with
> pfmemalloc in the first place.

No real objection to that - but then why bother with pfmemalloc? If the
driver can't obtain pages for emergency use, then they might as well not
exist.
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Saeed Mahameed @ 2018-12-20 23:41 UTC
To: brouer@redhat.com, jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org

On Thu, 2018-12-20 at 14:11 -0800, Jonathan Lemon wrote:
[...]
> > > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > > index 43a932cb609b..091007ff14a3 100644
> > > --- a/net/core/page_pool.c
> > > +++ b/net/core/page_pool.c
> > > @@ -233,7 +233,7 @@ void __page_pool_put_page(struct page_pool *pool,
> > >  	 *
> > >  	 * refcnt == 1 means page_pool owns page, and can recycle it.
> > >  	 */
> > > -	if (likely(page_ref_count(page) == 1)) {
> > > +	if (likely(page_ref_count(page) == 1 && !page_is_pfmemalloc(page))) {

I think this is wrong: if refcount is 1, then this page belongs to the
page pool, and you must enter this statement's true block and test
page_is_pfmemalloc() inside it (mark it unlikely). To return a
pfmemalloc page, you need to call __page_pool_return_page(), to
dma_unmap and do any other cleanups required.

> > I don't like adding this in the hot-path. Instead we could move this
> > to the page alloc slow-path, and reject allocating pages with
> > pfmemalloc in the first place.

I prefer to enhance the above solution; no need to risk page pool users
going out of memory under emergencies.

> No real objection to that - but then why bother with pfmemalloc? If the
> driver can't obtain pages for emergency use, then they might as well
> not exist.
>
> --
> Jonathan
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Jonathan Lemon @ 2018-12-21 0:56 UTC
To: Saeed Mahameed; +Cc: brouer, netdev

On 20 Dec 2018, at 15:41, Saeed Mahameed wrote:
[...]
> I think this is wrong: if refcount is 1, then this page belongs to the
> page pool, and you must enter this statement's true block and test
> page_is_pfmemalloc() inside it (mark it unlikely). To return a
> pfmemalloc page, you need to call __page_pool_return_page(), to
> dma_unmap and do any other cleanups required.

If the page belongs to the pool, but it isn't recyclable, the fallback
path is __page_pool_return_page(). If it doesn't belong to the pool, it
goes to the non-XDP mode, which is also __page_pool_return_page().

Does your dislike stem from the fact that the "non-XDP" mode is taken
for the "refcount=1, pfmemalloc=T" case?

--
Jonathan
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Saeed Mahameed @ 2018-12-21 2:13 UTC
To: jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org, brouer@redhat.com

On Thu, 2018-12-20 at 16:56 -0800, Jonathan Lemon wrote:
[...]
> If the page belongs to the pool, but it isn't recyclable, the fallback
> path is __page_pool_return_page(). If it doesn't belong to the pool, it
> goes to the non-XDP mode, which is also __page_pool_return_page().

The non-XDP mode doesn't explicitly call __page_pool_return_page(), but
it does exactly what __page_pool_return_page() is doing now, which is:

	__page_pool_clean_page(pool, page);
	put_page(page);

Your code is logically correct for now, but semantically speaking,
__page_pool_return_page() might change in the future to have special
handling of pages that actually belong only to the page pool
(refcount == 1), so better be safe and handle such pages in the
page pool path and not in the non-XDP path.

All you need is:

	if (unlikely(page_is_pfmemalloc(page))) {
		__page_pool_return_page(pool, page);
		return;
	}

early inside the "if (likely(page_ref_count(page) == 1))" page pool
recycling block. I don't think this will affect performance.

> Does your dislike stem from the fact that the "non-XDP" mode is taken
> for the "refcount=1, pfmemalloc=T" case?
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Saeed Mahameed @ 2018-12-21 2:26 UTC
To: jonathan.lemon@gmail.com; +Cc: netdev@vger.kernel.org, brouer@redhat.com

On Fri, 2018-12-21 at 02:13 +0000, Saeed Mahameed wrote:
[...]
> All you need is:
>
> 	if (unlikely(page_is_pfmemalloc(page))) {
> 		__page_pool_return_page(pool, page);
> 		return;
> 	}
>
> early inside the "if (likely(page_ref_count(page) == 1))" page pool
> recycling block. I don't think this will affect performance.

Maybe something like:

@@ -236,6 +236,12 @@ void __page_pool_put_page(struct page_pool *pool,
 	if (likely(page_ref_count(page) == 1)) {
 		/* Read barrier done in page_ref_count / READ_ONCE */

+		/* Don't recycle emergency pages */
+		if (unlikely(page_is_pfmemalloc(page))) {
+			__page_pool_return_page(pool, page);
+			return;
+		}
+

** Not tested :) **

> > Does your dislike stem from the fact that the "non-XDP" mode is
> > taken for the "refcount=1, pfmemalloc=T" case?
* Re: [PATCH net-next] net: Don't return pfmemalloc pages to the page pool.
From: Jesper Dangaard Brouer @ 2018-12-21 14:42 UTC
To: Jonathan Lemon; +Cc: netdev, brouer

On Thu, 20 Dec 2018 14:11:35 -0800
"Jonathan Lemon" <jonathan.lemon@gmail.com> wrote:

> On 20 Dec 2018, at 5:03, Jesper Dangaard Brouer wrote:
> [...]
>> I don't like adding this in the hot-path. Instead we could move this
>> to the page alloc slow-path, and reject allocating pages with
>> pfmemalloc in the first place.
>
> No real objection to that - but then why bother with pfmemalloc? If the
> driver can't obtain pages for emergency use, then they might as well
> not exist.

I've changed my mind. There is an interesting opportunity in allowing
pfmemalloc pages to be used by the driver. (So, I'm saying I'm okay with
adding this to the hot-path, and it hopefully doesn't affect performance
(too much), as page_is_pfmemalloc() reads from the same cache-line.)

The opportunity is that XDP can handle/operate at wirespeed. We could
allow XDP to get this info (simply via a helper call, so we don't affect
users not using this). When seeing PFMEMALLOC, which indicates that a
bad situation will occur really soon, we can react at an earlier stage
(spending fewer cycles on reacting).

One idea is to reduce the size of the XDP frame, and use XDP_TX to send
the frame back to the sender as a congestion/drop notification, which
informs the sender to slow down. If this incast is happening within the
same data-center, then the XDP_TX feedback can reach the sender really
fast. One example of such an approach: https://youtu.be/BO0QhaxBRr0

This is one example of how XDP can allow us to do stuff that was not
possible before...

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer