* [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA [not found] <20230629120226.14854-1-linyunsheng@huawei.com> @ 2023-06-29 12:02 ` Yunsheng Lin 2023-07-07 23:59 ` Jakub Kicinski 2023-07-08 0:01 ` Jakub Kicinski 2023-06-29 12:02 ` [PATCH v5 RFC 4/6] page_pool: remove PP_FLAG_PAGE_FRAG flag Yunsheng Lin 1 sibling, 2 replies; 16+ messages in thread From: Yunsheng Lin @ 2023-06-29 12:02 UTC (permalink / raw) To: davem, kuba, pabeni Cc: netdev, linux-kernel, Yunsheng Lin, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma Currently page_pool_alloc_frag() is not supported in 32-bit arch with 64-bit DMA because of the overlap issue between pp_frag_count and dma_addr_upper in 'struct page' for those arches, which seems to be quite common, see [1], which means driver may need to handle it when using frag API. In order to simplify the driver's work when using frag API this patch allows page_pool_alloc_frag() to call page_pool_alloc_pages() to return pages for those arches. mlx5 calls page_pool_create() with PP_FLAG_PAGE_FRAG and is not using the frag API, and the PP_FLAG_PAGE_FRAG checking will now be removed in this patch, a new checking for the mlx5 driver is needed to retain the old behavior. As we have not come up with a better name yet, so PAGE_POOL_DMA_USE_PP_FRAG_COUNT is reused here. Note that it may aggravate truesize underestimate problem for skb as there is no page splitting for those pages, if driver need a accurate truesize, it may calculate that according to frag size, page order and PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true or not. And we may provide a helper for that if it turns out to be helpful. 1. https://lore.kernel.org/all/20211117075652.58299-1-linyunsheng@huawei.com/ Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Alexander Lobakin <aleksander.lobakin@intel.com> --- .../net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++ include/net/page_pool.h | 45 ++++++++++++++++--- net/core/page_pool.c | 18 ++------ 3 files changed, 52 insertions(+), 21 deletions(-) diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index a7c526ee5024..42ccb77a1b6c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -832,6 +832,16 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params, /* Create a page_pool and register it with rxq */ struct page_pool_params pp_params = { 0 }; + /* Return error here to avoid mlx5e_page_release_fragmented() + * calling page_pool_defrag_page() to write to pp_frag_count + * which is overlapped with dma_addr_upper in 'struct page' for + * arch with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true. + */ + if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) { + err = -EINVAL; + goto err_free_by_rq_type; + } + pp_params.order = 0; pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | PP_FLAG_PAGE_FRAG; pp_params.pool_size = pool_size; diff --git a/include/net/page_pool.h b/include/net/page_pool.h index 126f9e294389..83bd13491105 100644 --- a/include/net/page_pool.h +++ b/include/net/page_pool.h @@ -32,7 +32,7 @@ #include <linux/mm.h> /* Needed by ptr_ring */ #include <linux/ptr_ring.h> -#include <linux/dma-direction.h> +#include <linux/dma-mapping.h> #define PP_FLAG_DMA_MAP BIT(0) /* Should page_pool do the DMA * map/unmap @@ -50,6 +50,9 @@ PP_FLAG_DMA_SYNC_DEV |\ PP_FLAG_PAGE_FRAG) +#define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \ + (sizeof(dma_addr_t) > sizeof(unsigned long)) + /* * Fast allocation side cache array/stack * @@ -219,8 +222,33 @@ static inline struct page *page_pool_dev_alloc_pages(struct page_pool *pool) return page_pool_alloc_pages(pool, gfp); } -struct page *page_pool_alloc_frag(struct page_pool *pool, unsigned int *offset, - unsigned int size, gfp_t gfp); +struct page *__page_pool_alloc_frag(struct page_pool *pool, + unsigned int *offset, unsigned int size, + gfp_t gfp); + +static inline struct page *page_pool_alloc_frag(struct page_pool *pool, + unsigned int *offset, + unsigned int size, gfp_t gfp) +{ + unsigned int max_size = PAGE_SIZE << pool->p.order; + + size = ALIGN(size, dma_get_cache_alignment()); + + if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) || + size > max_size)) + return NULL; + + /* Don't allow page splitting and allocate one big frag + * for 32-bit arch with 64-bit DMA, corresponding to + * the checking in page_pool_is_last_frag(). + */ + if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) { + *offset = 0; + return page_pool_alloc_pages(pool, gfp); + } + + return __page_pool_alloc_frag(pool, offset, size, gfp); +} static inline struct page *page_pool_dev_alloc_frag(struct page_pool *pool, unsigned int *offset, @@ -322,8 +350,14 @@ static inline long page_pool_defrag_page(struct page *page, long nr) static inline bool page_pool_is_last_frag(struct page_pool *pool, struct page *page) { - /* If fragments aren't enabled or count is 0 we were the last user */ + /* We assume we are the last frag user that is still holding + * on to the page if: + * 1. Fragments aren't enabled. + * 2. We are running in 32-bit arch with 64-bit DMA. + * 3. page_pool_defrag_page() indicate we are the last user. + */ return !(pool->p.flags & PP_FLAG_PAGE_FRAG) || + PAGE_POOL_DMA_USE_PP_FRAG_COUNT || (page_pool_defrag_page(page, 1) == 0); } @@ -357,9 +391,6 @@ static inline void page_pool_recycle_direct(struct page_pool *pool, page_pool_put_full_page(pool, page, true); } -#define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \ - (sizeof(dma_addr_t) > sizeof(unsigned long)) - static inline dma_addr_t page_pool_get_dma_addr(struct page *page) { dma_addr_t ret = page->dma_addr; diff --git a/net/core/page_pool.c b/net/core/page_pool.c index a3e12a61d456..9c4118c62997 100644 --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -14,7 +14,6 @@ #include <net/xdp.h> #include <linux/dma-direction.h> -#include <linux/dma-mapping.h> #include <linux/page-flags.h> #include <linux/mm.h> /* for put_page() */ #include <linux/poison.h> @@ -200,10 +199,6 @@ static int page_pool_init(struct page_pool *pool, */ } - if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT && - pool->p.flags & PP_FLAG_PAGE_FRAG) - return -EINVAL; - #ifdef CONFIG_PAGE_POOL_STATS pool->recycle_stats = alloc_percpu(struct page_pool_recycle_stats); if (!pool->recycle_stats) @@ -715,18 +710,13 @@ static void page_pool_free_frag(struct page_pool *pool) page_pool_return_page(pool, page); } -struct page *page_pool_alloc_frag(struct page_pool *pool, - unsigned int *offset, - unsigned int size, gfp_t gfp) +struct page *__page_pool_alloc_frag(struct page_pool *pool, + unsigned int *offset, + unsigned int size, gfp_t gfp) { unsigned int max_size = PAGE_SIZE << pool->p.order; struct page *page = pool->frag_page; - if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) || - size > max_size)) - return NULL; - - size = ALIGN(size, dma_get_cache_alignment()); *offset = pool->frag_offset; if (page && *offset + size > max_size) { @@ -759,7 +749,7 @@ struct page *page_pool_alloc_frag(struct page_pool *pool, alloc_stat_inc(pool, fast); return page; } -EXPORT_SYMBOL(page_pool_alloc_frag); +EXPORT_SYMBOL(__page_pool_alloc_frag); static void page_pool_empty_ring(struct page_pool *pool) { -- 2.33.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-06-29 12:02 ` [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA Yunsheng Lin @ 2023-07-07 23:59 ` Jakub Kicinski 2023-07-09 12:39 ` Yunsheng Lin 2023-07-08 0:01 ` Jakub Kicinski 1 sibling, 1 reply; 16+ messages in thread From: Jakub Kicinski @ 2023-07-07 23:59 UTC (permalink / raw) To: Yunsheng Lin Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Thu, 29 Jun 2023 20:02:21 +0800 Yunsheng Lin wrote: > + /* Return error here to avoid mlx5e_page_release_fragmented() > + * calling page_pool_defrag_page() to write to pp_frag_count > + * which is overlapped with dma_addr_upper in 'struct page' for > + * arch with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true. > + */ > + if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) { > + err = -EINVAL; > + goto err_free_by_rq_type; > + } I told you not to do this in a comment on v4. Keep the flag in page pool params and let the creation fail. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-07 23:59 ` Jakub Kicinski @ 2023-07-09 12:39 ` Yunsheng Lin 2023-07-10 18:36 ` Jakub Kicinski 0 siblings, 1 reply; 16+ messages in thread From: Yunsheng Lin @ 2023-07-09 12:39 UTC (permalink / raw) To: Jakub Kicinski, Yunsheng Lin Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On 2023/7/8 7:59, Jakub Kicinski wrote: > On Thu, 29 Jun 2023 20:02:21 +0800 Yunsheng Lin wrote: >> + /* Return error here to avoid mlx5e_page_release_fragmented() >> + * calling page_pool_defrag_page() to write to pp_frag_count >> + * which is overlapped with dma_addr_upper in 'struct page' for >> + * arch with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true. >> + */ >> + if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) { >> + err = -EINVAL; >> + goto err_free_by_rq_type; >> + } > > I told you not to do this in a comment on v4. > Keep the flag in page pool params and let the creation fail. There seems to be naming disagreement in the previous discussion, The simplest way seems to be reuse the PAGE_POOL_DMA_USE_PP_FRAG_COUNT and do the checking in the driver without introducing new macro or changing macro name. Let's be more specific about what is your suggestion here: Do you mean keep the PP_FLAG_PAGE_FRAG flag and keep the below checking in page_pool_init(), right? if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT && pool->p.flags & PP_FLAG_PAGE_FRAG) return -EINVAL; Isn't it confusing to still say page frag is not supported for PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true case when this patch will now add support for it, at least from API POV, the page_pool_alloc_frag() is always supported now. Maybe remove the PP_FLAG_PAGE_FRAG and add a new macro named PP_FLAG_PAGE_SPLIT_IN_DRIVER, and do the checking as before in page_pool_init() like below? if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT && pool->p.flags & PP_FLAG_PAGE_SPLIT_IN_DRIVER) return -EINVAL; Or any better suggestion? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-09 12:39 ` Yunsheng Lin @ 2023-07-10 18:36 ` Jakub Kicinski 0 siblings, 0 replies; 16+ messages in thread From: Jakub Kicinski @ 2023-07-10 18:36 UTC (permalink / raw) To: Yunsheng Lin Cc: Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Sun, 9 Jul 2023 20:39:45 +0800 Yunsheng Lin wrote: > On 2023/7/8 7:59, Jakub Kicinski wrote: > > On Thu, 29 Jun 2023 20:02:21 +0800 Yunsheng Lin wrote: > >> + /* Return error here to avoid mlx5e_page_release_fragmented() > >> + * calling page_pool_defrag_page() to write to pp_frag_count > >> + * which is overlapped with dma_addr_upper in 'struct page' for > >> + * arch with PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true. > >> + */ > >> + if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT) { > >> + err = -EINVAL; > >> + goto err_free_by_rq_type; > >> + } > > > > I told you not to do this in a comment on v4. > > Keep the flag in page pool params and let the creation fail. > > There seems to be naming disagreement in the previous discussion, > The simplest way seems to be reuse the > PAGE_POOL_DMA_USE_PP_FRAG_COUNT and do the checking in the driver > without introducing new macro or changing macro name. > > Let's be more specific about what is your suggestion here: > Do you mean keep the PP_FLAG_PAGE_FRAG flag and keep the below > checking in page_pool_init(), right? > if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT && > pool->p.flags & PP_FLAG_PAGE_FRAG) > return -EINVAL; > > Isn't it confusing to still say page frag is not supported > for PAGE_POOL_DMA_USE_PP_FRAG_COUNT being true case when this > patch will now add support for it, at least from API POV, the > page_pool_alloc_frag() is always supported now. I don't mind what the flag is called, I just want the check to stay inside the page_pool code, acting on driver info passed inside pp_params. > Maybe remove the PP_FLAG_PAGE_FRAG and add a new macro named > PP_FLAG_PAGE_SPLIT_IN_DRIVER, and do the checking as before in > page_pool_init() like below? > if (PAGE_POOL_DMA_USE_PP_FRAG_COUNT && > pool->p.flags & PP_FLAG_PAGE_SPLIT_IN_DRIVER) > return -EINVAL; Yup, that sound good. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-06-29 12:02 ` [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA Yunsheng Lin 2023-07-07 23:59 ` Jakub Kicinski @ 2023-07-08 0:01 ` Jakub Kicinski 2023-07-09 12:54 ` Yunsheng Lin 1 sibling, 1 reply; 16+ messages in thread From: Jakub Kicinski @ 2023-07-08 0:01 UTC (permalink / raw) To: Yunsheng Lin Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Thu, 29 Jun 2023 20:02:21 +0800 Yunsheng Lin wrote: > -#include <linux/dma-direction.h> > +#include <linux/dma-mapping.h> And the include is still here, too, eh.. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-08 0:01 ` Jakub Kicinski @ 2023-07-09 12:54 ` Yunsheng Lin 2023-07-10 18:38 ` Jakub Kicinski 0 siblings, 1 reply; 16+ messages in thread From: Yunsheng Lin @ 2023-07-09 12:54 UTC (permalink / raw) To: Jakub Kicinski, Yunsheng Lin Cc: davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On 2023/7/8 8:01, Jakub Kicinski wrote: > On Thu, 29 Jun 2023 20:02:21 +0800 Yunsheng Lin wrote: >> -#include <linux/dma-direction.h> >> +#include <linux/dma-mapping.h> > > And the include is still here, too, eh.. In V4, it has: --- a/include/net/page_pool.h +++ b/include/net/page_pool.h @@ -33,6 +33,7 @@ #include <linux/mm.h> /* Needed by ptr_ring */ #include <linux/ptr_ring.h> #include <linux/dma-direction.h> +#include <linux/dma-mapping.h> As dma_get_cache_alignment() defined in dma-mapping.h is used here, so we need to include dma-mapping.h. I though the agreement is that this patch only remove the "#include <linux/dma-direction.h>" as we dma-mapping.h has included dma-direction.h. And Alexander will work on excluding page_pool.h from skbuff.h https://lore.kernel.org/all/09842498-b3ba-320d-be8d-348b85e8d525@intel.com/ Did I miss something obvious here? Or there is better way to do it than the method discussed in the above thread? > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-09 12:54 ` Yunsheng Lin @ 2023-07-10 18:38 ` Jakub Kicinski 2023-07-11 10:59 ` Alexander Lobakin 0 siblings, 1 reply; 16+ messages in thread From: Jakub Kicinski @ 2023-07-10 18:38 UTC (permalink / raw) To: Yunsheng Lin, Alexander Lobakin Cc: Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Sun, 9 Jul 2023 20:54:12 +0800 Yunsheng Lin wrote: > > And the include is still here, too, eh.. > > In V4, it has: > > --- a/include/net/page_pool.h > +++ b/include/net/page_pool.h > @@ -33,6 +33,7 @@ > #include <linux/mm.h> /* Needed by ptr_ring */ > #include <linux/ptr_ring.h> > #include <linux/dma-direction.h> > +#include <linux/dma-mapping.h> > > As dma_get_cache_alignment() defined in dma-mapping.h is used > here, so we need to include dma-mapping.h. > > I though the agreement is that this patch only remove the > "#include <linux/dma-direction.h>" as we dma-mapping.h has included > dma-direction.h. > > And Alexander will work on excluding page_pool.h from skbuff.h > https://lore.kernel.org/all/09842498-b3ba-320d-be8d-348b85e8d525@intel.com/ > > Did I miss something obvious here? Or there is better way to do it > than the method discussed in the above thread? We're adding a ton of static inline functions to what is a fairly core header for networking, that's what re-triggered by complaint: include/net/page_pool.h | 179 ++++++++++++++---- Maybe we should revisit the idea of creating a new header file for inline helpers... Olek, WDYT? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-10 18:38 ` Jakub Kicinski @ 2023-07-11 10:59 ` Alexander Lobakin 2023-07-11 16:37 ` Jakub Kicinski 0 siblings, 1 reply; 16+ messages in thread From: Alexander Lobakin @ 2023-07-11 10:59 UTC (permalink / raw) To: Jakub Kicinski, Yunsheng Lin Cc: Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma From: Jakub Kicinski <kuba@kernel.org> Date: Mon, 10 Jul 2023 11:38:41 -0700 > On Sun, 9 Jul 2023 20:54:12 +0800 Yunsheng Lin wrote: >>> And the include is still here, too, eh.. >> >> In V4, it has: >> >> --- a/include/net/page_pool.h >> +++ b/include/net/page_pool.h >> @@ -33,6 +33,7 @@ >> #include <linux/mm.h> /* Needed by ptr_ring */ >> #include <linux/ptr_ring.h> >> #include <linux/dma-direction.h> >> +#include <linux/dma-mapping.h> >> >> As dma_get_cache_alignment() defined in dma-mapping.h is used >> here, so we need to include dma-mapping.h. >> >> I though the agreement is that this patch only remove the >> "#include <linux/dma-direction.h>" as we dma-mapping.h has included >> dma-direction.h. >> >> And Alexander will work on excluding page_pool.h from skbuff.h >> https://lore.kernel.org/all/09842498-b3ba-320d-be8d-348b85e8d525@intel.com/ >> >> Did I miss something obvious here? Or there is better way to do it >> than the method discussed in the above thread? > > We're adding a ton of static inline functions to what is a fairly core > header for networking, that's what re-triggered by complaint: > > include/net/page_pool.h | 179 ++++++++++++++---- > > Maybe we should revisit the idea of creating a new header file for > inline helpers... Olek, WDYT? I'm fine with that, although ain't really able to work on this myself now :s (BTW I almost finished Netlink bigints, just some more libie/IAVF crap). It just needs to be carefully designed, because if we want move ALL the inlines to a new header, we may end up including 2 PP's headers in each file. That's why I'd prefer "core/driver" separation. Let's say skbuff.c doesn't need page_pool_create(), page_pool_alloc(), and so on, while drivers don't need some of its internal functions. OTOH after my patch it's included in only around 20-30 files on allmodconfig. That is literally nothing comparing to e.g. kernel.h (w/includes) :D Thanks, Olek ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-11 10:59 ` Alexander Lobakin @ 2023-07-11 16:37 ` Jakub Kicinski 2023-07-11 16:59 ` Alexander Lobakin 0 siblings, 1 reply; 16+ messages in thread From: Jakub Kicinski @ 2023-07-11 16:37 UTC (permalink / raw) To: Alexander Lobakin Cc: Yunsheng Lin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Tue, 11 Jul 2023 12:59:00 +0200 Alexander Lobakin wrote: > I'm fine with that, although ain't really able to work on this myself > now :s (BTW I almost finished Netlink bigints, just some more libie/IAVF > crap). FWIW I was thinking about the bigints recently, and from ynl perspective I think we may want two flavors :( One which is at most the length of platform's long long, and another which is always a bigint. The latter will be more work for user space to handle, so given 99% of use cases don't need more than 64b we should make its life easier? > It just needs to be carefully designed, because if we want move ALL the > inlines to a new header, we may end up including 2 PP's headers in each > file. That's why I'd prefer "core/driver" separation. Let's say skbuff.c > doesn't need page_pool_create(), page_pool_alloc(), and so on, while > drivers don't need some of its internal functions. > OTOH after my patch it's included in only around 20-30 files on > allmodconfig. That is literally nothing comparing to e.g. kernel.h > (w/includes) :D Well, once you have to rebuilding 100+ files it gets pretty hard to clean things up ;) I think I described the preferred setup, previously: $path/page_pool.h: #include <$path/page_pool/types.h> #include <$path/page_pool/helpers.h> $path/page_pool/types.h - has types $path/page_pool/helpers.h - has all the inlines C sources can include $path/page_pool.h, headers should generally only include $path/page_pool/types.h. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-11 16:37 ` Jakub Kicinski @ 2023-07-11 16:59 ` Alexander Lobakin 2023-07-11 20:09 ` Jakub Kicinski 2023-07-12 12:34 ` Yunsheng Lin 0 siblings, 2 replies; 16+ messages in thread From: Alexander Lobakin @ 2023-07-11 16:59 UTC (permalink / raw) To: Jakub Kicinski Cc: Yunsheng Lin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma From: Jakub Kicinski <kuba@kernel.org> Date: Tue, 11 Jul 2023 09:37:05 -0700 > On Tue, 11 Jul 2023 12:59:00 +0200 Alexander Lobakin wrote: >> I'm fine with that, although ain't really able to work on this myself >> now :s (BTW I almost finished Netlink bigints, just some more libie/IAVF >> crap). > > FWIW I was thinking about the bigints recently, and from ynl > perspective I think we may want two flavors :( One which is at > most the length of platform's long long, and another which is (not sure we shouldn't split a separate thread off this one at this point :D) `long long` or `long`? `long long` is always 64-bit unless I'm missing something. On my 32-bit MIPS they were :D If `long long`, what's the point then if we have %NLA_U64 and would still have to add dumb padding attrs? :D I thought the idea was to carry 64+ bits encapsulated in 32-bit primitives. > always a bigint. The latter will be more work for user space > to handle, so given 99% of use cases don't need more than 64b > we should make its life easier? > >> It just needs to be carefully designed, because if we want move ALL the >> inlines to a new header, we may end up including 2 PP's headers in each >> file. That's why I'd prefer "core/driver" separation. Let's say skbuff.c >> doesn't need page_pool_create(), page_pool_alloc(), and so on, while >> drivers don't need some of its internal functions. >> OTOH after my patch it's included in only around 20-30 files on >> allmodconfig. That is literally nothing comparing to e.g. kernel.h >> (w/includes) :D > > Well, once you have to rebuilding 100+ files it gets pretty hard to > clean things up ;) > > I think I described the preferred setup, previously: > > $path/page_pool.h: > > #include <$path/page_pool/types.h> > #include <$path/page_pool/helpers.h> > > $path/page_pool/types.h - has types > $path/page_pool/helpers.h - has all the inlines > > C sources can include $path/page_pool.h, headers should generally only > include $path/page_pool/types.h. Aaah okay, I did read it backwards ._. Moreover, generic stack barely uses PP's inlines, it needs externals mostly. Thanks, Olek ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-11 16:59 ` Alexander Lobakin @ 2023-07-11 20:09 ` Jakub Kicinski 2023-07-12 12:34 ` Yunsheng Lin 1 sibling, 0 replies; 16+ messages in thread From: Jakub Kicinski @ 2023-07-11 20:09 UTC (permalink / raw) To: Alexander Lobakin Cc: Yunsheng Lin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Tue, 11 Jul 2023 18:59:51 +0200 Alexander Lobakin wrote: > From: Jakub Kicinski <kuba@kernel.org> > Date: Tue, 11 Jul 2023 09:37:05 -0700 > > > On Tue, 11 Jul 2023 12:59:00 +0200 Alexander Lobakin wrote: > >> I'm fine with that, although ain't really able to work on this myself > >> now :s (BTW I almost finished Netlink bigints, just some more libie/IAVF > >> crap). > > > > FWIW I was thinking about the bigints recently, and from ynl > > perspective I think we may want two flavors :( One which is at > > most the length of platform's long long, and another which is > > `long long` or `long`? `long long` is always 64-bit unless I'm missing > something. On my 32-bit MIPS they were :D > If `long long`, what's the point then if we have %NLA_U64 and would > still have to add dumb padding attrs? :D I thought the idea was to carry > 64+ bits encapsulated in 32-bit primitives. Sorry I confused things. Keep in mind we're only talking about what the generated YNL code ends up looking like, not the "wire" format. So we still "transport" things as multiple 32b chunks at netlink level. No padding. The question is how to render the C / C++ code on the YNL side (or any practical library). Are we storing all those values as bigints and require users to coerce them to a more natural type on each access? That'd defeat the goal of the new int type becoming the default / "don't overthink the sizing" type. If we have a subtype with a max size of 64b, it can be 32b or 64b on the wire, as needed, but user space can feel assured that u64 will always be able to store the result. The long long is my misguided attempt to be platform dependent. I think a better way of putting it would actually be 2 * sizeof(long). That way we can use u128 as max, which seems to only be defined on 64b platforms. But that's just a random thought, I'm not sure how useful it would be. Perhaps we need two types, one "basic" which tops out at 64b and one "really bigint" which can be used as bitmaps as well? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-11 16:59 ` Alexander Lobakin 2023-07-11 20:09 ` Jakub Kicinski @ 2023-07-12 12:34 ` Yunsheng Lin 2023-07-12 17:26 ` Jakub Kicinski 1 sibling, 1 reply; 16+ messages in thread From: Yunsheng Lin @ 2023-07-12 12:34 UTC (permalink / raw) To: Alexander Lobakin, Jakub Kicinski Cc: Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On 2023/7/12 0:59, Alexander Lobakin wrote: > From: Jakub Kicinski <kuba@kernel.org> > Date: Tue, 11 Jul 2023 09:37:05 -0700 > >> On Tue, 11 Jul 2023 12:59:00 +0200 Alexander Lobakin wrote: >>> I'm fine with that, although ain't really able to work on this myself >>> now :s (BTW I almost finished Netlink bigints, just some more libie/IAVF >>> crap). >> >> FWIW I was thinking about the bigints recently, and from ynl >> perspective I think we may want two flavors :( One which is at >> most the length of platform's long long, and another which is > > (not sure we shouldn't split a separate thread off this one at this > point :D) > > `long long` or `long`? `long long` is always 64-bit unless I'm missing > something. On my 32-bit MIPS they were :D > If `long long`, what's the point then if we have %NLA_U64 and would > still have to add dumb padding attrs? :D I thought the idea was to carry > 64+ bits encapsulated in 32-bit primitives. > >> always a bigint. The latter will be more work for user space >> to handle, so given 99% of use cases don't need more than 64b >> we should make its life easier? >> >>> It just needs to be carefully designed, because if we want move ALL the >>> inlines to a new header, we may end up including 2 PP's headers in each >>> file. That's why I'd prefer "core/driver" separation. Let's say skbuff.c >>> doesn't need page_pool_create(), page_pool_alloc(), and so on, while >>> drivers don't need some of its internal functions. >>> OTOH after my patch it's included in only around 20-30 files on >>> allmodconfig. That is literally nothing comparing to e.g. kernel.h >>> (w/includes) :D >> >> Well, once you have to rebuilding 100+ files it gets pretty hard to >> clean things up ;) >> >> I think I described the preferred setup, previously: >> >> $path/page_pool.h: >> >> #include <$path/page_pool/types.h> >> #include <$path/page_pool/helpers.h> >> >> $path/page_pool/types.h - has types >> $path/page_pool/helpers.h - has all the inlines >> >> C sources can include $path/page_pool.h, headers should generally only >> include $path/page_pool/types.h. Does spliting the page_pool.h as above fix the problem about including a ton of static inline functions from "linux/dma-mapping.h" in skbuff.c? As the $path/page_pool/helpers.h which uses dma_get_cache_alignment() must include the "linux/dma-mapping.h" which has dma_get_cache_alignment() defining as a static inline function. and if skbuff.c include $path/page_pool.h or $path/page_pool/helpers.h, doesn't we still have the same problem? Or do I misunderstand something here? > > Aaah okay, I did read it backwards ._. Moreover, generic stack barely > uses PP's inlines, it needs externals mostly. > > Thanks, > Olek > > . > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-12 12:34 ` Yunsheng Lin @ 2023-07-12 17:26 ` Jakub Kicinski 2023-07-14 12:16 ` Yunsheng Lin 0 siblings, 1 reply; 16+ messages in thread From: Jakub Kicinski @ 2023-07-12 17:26 UTC (permalink / raw) To: Yunsheng Lin Cc: Alexander Lobakin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On Wed, 12 Jul 2023 20:34:12 +0800 Yunsheng Lin wrote: > >> C sources can include $path/page_pool.h, headers should generally only > >> include $path/page_pool/types.h. > > Does spliting the page_pool.h as above fix the problem about including > a ton of static inline functions from "linux/dma-mapping.h" in skbuff.c? > > As the $path/page_pool/helpers.h which uses dma_get_cache_alignment() > must include the "linux/dma-mapping.h" which has dma_get_cache_alignment() > defining as a static inline function. > and if skbuff.c include $path/page_pool.h or $path/page_pool/helpers.h, > doesn't we still have the same problem? Or do I misunderstand something > here? I should have clarified that "types.h" should also include pure function declarations (and possibly static line wrappers like pure get/set functions which only need locally defined types). The skbuff.h only needs to include $path/page_pool/types.h, right? I know that Olek has a plan to remove the skbuff dependency completely but functionally / for any future dependencies - this should work? ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-12 17:26 ` Jakub Kicinski @ 2023-07-14 12:16 ` Yunsheng Lin 2023-07-14 13:44 ` Jesper Dangaard Brouer 0 siblings, 1 reply; 16+ messages in thread From: Yunsheng Lin @ 2023-07-14 12:16 UTC (permalink / raw) To: Jakub Kicinski Cc: Alexander Lobakin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On 2023/7/13 1:26, Jakub Kicinski wrote: > On Wed, 12 Jul 2023 20:34:12 +0800 Yunsheng Lin wrote: >>>> C sources can include $path/page_pool.h, headers should generally only >>>> include $path/page_pool/types.h. >> >> Does spliting the page_pool.h as above fix the problem about including >> a ton of static inline functions from "linux/dma-mapping.h" in skbuff.c? >> >> As the $path/page_pool/helpers.h which uses dma_get_cache_alignment() >> must include the "linux/dma-mapping.h" which has dma_get_cache_alignment() >> defining as a static inline function. >> and if skbuff.c include $path/page_pool.h or $path/page_pool/helpers.h, >> doesn't we still have the same problem? Or do I misunderstand something >> here? > > I should have clarified that "types.h" should also include pure > function declarations (and possibly static line wrappers like > pure get/set functions which only need locally defined types). So "types.h" is not supposed/allowed to include any header and it can include any function declarations and static line wrappers which do not depend on any other header? It means we need to forward declaring a lot of 'struct' type for function declarations, right? If it is the case, the "types.h" does not seems to match it's naming when we can not really define most of the 'struct' in "types.h", such as 'struct page_pool' need to include some header in order to have definition of 'struct delayed_work'. Similar issue for 'helpers.h', as it will include most of the definition of 'struct', which are not really helpers, right? > > The skbuff.h only needs to include $path/page_pool/types.h, right? It seems doable, it need trying to prove it is indeed that case. > > I know that Olek has a plan to remove the skbuff dependency completely > but functionally / for any future dependencies - this should work? I am not experienced and confident enough to say about this for now. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA 2023-07-14 12:16 ` Yunsheng Lin @ 2023-07-14 13:44 ` Jesper Dangaard Brouer 0 siblings, 0 replies; 16+ messages in thread From: Jesper Dangaard Brouer @ 2023-07-14 13:44 UTC (permalink / raw) To: Yunsheng Lin, Jakub Kicinski Cc: brouer, Alexander Lobakin, Yunsheng Lin, davem, pabeni, netdev, linux-kernel, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Saeed Mahameed, Leon Romanovsky, Eric Dumazet, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma On 14/07/2023 14.16, Yunsheng Lin wrote: >> I know that Olek has a plan to remove the skbuff dependency completely >> but functionally / for any future dependencies - this should work? > > I am not experienced and confident enough to say about this for now. > A trick Eric once shared with me is this make command: make net/core/page_pool.i It will generate a file "net/core/page_pool.i" that kind of shows how the includes gets processed, I believe it is the C preprocess output. --Jesper ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v5 RFC 4/6] page_pool: remove PP_FLAG_PAGE_FRAG flag [not found] <20230629120226.14854-1-linyunsheng@huawei.com> 2023-06-29 12:02 ` [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA Yunsheng Lin @ 2023-06-29 12:02 ` Yunsheng Lin 1 sibling, 0 replies; 16+ messages in thread From: Yunsheng Lin @ 2023-06-29 12:02 UTC (permalink / raw) To: davem, kuba, pabeni Cc: netdev, linux-kernel, Yunsheng Lin, Lorenzo Bianconi, Alexander Duyck, Liang Chen, Alexander Lobakin, Yisen Zhuang, Salil Mehta, Eric Dumazet, Sunil Goutham, Geetha sowjanya, Subbaraya Sundeep, hariprasad, Saeed Mahameed, Leon Romanovsky, Felix Fietkau, Ryder Lee, Shayne Chen, Sean Wang, Kalle Valo, Matthias Brugger, AngeloGioacchino Del Regno, Jesper Dangaard Brouer, Ilias Apalodimas, linux-rdma, linux-wireless, linux-arm-kernel, linux-mediatek PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count handling is unified and page_pool_alloc_frag() is supported in 32-bit arch with 64-bit DMA, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Alexander Lobakin <aleksander.lobakin@intel.com> --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 3 +-- drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c | 2 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 2 +- drivers/net/wireless/mediatek/mt76/mac80211.c | 2 +- include/net/page_pool.h | 7 ++----- net/core/skbuff.c | 2 +- 6 files changed, 7 insertions(+), 11 deletions(-) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index b676496ec6d7..4e613d5bf1fd 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -4925,8 +4925,7 @@ static void hns3_put_ring_config(struct hns3_nic_priv *priv) static void hns3_alloc_page_pool(struct hns3_enet_ring *ring) { struct page_pool_params pp_params = { - .flags = PP_FLAG_DMA_MAP | PP_FLAG_PAGE_FRAG | - PP_FLAG_DMA_SYNC_DEV, + .flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV, .order = hns3_page_order(ring), .pool_size = ring->desc_num * hns3_buf_size(ring) / (PAGE_SIZE << hns3_page_order(ring)), diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c index a79cb680bb23..404caec467af 100644 --- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c +++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c @@ -1426,7 +1426,7 @@ int otx2_pool_init(struct otx2_nic *pfvf, u16 pool_id, return 0; } - pp_params.flags = PP_FLAG_PAGE_FRAG | PP_FLAG_DMA_MAP; + pp_params.flags = PP_FLAG_DMA_MAP; pp_params.pool_size = numptrs; pp_params.nid = NUMA_NO_NODE; pp_params.dev = pfvf->dev; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 42ccb77a1b6c..cdd00848c0cb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -843,7 +843,7 @@ static int mlx5e_alloc_rq(struct mlx5e_params *params, } pp_params.order = 0; - pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV | PP_FLAG_PAGE_FRAG; + pp_params.flags = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV; pp_params.pool_size = pool_size; pp_params.nid = node; pp_params.dev = rq->pdev; diff --git a/drivers/net/wireless/mediatek/mt76/mac80211.c b/drivers/net/wireless/mediatek/mt76/mac80211.c index 467afef98ba2..ee72869e5572 100644 --- a/drivers/net/wireless/mediatek/mt76/mac80211.c +++ b/drivers/net/wireless/mediatek/mt76/mac80211.c @@ -566,7 +566,7 @@ int mt76_create_page_pool(struct mt76_dev *dev, struct mt76_queue *q) { struct page_pool_params pp_params = { .order = 0, - .flags = PP_FLAG_PAGE_FRAG, + .flags = 0, .nid = NUMA_NO_NODE, .dev = dev->dma_dev, }; diff --git a/include/net/page_pool.h b/include/net/page_pool.h index 53486ef9074d..e9fb95d62ed5 100644 --- a/include/net/page_pool.h +++ b/include/net/page_pool.h @@ -45,10 +45,8 @@ * Please note DMA-sync-for-CPU is still * device driver responsibility */ -#define PP_FLAG_PAGE_FRAG BIT(2) /* for page frag feature */ #define PP_FLAG_ALL (PP_FLAG_DMA_MAP |\ - PP_FLAG_DMA_SYNC_DEV |\ - PP_FLAG_PAGE_FRAG) + PP_FLAG_DMA_SYNC_DEV) #define PAGE_POOL_DMA_USE_PP_FRAG_COUNT \ (sizeof(dma_addr_t) > sizeof(unsigned long)) @@ -234,8 +232,7 @@ static inline struct page *page_pool_alloc_frag(struct page_pool *pool, size = ALIGN(size, dma_get_cache_alignment()); - if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) || - size > max_size)) + if (WARN_ON(size > max_size)) return NULL; /* Don't allow page splitting and allocate one big frag diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 7c4338221b17..ca2316cc1e7e 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -5652,7 +5652,7 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from, /* In general, avoid mixing page_pool and non-page_pool allocated * pages within the same SKB. Additionally avoid dealing with clones * with page_pool pages, in case the SKB is using page_pool fragment - * references (PP_FLAG_PAGE_FRAG). Since we only take full page + * references (page_pool_alloc_frag()). Since we only take full page * references for cloned SKBs at the moment that would result in * inconsistent reference counts. * In theory we could take full references if @from is cloned and -- 2.33.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
end of thread, other threads:[~2023-07-14 13:45 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20230629120226.14854-1-linyunsheng@huawei.com>
2023-06-29 12:02 ` [PATCH v5 RFC 1/6] page_pool: frag API support for 32-bit arch with 64-bit DMA Yunsheng Lin
2023-07-07 23:59 ` Jakub Kicinski
2023-07-09 12:39 ` Yunsheng Lin
2023-07-10 18:36 ` Jakub Kicinski
2023-07-08 0:01 ` Jakub Kicinski
2023-07-09 12:54 ` Yunsheng Lin
2023-07-10 18:38 ` Jakub Kicinski
2023-07-11 10:59 ` Alexander Lobakin
2023-07-11 16:37 ` Jakub Kicinski
2023-07-11 16:59 ` Alexander Lobakin
2023-07-11 20:09 ` Jakub Kicinski
2023-07-12 12:34 ` Yunsheng Lin
2023-07-12 17:26 ` Jakub Kicinski
2023-07-14 12:16 ` Yunsheng Lin
2023-07-14 13:44 ` Jesper Dangaard Brouer
2023-06-29 12:02 ` [PATCH v5 RFC 4/6] page_pool: remove PP_FLAG_PAGE_FRAG flag Yunsheng Lin
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox