From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qc0-f171.google.com (mail-qc0-f171.google.com [209.85.216.171]) by kanga.kvack.org (Postfix) with ESMTP id 73E196B0038 for ; Thu, 7 May 2015 00:11:44 -0400 (EDT) Received: by qctx5 with SMTP id x5so15576833qct.0 for ; Wed, 06 May 2015 21:11:44 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id g31si878561qkh.66.2015.05.06.21.11.42 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:11:43 -0700 (PDT) Subject: [PATCH 00/10] Refactor netdev page frags and move them into mm/ From: Alexander Duyck Date: Wed, 06 May 2015 21:11:34 -0700 Message-ID: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com This patch series addresses several things. First I found an issue in the performance of the pfmemalloc check from build_skb. To work around it I have provided a cached copy of pfmemalloc to be used in __netdev_alloc_skb and __napi_alloc_skb. Second I moved the page fragment allocation logic into the mm tree and added functionality for freeing page fragments. I had to fix igb before I could do this as it was using a reference to NETDEV_FRAG_PAGE_MAX_SIZE incorrectly. Finally I went through and replaced all of the duplicate code that was calling put_page and replaced it with calls to skb_free_frag. With these changes in place a simple receive and drop test increased from a packet rate of 8.9Mpps to 9.8Mpps. The gains breakdown as follows: 8.9Mpps Before 9.8Mpps After ------------------------ ------------------------ 7.8% put_compound_page 9.1% __free_page_frag 3.9% skb_free_head 1.1% put_page 4.9% build_skb 3.8% __napi_alloc_skb 2.5% __alloc_rx_skb 1.9% __napi_alloc_skb --- Alexander Duyck (10): net: Use cached copy of pfmemalloc to avoid accessing page igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation net: Store virtual address instead of page in netdev_alloc_cache mm/net: Rename and move page fragment handling from net/ to mm/ net: Add skb_free_frag to replace use of put_page in freeing skb->head netcp: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag mvneta: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag e1000: Replace e1000_free_frag with skb_free_frag hisilicon: Replace put_page(virt_to_head_page()) with skb_free_frag() bnx2x, tg3: Replace put_page(virt_to_head_page()) with skb_free_frag() drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 drivers/net/ethernet/broadcom/tg3.c | 2 drivers/net/ethernet/hisilicon/hip04_eth.c | 2 drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +- drivers/net/ethernet/intel/igb/igb_main.c | 11 - drivers/net/ethernet/marvell/mvneta.c | 2 drivers/net/ethernet/ti/netcp_core.c | 2 include/linux/gfp.h | 5 + include/linux/mm_types.h | 18 ++ include/linux/skbuff.h | 9 + mm/page_alloc.c | 98 ++++++++++ net/core/skbuff.c | 224 ++++++++--------------- 12 files changed, 223 insertions(+), 171 deletions(-) -- -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f49.google.com (mail-wg0-f49.google.com [74.125.82.49]) by kanga.kvack.org (Postfix) with ESMTP id 82DB86B006C for ; Thu, 7 May 2015 00:11:48 -0400 (EDT) Received: by wgyo15 with SMTP id o15so30962024wgy.2 for ; Wed, 06 May 2015 21:11:48 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id co3si1239042wjb.161.2015.05.06.21.11.45 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:11:46 -0700 (PDT) Subject: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page From: Alexander Duyck Date: Wed, 06 May 2015 21:11:40 -0700 Message-ID: <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com While testing I found that the testing for pfmemalloc in build_skb was rather expensive. I found the issue to be two-fold. First we have to get from the virtual address to the head page and that comes at the cost of something like 11 cycles. Then there is the cost for reading pfmemalloc out of the head page which can be cache cold due to the fact that put_page_testzero is likely invalidating the cache-line on one or more CPUs as the fragments can be shared. To avoid this extra expense I have added a pfmemalloc member to the netdev_alloc_cache. I then pushed pieces of __alloc_rx_skb into __napi_alloc_skb and __netdev_alloc_skb so that I could rewrite them to make use of the cached pfmemalloc value. The result is that my perf traces show a reduction from 9.28% overhead to 3.7% for the code covered by build_skb, __alloc_rx_skb, and __napi_alloc_skb when performing a test with the packet being dropped instead of being handed to napi_gro_receive. Signed-off-by: Alexander Duyck --- net/core/skbuff.c | 141 ++++++++++++++++++++++++++++++----------------------- 1 file changed, 79 insertions(+), 62 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index b9eb90b39ac7..d6851ca32598 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -353,6 +353,7 @@ struct netdev_alloc_cache { * containing page->_count every time we allocate a fragment. */ unsigned int pagecnt_bias; + bool pfmemalloc; }; static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache); static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); @@ -379,10 +380,9 @@ static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, return page; } -static void *__alloc_page_frag(struct netdev_alloc_cache __percpu *cache, +static void *__alloc_page_frag(struct netdev_alloc_cache *nc, unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(cache); struct page *page = nc->frag.page; unsigned int size; int offset; @@ -402,6 +402,7 @@ refill: atomic_add(size - 1, &page->_count); /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page->pfmemalloc; nc->pagecnt_bias = size; nc->frag.offset = size; } @@ -430,11 +431,13 @@ refill: static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { + struct netdev_alloc_cache *nc; unsigned long flags; void *data; local_irq_save(flags); - data = __alloc_page_frag(&netdev_alloc_cache, fragsz, gfp_mask); + nc = this_cpu_ptr(&netdev_alloc_cache); + data = __alloc_page_frag(nc, fragsz, gfp_mask); local_irq_restore(flags); return data; } @@ -454,7 +457,9 @@ EXPORT_SYMBOL(netdev_alloc_frag); static void *__napi_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - return __alloc_page_frag(&napi_alloc_cache, fragsz, gfp_mask); + struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + + return __alloc_page_frag(nc, fragsz, gfp_mask); } void *napi_alloc_frag(unsigned int fragsz) @@ -464,76 +469,64 @@ void *napi_alloc_frag(unsigned int fragsz) EXPORT_SYMBOL(napi_alloc_frag); /** - * __alloc_rx_skb - allocate an skbuff for rx + * __netdev_alloc_skb - allocate an skbuff for rx on a specific device + * @dev: network device to receive on * @length: length to allocate * @gfp_mask: get_free_pages mask, passed to alloc_skb - * @flags: If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for - * allocations in case we have to fallback to __alloc_skb() - * If SKB_ALLOC_NAPI is set, page fragment will be allocated - * from napi_cache instead of netdev_cache. * * Allocate a new &sk_buff and assign it a usage count of one. The - * buffer has unspecified headroom built in. Users should allocate + * buffer has NET_SKB_PAD headroom built in. Users should allocate * the headroom they think they need without accounting for the * built in space. The built in space is used for optimisations. * * %NULL is returned if there is no free memory. */ -static struct sk_buff *__alloc_rx_skb(unsigned int length, gfp_t gfp_mask, - int flags) +struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, + gfp_t gfp_mask) { - struct sk_buff *skb = NULL; - unsigned int fragsz = SKB_DATA_ALIGN(length) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + struct netdev_alloc_cache *nc; + unsigned long flags; + struct sk_buff *skb; + bool pfmemalloc; + void *data; - if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) { - void *data; + len += NET_SKB_PAD; - if (sk_memalloc_socks()) - gfp_mask |= __GFP_MEMALLOC; + if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) || + (gfp_mask & (__GFP_WAIT | GFP_DMA))) + return __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); - data = (flags & SKB_ALLOC_NAPI) ? - __napi_alloc_frag(fragsz, gfp_mask) : - __netdev_alloc_frag(fragsz, gfp_mask); + len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + len = SKB_DATA_ALIGN(len); - if (likely(data)) { - skb = build_skb(data, fragsz); - if (unlikely(!skb)) - put_page(virt_to_head_page(data)); - } - } else { - skb = __alloc_skb(length, gfp_mask, - SKB_ALLOC_RX, NUMA_NO_NODE); - } - return skb; -} + if (sk_memalloc_socks()) + gfp_mask |= __GFP_MEMALLOC; -/** - * __netdev_alloc_skb - allocate an skbuff for rx on a specific device - * @dev: network device to receive on - * @length: length to allocate - * @gfp_mask: get_free_pages mask, passed to alloc_skb - * - * Allocate a new &sk_buff and assign it a usage count of one. The - * buffer has NET_SKB_PAD headroom built in. Users should allocate - * the headroom they think they need without accounting for the - * built in space. The built in space is used for optimisations. - * - * %NULL is returned if there is no free memory. - */ -struct sk_buff *__netdev_alloc_skb(struct net_device *dev, - unsigned int length, gfp_t gfp_mask) -{ - struct sk_buff *skb; + local_irq_save(flags); - length += NET_SKB_PAD; - skb = __alloc_rx_skb(length, gfp_mask, 0); + nc = this_cpu_ptr(&netdev_alloc_cache); + data = __alloc_page_frag(nc, len, gfp_mask); + pfmemalloc = nc->pfmemalloc; - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD); - skb->dev = dev; + local_irq_restore(flags); + + if (unlikely(!data)) + return NULL; + + skb = __build_skb(data, len); + if (unlikely(!skb)) { + put_page(virt_to_head_page(data)); + return NULL; } + /* use OR instead of assignment to avoid clearing of bits in mask */ + if (pfmemalloc) + skb->pfmemalloc = 1; + skb->head_frag = 1; + + skb_reserve(skb, NET_SKB_PAD); + skb->dev = dev; + return skb; } EXPORT_SYMBOL(__netdev_alloc_skb); @@ -551,19 +544,43 @@ EXPORT_SYMBOL(__netdev_alloc_skb); * * %NULL is returned if there is no free memory. */ -struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, - unsigned int length, gfp_t gfp_mask) +struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, + gfp_t gfp_mask) { + struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); struct sk_buff *skb; + void *data; + + len += NET_SKB_PAD + NET_IP_ALIGN; + + if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) || + (gfp_mask & (__GFP_WAIT | GFP_DMA))) + return __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); + + len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + len = SKB_DATA_ALIGN(len); - length += NET_SKB_PAD + NET_IP_ALIGN; - skb = __alloc_rx_skb(length, gfp_mask, SKB_ALLOC_NAPI); + if (sk_memalloc_socks()) + gfp_mask |= __GFP_MEMALLOC; + + data = __alloc_page_frag(nc, len, gfp_mask); + if (unlikely(!data)) + return NULL; - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); - skb->dev = napi->dev; + skb = __build_skb(data, len); + if (unlikely(!skb)) { + put_page(virt_to_head_page(data)); + return NULL; } + /* use OR instead of assignment to avoid clearing of bits in mask */ + if (nc->pfmemalloc) + skb->pfmemalloc = 1; + skb->head_frag = 1; + + skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); + skb->dev = napi->dev; + return skb; } EXPORT_SYMBOL(__napi_alloc_skb); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f44.google.com (mail-qg0-f44.google.com [209.85.192.44]) by kanga.kvack.org (Postfix) with ESMTP id 07BBA6B0070 for ; Thu, 7 May 2015 00:11:49 -0400 (EDT) Received: by qgfi89 with SMTP id i89so15380298qgf.1 for ; Wed, 06 May 2015 21:11:48 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id u5si856148qgu.115.2015.05.06.21.11.47 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:11:48 -0700 (PDT) Subject: [PATCH 02/10] igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation From: Alexander Duyck Date: Wed, 06 May 2015 21:11:45 -0700 Message-ID: <20150507041145.1873.66455.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com This change updates igb so that it will correctly perform the descriptor count calculation. Previously it was taking NETDEV_FRAG_PAGE_MAX_SIZE into account with isn't really correct since a different value is used to determine the size of the pages used for TCP. That is actually determined by SKB_FRAG_PAGE_ORDER. Signed-off-by: Alexander Duyck --- drivers/net/ethernet/intel/igb/igb_main.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index d3b4098fce48..1394642e4859 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -4974,6 +4974,7 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb, struct igb_tx_buffer *first; int tso; u32 tx_flags = 0; + unsigned short f; u16 count = TXD_USE_COUNT(skb_headlen(skb)); __be16 protocol = vlan_get_protocol(skb); u8 hdr_len = 0; @@ -4984,14 +4985,8 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb, * + 1 desc for context descriptor, * otherwise try next time */ - if (NETDEV_FRAG_PAGE_MAX_SIZE > IGB_MAX_DATA_PER_TXD) { - unsigned short f; - - for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) - count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); - } else { - count += skb_shinfo(skb)->nr_frags; - } + for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) + count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); if (igb_maybe_stop_tx(tx_ring, count + 3)) { /* this is a hard error */ -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f46.google.com (mail-qg0-f46.google.com [209.85.192.46]) by kanga.kvack.org (Postfix) with ESMTP id 738AC6B0070 for ; Thu, 7 May 2015 00:11:54 -0400 (EDT) Received: by qgeb100 with SMTP id b100so15370858qge.3 for ; Wed, 06 May 2015 21:11:54 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id y39si874017qgy.79.2015.05.06.21.11.53 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:11:53 -0700 (PDT) Subject: [PATCH 03/10] net: Store virtual address instead of page in netdev_alloc_cache From: Alexander Duyck Date: Wed, 06 May 2015 21:11:51 -0700 Message-ID: <20150507041151.1873.2487.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com This change makes it so that we store the virtual address of the page in the netdev_alloc_cache instead of the page pointer. The idea behind this is to avoid multiple calls to page_address since the virtual address is required for every access, but the page pointer is only needed at allocation or reset of the page. While I was at it I also reordered the netdev_alloc_cache structure a bit so that the size is always 16 bytes by dropping size in the case where PAGE_SIZE is greater than or equal to 32KB. Signed-off-by: Alexander Duyck --- include/linux/skbuff.h | 5 ++-- net/core/skbuff.c | 55 ++++++++++++++++++++++++++++-------------------- 2 files changed, 34 insertions(+), 26 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 9c2f793573fa..8b9a2c35a9d7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2128,9 +2128,8 @@ static inline void __skb_queue_purge(struct sk_buff_head *list) kfree_skb(skb); } -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768) -#define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER) -#define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE +#define NETDEV_FRAG_PAGE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) +#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(NETDEV_FRAG_PAGE_MAX_SIZE) void *netdev_alloc_frag(unsigned int fragsz); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d6851ca32598..a3062ec341c3 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -348,7 +348,13 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size) EXPORT_SYMBOL(build_skb); struct netdev_alloc_cache { - struct page_frag frag; + void * va; +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + __u16 offset; + __u16 size; +#else + __u32 offset; +#endif /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_count every time we allocate a fragment. */ @@ -361,21 +367,20 @@ static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, gfp_t gfp_mask) { - const unsigned int order = NETDEV_FRAG_PAGE_MAX_ORDER; struct page *page = NULL; gfp_t gfp = gfp_mask; - if (order) { - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order); - nc->frag.size = PAGE_SIZE << (page ? order : 0); - } - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + NETDEV_FRAG_PAGE_MAX_ORDER); + nc->size = page ? NETDEV_FRAG_PAGE_MAX_SIZE : PAGE_SIZE; +#endif if (unlikely(!page)) page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - nc->frag.page = page; + nc->va = page ? page_address(page) : NULL; return page; } @@ -383,19 +388,20 @@ static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, static void *__alloc_page_frag(struct netdev_alloc_cache *nc, unsigned int fragsz, gfp_t gfp_mask) { - struct page *page = nc->frag.page; - unsigned int size; + unsigned int size = PAGE_SIZE; + struct page *page; int offset; - if (unlikely(!page)) { + if (unlikely(!nc->va)) { refill: page = __page_frag_refill(nc, gfp_mask); if (!page) return NULL; - /* if size can vary use frag.size else just use PAGE_SIZE */ - size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE; - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif /* Even if we own the page, we do not use atomic_set(). * This would break get_page_unless_zero() users. */ @@ -404,17 +410,20 @@ refill: /* reset page count bias and offset to start of new frag */ nc->pfmemalloc = page->pfmemalloc; nc->pagecnt_bias = size; - nc->frag.offset = size; + nc->offset = size; } - offset = nc->frag.offset - fragsz; + offset = nc->offset - fragsz; if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) goto refill; - /* if size can vary use frag.size else just use PAGE_SIZE */ - size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE; - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif /* OK, page count is 0, we can safely set it */ atomic_set(&page->_count, size); @@ -424,9 +433,9 @@ refill: } nc->pagecnt_bias--; - nc->frag.offset = offset; + nc->offset = offset; - return page_address(page) + offset; + return nc->va + offset; } static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-f173.google.com (mail-qk0-f173.google.com [209.85.220.173]) by kanga.kvack.org (Postfix) with ESMTP id 4E1796B0071 for ; Thu, 7 May 2015 00:12:00 -0400 (EDT) Received: by qkgx75 with SMTP id x75so20374480qkg.1 for ; Wed, 06 May 2015 21:12:00 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id 82si858667qhw.114.2015.05.06.21.11.58 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:11:59 -0700 (PDT) Subject: [PATCH 04/10] mm/net: Rename and move page fragment handling from net/ to mm/ From: Alexander Duyck Date: Wed, 06 May 2015 21:11:57 -0700 Message-ID: <20150507041157.1873.63151.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com This change moves the __alloc_page_frag functionality out of the networking stack and into the page allocation portion of mm. The idea it so help make this maintainable by placing it with other page allocation functions. Since we are moving it from skbuff.c to page_alloc.c I have also renamed the basic defines and structure from netdev_alloc_cache to page_frag_cache to reflect that this is now part of a different kernel subsystem. I have also added a simple __free_page_frag function which can handle freeing the frags based on the skb->head pointer. The model for this is based off of __free_pages since we don't actually need to deal with all of the cases that put_page handles. I incorporated the virt_to_head_page call and compound_order into the function as it actually allows for a signficant size reduction by reducing code duplication. Signed-off-by: Alexander Duyck --- include/linux/gfp.h | 5 ++ include/linux/mm_types.h | 18 ++++++++ include/linux/skbuff.h | 3 - mm/page_alloc.c | 98 +++++++++++++++++++++++++++++++++++++++++++++ net/core/skbuff.c | 100 +++------------------------------------------- 5 files changed, 127 insertions(+), 97 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 97a9373e61e8..70a7fee1efb3 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -366,6 +366,11 @@ extern void free_pages(unsigned long addr, unsigned int order); extern void free_hot_cold_page(struct page *page, bool cold); extern void free_hot_cold_page_list(struct list_head *list, bool cold); +struct page_frag_cache; +extern void *__alloc_page_frag(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask); +extern void __free_page_frag(void *addr); + extern void __free_kmem_pages(struct page *page, unsigned int order); extern void free_kmem_pages(unsigned long addr, unsigned int order); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8d37e26a1007..0038ac7466fd 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -226,6 +226,24 @@ struct page_frag { #endif }; +#define PAGE_FRAG_CACHE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) +#define PAGE_FRAG_CACHE_MAX_ORDER get_order(PAGE_FRAG_CACHE_MAX_SIZE) + +struct page_frag_cache { + void * va; +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + __u16 offset; + __u16 size; +#else + __u32 offset; +#endif + /* we maintain a pagecount bias, so that we dont dirty cache line + * containing page->_count every time we allocate a fragment. + */ + unsigned int pagecnt_bias; + bool pfmemalloc; +}; + typedef unsigned long __nocast vm_flags_t; /* diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 8b9a2c35a9d7..0039fcc45b3b 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2128,9 +2128,6 @@ static inline void __skb_queue_purge(struct sk_buff_head *list) kfree_skb(skb); } -#define NETDEV_FRAG_PAGE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(NETDEV_FRAG_PAGE_MAX_SIZE) - void *netdev_alloc_frag(unsigned int fragsz); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int length, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ebffa0e4a9c0..2fd31aebef30 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2967,6 +2967,104 @@ void free_pages(unsigned long addr, unsigned int order) EXPORT_SYMBOL(free_pages); /* + * Page Fragment: + * An arbitrary-length arbitrary-offset area of memory which resides + * within a 0 or higher order page. Multiple fragments within that page + * are individually refcounted, in the page's reference counter. + * + * The page_frag functions below provide a simple allocation framework for + * page fragments. This is used by the network stack and network device + * drivers to provide a backing region of memory for use as either an + * sk_buff->head, or to be used in the "frags" portion of skb_shared_info. + */ +static struct page *__page_frag_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) +{ + struct page *page = NULL; + gfp_t gfp = gfp_mask; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + PAGE_FRAG_CACHE_MAX_ORDER); + nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; +#endif + if (unlikely(!page)) + page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); + + nc->va = page ? page_address(page) : NULL; + + return page; +} + +void *__alloc_page_frag(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask) +{ + unsigned int size = PAGE_SIZE; + struct page *page; + int offset; + + if (unlikely(!nc->va)) { +refill: + page = __page_frag_refill(nc, gfp_mask); + if (!page) + return NULL; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* Even if we own the page, we do not use atomic_set(). + * This would break get_page_unless_zero() users. + */ + atomic_add(size - 1, &page->_count); + + /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page->pfmemalloc; + nc->pagecnt_bias = size; + nc->offset = size; + } + + offset = nc->offset - fragsz; + if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + + if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) + goto refill; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* OK, page count is 0, we can safely set it */ + atomic_set(&page->_count, size); + + /* reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = size; + offset = size - fragsz; + } + + nc->pagecnt_bias--; + nc->offset = offset; + + return nc->va + offset; +} +EXPORT_SYMBOL(__alloc_page_frag); + +/* + * Frees a page fragment allocated out of either a compound or order 0 page. + */ +void __free_page_frag(void *addr) +{ + struct page *page = virt_to_head_page(addr); + + if (unlikely(put_page_testzero(page))) + __free_pages_ok(page, compound_order(page)); +} +EXPORT_SYMBOL(__free_page_frag); + +/* * alloc_kmem_pages charges newly allocated pages to the kmem resource counter * of the current memory cgroup. * diff --git a/net/core/skbuff.c b/net/core/skbuff.c index a3062ec341c3..dcc0e07abf47 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -347,100 +347,12 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size) } EXPORT_SYMBOL(build_skb); -struct netdev_alloc_cache { - void * va; -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - __u16 offset; - __u16 size; -#else - __u32 offset; -#endif - /* we maintain a pagecount bias, so that we dont dirty cache line - * containing page->_count every time we allocate a fragment. - */ - unsigned int pagecnt_bias; - bool pfmemalloc; -}; -static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache); -static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); - -static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, - gfp_t gfp_mask) -{ - struct page *page = NULL; - gfp_t gfp = gfp_mask; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - NETDEV_FRAG_PAGE_MAX_ORDER); - nc->size = page ? NETDEV_FRAG_PAGE_MAX_SIZE : PAGE_SIZE; -#endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; - - return page; -} - -static void *__alloc_page_frag(struct netdev_alloc_cache *nc, - unsigned int fragsz, gfp_t gfp_mask) -{ - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; - - if (unlikely(!nc->va)) { -refill: - page = __page_frag_refill(nc, gfp_mask); - if (!page) - return NULL; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. - */ - atomic_add(size - 1, &page->_count); - - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page->pfmemalloc; - nc->pagecnt_bias = size; - nc->offset = size; - } - - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) - goto refill; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* OK, page count is 0, we can safely set it */ - atomic_set(&page->_count, size); - - /* reset page count bias and offset to start of new frag */ - nc->pagecnt_bias = size; - offset = size - fragsz; - } - - nc->pagecnt_bias--; - nc->offset = offset; - - return nc->va + offset; -} +static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); +static DEFINE_PER_CPU(struct page_frag_cache, napi_alloc_cache); static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc; + struct page_frag_cache *nc; unsigned long flags; void *data; @@ -466,7 +378,7 @@ EXPORT_SYMBOL(netdev_alloc_frag); static void *__napi_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + struct page_frag_cache *nc = this_cpu_ptr(&napi_alloc_cache); return __alloc_page_frag(nc, fragsz, gfp_mask); } @@ -493,7 +405,7 @@ EXPORT_SYMBOL(napi_alloc_frag); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc; + struct page_frag_cache *nc; unsigned long flags; struct sk_buff *skb; bool pfmemalloc; @@ -556,7 +468,7 @@ EXPORT_SYMBOL(__netdev_alloc_skb); struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + struct page_frag_cache *nc = this_cpu_ptr(&napi_alloc_cache); struct sk_buff *skb; void *data; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f54.google.com (mail-qg0-f54.google.com [209.85.192.54]) by kanga.kvack.org (Postfix) with ESMTP id A989E6B0072 for ; Thu, 7 May 2015 00:12:11 -0400 (EDT) Received: by qgeb100 with SMTP id b100so15372933qge.3 for ; Wed, 06 May 2015 21:12:11 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id e76si861033qka.106.2015.05.06.21.12.09 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:11 -0700 (PDT) Subject: [PATCH 05/10] net: Add skb_free_frag to replace use of put_page in freeing skb->head From: Alexander Duyck Date: Wed, 06 May 2015 21:12:03 -0700 Message-ID: <20150507041203.1873.67584.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com This change adds a function called skb_free_frag which is meant to compliment the function netdev_alloc_frag. The general idea is to enable a more lightweight version of page freeing since we don't actually need all the overhead of a put_page, and we don't quite fit the model of __free_pages. Signed-off-by: Alexander Duyck --- include/linux/skbuff.h | 5 +++++ net/core/skbuff.c | 10 ++++++---- 2 files changed, 11 insertions(+), 4 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 0039fcc45b3b..c0b574a414e7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2182,6 +2182,11 @@ static inline struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev, return __netdev_alloc_skb_ip_align(dev, length, GFP_ATOMIC); } +static inline void skb_free_frag(void *addr) +{ + __free_page_frag(addr); +} + void *napi_alloc_frag(unsigned int fragsz); struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int length, gfp_t gfp_mask); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index dcc0e07abf47..d67e612bf0ef 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -436,7 +436,7 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, skb = __build_skb(data, len); if (unlikely(!skb)) { - put_page(virt_to_head_page(data)); + skb_free_frag(data); return NULL; } @@ -490,7 +490,7 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, skb = __build_skb(data, len); if (unlikely(!skb)) { - put_page(virt_to_head_page(data)); + skb_free_frag(data); return NULL; } @@ -549,10 +549,12 @@ static void skb_clone_fraglist(struct sk_buff *skb) static void skb_free_head(struct sk_buff *skb) { + unsigned char *head = skb->head; + if (skb->head_frag) - put_page(virt_to_head_page(skb->head)); + skb_free_frag(head); else - kfree(skb->head); + kfree(head); } static void skb_release_data(struct sk_buff *skb) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f53.google.com (mail-qg0-f53.google.com [209.85.192.53]) by kanga.kvack.org (Postfix) with ESMTP id 6DE596B0073 for ; Thu, 7 May 2015 00:12:12 -0400 (EDT) Received: by qgfi89 with SMTP id i89so15383162qgf.1 for ; Wed, 06 May 2015 21:12:12 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id hz3si4444qcb.10.2015.05.06.21.12.10 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:10 -0700 (PDT) Subject: [PATCH 06/10] netcp: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag From: Alexander Duyck Date: Wed, 06 May 2015 21:12:08 -0700 Message-ID: <20150507041208.1873.92668.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com Signed-off-by: Alexander Duyck --- drivers/net/ethernet/ti/netcp_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 43efc3a0cda5..0a28c07361cf 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -537,7 +537,7 @@ int netcp_unregister_rxhook(struct netcp_intf *netcp_priv, int order, static void netcp_frag_free(bool is_frag, void *ptr) { if (is_frag) - put_page(virt_to_head_page(ptr)); + skb_free_frag(ptr); else kfree(ptr); } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-f177.google.com (mail-qk0-f177.google.com [209.85.220.177]) by kanga.kvack.org (Postfix) with ESMTP id BAC676B0074 for ; Thu, 7 May 2015 00:12:16 -0400 (EDT) Received: by qkhg7 with SMTP id g7so20359864qkh.2 for ; Wed, 06 May 2015 21:12:16 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id fb7si889087qcb.28.2015.05.06.21.12.15 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:16 -0700 (PDT) Subject: [PATCH 07/10] mvneta: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag From: Alexander Duyck Date: Wed, 06 May 2015 21:12:14 -0700 Message-ID: <20150507041214.1873.71461.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com Signed-off-by: Alexander Duyck --- drivers/net/ethernet/marvell/mvneta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index ce5f7f9cff06..ecce8261ce3b 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1359,7 +1359,7 @@ static void *mvneta_frag_alloc(const struct mvneta_port *pp) static void mvneta_frag_free(const struct mvneta_port *pp, void *data) { if (likely(pp->frag_size <= PAGE_SIZE)) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f53.google.com (mail-qg0-f53.google.com [209.85.192.53]) by kanga.kvack.org (Postfix) with ESMTP id 60F3A6B0075 for ; Thu, 7 May 2015 00:12:22 -0400 (EDT) Received: by qgeb100 with SMTP id b100so15374310qge.3 for ; Wed, 06 May 2015 21:12:22 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id j93si893867qkh.41.2015.05.06.21.12.21 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:21 -0700 (PDT) Subject: [PATCH 08/10] e1000: Replace e1000_free_frag with skb_free_frag From: Alexander Duyck Date: Wed, 06 May 2015 21:12:20 -0700 Message-ID: <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com Signed-off-by: Alexander Duyck --- drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 983eb4e6f7aa..74dc15055971 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -2079,11 +2079,6 @@ static void *e1000_alloc_frag(const struct e1000_adapter *a) return data; } -static void e1000_free_frag(const void *data) -{ - put_page(virt_to_head_page(data)); -} - /** * e1000_clean_rx_ring - Free Rx Buffers per Queue * @adapter: board private structure @@ -2107,7 +2102,7 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter, adapter->rx_buffer_len, DMA_FROM_DEVICE); if (buffer_info->rxbuf.data) { - e1000_free_frag(buffer_info->rxbuf.data); + skb_free_frag(buffer_info->rxbuf.data); buffer_info->rxbuf.data = NULL; } } else if (adapter->clean_rx == e1000_clean_jumbo_rx_irq) { @@ -4594,28 +4589,28 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter, data = e1000_alloc_frag(adapter); /* Failed allocation, critical failure */ if (!data) { - e1000_free_frag(olddata); + skb_free_frag(olddata); adapter->alloc_rx_buff_failed++; break; } if (!e1000_check_64k_bound(adapter, data, bufsz)) { /* give up */ - e1000_free_frag(data); - e1000_free_frag(olddata); + skb_free_frag(data); + skb_free_frag(olddata); adapter->alloc_rx_buff_failed++; break; } /* Use new allocation */ - e1000_free_frag(olddata); + skb_free_frag(olddata); } buffer_info->dma = dma_map_single(&pdev->dev, data, adapter->rx_buffer_len, DMA_FROM_DEVICE); if (dma_mapping_error(&pdev->dev, buffer_info->dma)) { - e1000_free_frag(data); + skb_free_frag(data); buffer_info->dma = 0; adapter->alloc_rx_buff_failed++; break; @@ -4637,7 +4632,7 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter, adapter->rx_buffer_len, DMA_FROM_DEVICE); - e1000_free_frag(data); + skb_free_frag(data); buffer_info->rxbuf.data = NULL; buffer_info->dma = 0; -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qg0-f52.google.com (mail-qg0-f52.google.com [209.85.192.52]) by kanga.kvack.org (Postfix) with ESMTP id B65F06B007B for ; Thu, 7 May 2015 00:12:33 -0400 (EDT) Received: by qgeb100 with SMTP id b100so15375739qge.3 for ; Wed, 06 May 2015 21:12:33 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id a78si856574qka.120.2015.05.06.21.12.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:33 -0700 (PDT) Subject: [PATCH 10/10] bnx2x, tg3: Replace put_page(virt_to_head_page()) with skb_free_frag() From: Alexander Duyck Date: Wed, 06 May 2015 21:12:31 -0700 Message-ID: <20150507041231.1873.76848.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com Signed-off-by: Alexander Duyck --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 +- drivers/net/ethernet/broadcom/tg3.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index a8bb8f664d3d..b10d1744e5ae 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -662,7 +662,7 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp, static void bnx2x_frag_free(const struct bnx2x_fastpath *fp, void *data) { if (fp->rx_frag_size) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 069952fa5d64..73c934cf6c61 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -6618,7 +6618,7 @@ static void tg3_tx(struct tg3_napi *tnapi) static void tg3_frag_free(bool is_frag, void *data) { if (is_frag) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-qk0-f178.google.com (mail-qk0-f178.google.com [209.85.220.178]) by kanga.kvack.org (Postfix) with ESMTP id 52B646B0080 for ; Thu, 7 May 2015 00:12:34 -0400 (EDT) Received: by qkgx75 with SMTP id x75so20380026qkg.1 for ; Wed, 06 May 2015 21:12:34 -0700 (PDT) Received: from mx1.redhat.com (mx1.redhat.com. [209.132.183.28]) by mx.google.com with ESMTPS id q63si894651qgd.39.2015.05.06.21.12.32 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 06 May 2015 21:12:33 -0700 (PDT) Subject: [PATCH 09/10] hisilicon: Replace put_page(virt_to_head_page()) with skb_free_frag() From: Alexander Duyck Date: Wed, 06 May 2015 21:12:25 -0700 Message-ID: <20150507041225.1873.74253.stgit@ahduyck-vm-fedora22> In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: netdev@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com Signed-off-by: Alexander Duyck --- drivers/net/ethernet/hisilicon/hip04_eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index 3b39fdddeb57..d49bee38cd31 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -798,7 +798,7 @@ static void hip04_free_ring(struct net_device *ndev, struct device *d) for (i = 0; i < RX_DESC_NUM; i++) if (priv->rx_buf[i]) - put_page(virt_to_head_page(priv->rx_buf[i])); + skb_free_frag(priv->rx_buf[i]); for (i = 0; i < TX_DESC_NUM; i++) if (priv->tx_skb[i]) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ob0-f179.google.com (mail-ob0-f179.google.com [209.85.214.179]) by kanga.kvack.org (Postfix) with ESMTP id 638E26B0038 for ; Thu, 7 May 2015 04:46:30 -0400 (EDT) Received: by obcux3 with SMTP id ux3so26402326obc.2 for ; Thu, 07 May 2015 01:46:30 -0700 (PDT) Received: from mail-ob0-x22a.google.com (mail-ob0-x22a.google.com. [2607:f8b0:4003:c01::22a]) by mx.google.com with ESMTPS id xr4si832720obc.94.2015.05.07.01.46.29 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 07 May 2015 01:46:29 -0700 (PDT) Received: by obfe9 with SMTP id e9so26441667obf.1 for ; Thu, 07 May 2015 01:46:29 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> Date: Thu, 7 May 2015 01:46:29 -0700 Message-ID: Subject: Re: [PATCH 08/10] e1000: Replace e1000_free_frag with skb_free_frag From: Jeff Kirsher Content-Type: text/plain; charset=UTF-8 Sender: owner-linux-mm@kvack.org List-ID: To: Alexander Duyck Cc: netdev , linux-mm@kvack.org, Andrew Morton , David Miller , Eric Dumazet On Wed, May 6, 2015 at 9:12 PM, Alexander Duyck wrote: > Signed-off-by: Alexander Duyck > --- > drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +++++++------------ > 1 file changed, 7 insertions(+), 12 deletions(-) Was my ACK of the first patch not good enough? :-) Acked-by: Jeff Kirsher -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id 607146B0038 for ; Sun, 10 May 2015 19:18:05 -0400 (EDT) Received: by pacwv17 with SMTP id wv17so95057645pac.0 for ; Sun, 10 May 2015 16:18:05 -0700 (PDT) Received: from shards.monkeyblade.net (shards.monkeyblade.net. [2001:4f8:3:36:211:85ff:fe63:a549]) by mx.google.com with ESMTP id ke5si15566309pab.238.2015.05.10.16.18.02 for ; Sun, 10 May 2015 16:18:03 -0700 (PDT) Date: Sun, 10 May 2015 19:17:58 -0400 (EDT) Message-Id: <20150510.191758.2130017622255857830.davem@davemloft.net> Subject: Re: [PATCH 00/10] Refactor netdev page frags and move them into mm/ From: David Miller In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: alexander.h.duyck@redhat.com Cc: netdev@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, eric.dumazet@gmail.com From: Alexander Duyck Date: Wed, 06 May 2015 21:11:34 -0700 > This patch series addresses several things. > > First I found an issue in the performance of the pfmemalloc check from > build_skb. To work around it I have provided a cached copy of pfmemalloc > to be used in __netdev_alloc_skb and __napi_alloc_skb. > > Second I moved the page fragment allocation logic into the mm tree and > added functionality for freeing page fragments. I had to fix igb before I > could do this as it was using a reference to NETDEV_FRAG_PAGE_MAX_SIZE > incorrectly. > > Finally I went through and replaced all of the duplicate code that was > calling put_page and replaced it with calls to skb_free_frag. > > With these changes in place a simple receive and drop test increased from a > packet rate of 8.9Mpps to 9.8Mpps. The gains breakdown as follows: > > 8.9Mpps Before 9.8Mpps After > ------------------------ ------------------------ > 7.8% put_compound_page 9.1% __free_page_frag > 3.9% skb_free_head > 1.1% put_page > > 4.9% build_skb 3.8% __napi_alloc_skb > 2.5% __alloc_rx_skb > 1.9% __napi_alloc_skb I like this series, but again I need to see feedback from some mm folks before I can consider applying it. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f46.google.com (mail-pa0-f46.google.com [209.85.220.46]) by kanga.kvack.org (Postfix) with ESMTP id B461A6B006C for ; Sun, 10 May 2015 19:18:53 -0400 (EDT) Received: by pacyx8 with SMTP id yx8so94982178pac.1 for ; Sun, 10 May 2015 16:18:53 -0700 (PDT) Received: from shards.monkeyblade.net (shards.monkeyblade.net. [2001:4f8:3:36:211:85ff:fe63:a549]) by mx.google.com with ESMTP id au5si15518640pbc.232.2015.05.10.16.18.52 for ; Sun, 10 May 2015 16:18:52 -0700 (PDT) Date: Sun, 10 May 2015 19:18:51 -0400 (EDT) Message-Id: <20150510.191851.414324528131774160.davem@davemloft.net> Subject: Re: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page From: David Miller In-Reply-To: <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: alexander.h.duyck@redhat.com Cc: netdev@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, eric.dumazet@gmail.com From: Alexander Duyck Date: Wed, 06 May 2015 21:11:40 -0700 > + /* use OR instead of assignment to avoid clearing of bits in mask */ > + if (pfmemalloc) > + skb->pfmemalloc = 1; > + skb->head_frag = 1; ... > + /* use OR instead of assignment to avoid clearing of bits in mask */ > + if (nc->pfmemalloc) > + skb->pfmemalloc = 1; > + skb->head_frag = 1; Maybe make these two cases more consistent by either accessing nc->pfmemalloc or using a local variable in both cases. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f53.google.com (mail-pa0-f53.google.com [209.85.220.53]) by kanga.kvack.org (Postfix) with ESMTP id 605766B0038 for ; Sun, 10 May 2015 20:01:18 -0400 (EDT) Received: by pabsx10 with SMTP id sx10so95567547pab.3 for ; Sun, 10 May 2015 17:01:18 -0700 (PDT) Received: from mail-pa0-x235.google.com (mail-pa0-x235.google.com. [2607:f8b0:400e:c03::235]) by mx.google.com with ESMTPS id gl1si15686077pbd.2.2015.05.10.17.01.17 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sun, 10 May 2015 17:01:17 -0700 (PDT) Received: by pabsx10 with SMTP id sx10so95567328pab.3 for ; Sun, 10 May 2015 17:01:17 -0700 (PDT) Message-ID: <554FF14B.4050901@gmail.com> Date: Sun, 10 May 2015 17:01:15 -0700 From: Alexander Duyck MIME-Version: 1.0 Subject: Re: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> <20150510.191851.414324528131774160.davem@davemloft.net> In-Reply-To: <20150510.191851.414324528131774160.davem@davemloft.net> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: David Miller , alexander.h.duyck@redhat.com Cc: netdev@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, eric.dumazet@gmail.com On 05/10/2015 04:18 PM, David Miller wrote: > From: Alexander Duyck > Date: Wed, 06 May 2015 21:11:40 -0700 > >> + /* use OR instead of assignment to avoid clearing of bits in mask */ >> + if (pfmemalloc) >> + skb->pfmemalloc = 1; >> + skb->head_frag = 1; > ... >> + /* use OR instead of assignment to avoid clearing of bits in mask */ >> + if (nc->pfmemalloc) >> + skb->pfmemalloc = 1; >> + skb->head_frag = 1; > Maybe make these two cases more consistent by either accessing > nc->pfmemalloc or using a local variable in both cases. The only option would be to use a local variable in both cases, but then I am still stuck with the differences in when I can access the caches. The reason for the difference between the two is that in the case of netdev_alloc_skb/frag the netdev_alloc_cache can only be accessed with IRQs disabled, whereas in the napi_alloc_skb case we can access the napi_alloc_cache at any point in the function. Either way I am going to be stuck with differences because of the local_irq_save/restore that must be called when accessing the page frag cache that doesn't exist in the napi case. - Alex -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f181.google.com (mail-pd0-f181.google.com [209.85.192.181]) by kanga.kvack.org (Postfix) with ESMTP id 569136B0038 for ; Sun, 10 May 2015 20:52:19 -0400 (EDT) Received: by pdea3 with SMTP id a3so134627849pde.3 for ; Sun, 10 May 2015 17:52:19 -0700 (PDT) Received: from shards.monkeyblade.net (shards.monkeyblade.net. [2001:4f8:3:36:211:85ff:fe63:a549]) by mx.google.com with ESMTP id kx13si15832730pab.98.2015.05.10.17.52.18 for ; Sun, 10 May 2015 17:52:18 -0700 (PDT) Date: Sun, 10 May 2015 20:52:13 -0400 (EDT) Message-Id: <20150510.205213.202914226916051307.davem@davemloft.net> Subject: Re: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page From: David Miller In-Reply-To: <554FF14B.4050901@gmail.com> References: <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> <20150510.191851.414324528131774160.davem@davemloft.net> <554FF14B.4050901@gmail.com> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: alexander.duyck@gmail.com Cc: alexander.h.duyck@redhat.com, netdev@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, eric.dumazet@gmail.com From: Alexander Duyck Date: Sun, 10 May 2015 17:01:15 -0700 > The reason for the difference between the two is that in the case of > netdev_alloc_skb/frag the netdev_alloc_cache can only be accessed with > IRQs disabled, whereas in the napi_alloc_skb case we can access the > napi_alloc_cache at any point in the function. Either way I am going > to be stuck with differences because of the local_irq_save/restore > that must be called when accessing the page frag cache that doesn't > exist in the napi case. I see, thanks for explaining. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f44.google.com (mail-pa0-f44.google.com [209.85.220.44]) by kanga.kvack.org (Postfix) with ESMTP id DCD7C6B006E for ; Mon, 11 May 2015 16:36:54 -0400 (EDT) Received: by pacyx8 with SMTP id yx8so118561229pac.1 for ; Mon, 11 May 2015 13:36:54 -0700 (PDT) Received: from mail.linuxfoundation.org (mail.linuxfoundation.org. [140.211.169.12]) by mx.google.com with ESMTPS id i6si19342825pdr.64.2015.05.11.13.36.53 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 11 May 2015 13:36:53 -0700 (PDT) Date: Mon, 11 May 2015 13:36:52 -0700 From: Andrew Morton Subject: Re: [PATCH 00/10] Refactor netdev page frags and move them into mm/ Message-Id: <20150511133652.cd885d654213fe4161da0d87@linux-foundation.org> In-Reply-To: <20150510.191758.2130017622255857830.davem@davemloft.net> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150510.191758.2130017622255857830.davem@davemloft.net> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: David Miller Cc: alexander.h.duyck@redhat.com, netdev@vger.kernel.org, linux-mm@kvack.org, eric.dumazet@gmail.com On Sun, 10 May 2015 19:17:58 -0400 (EDT) David Miller wrote: > > 4.9% build_skb 3.8% __napi_alloc_skb > > 2.5% __alloc_rx_skb > > 1.9% __napi_alloc_skb > > I like this series, but again I need to see feedback from some > mm folks before I can consider applying it. The MM part looks OK to me - it's largely moving code out of net/ into mm/. It's a bit weird and it's unclear whether the code will gain other callers, but putting it in mm/ increase the likelihood that some other subsystem will use it. Please merge it via a net tree when ready. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pd0-f172.google.com (mail-pd0-f172.google.com [209.85.192.172]) by kanga.kvack.org (Postfix) with ESMTP id 6EA0D6B0038 for ; Tue, 12 May 2015 10:39:41 -0400 (EDT) Received: by pdbnk13 with SMTP id nk13so15030779pdb.0 for ; Tue, 12 May 2015 07:39:41 -0700 (PDT) Received: from shards.monkeyblade.net (shards.monkeyblade.net. [2001:4f8:3:36:211:85ff:fe63:a549]) by mx.google.com with ESMTP id b2si9243575pdf.235.2015.05.12.07.39.39 for ; Tue, 12 May 2015 07:39:40 -0700 (PDT) Date: Tue, 12 May 2015 10:39:37 -0400 (EDT) Message-Id: <20150512.103937.434422170476918731.davem@davemloft.net> Subject: Re: [PATCH 00/10] Refactor netdev page frags and move them into mm/ From: David Miller In-Reply-To: <20150511133652.cd885d654213fe4161da0d87@linux-foundation.org> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150510.191758.2130017622255857830.davem@davemloft.net> <20150511133652.cd885d654213fe4161da0d87@linux-foundation.org> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-linux-mm@kvack.org List-ID: To: akpm@linux-foundation.org Cc: alexander.h.duyck@redhat.com, netdev@vger.kernel.org, linux-mm@kvack.org, eric.dumazet@gmail.com From: Andrew Morton Date: Mon, 11 May 2015 13:36:52 -0700 > On Sun, 10 May 2015 19:17:58 -0400 (EDT) David Miller wrote: > >> > 4.9% build_skb 3.8% __napi_alloc_skb >> > 2.5% __alloc_rx_skb >> > 1.9% __napi_alloc_skb >> >> I like this series, but again I need to see feedback from some >> mm folks before I can consider applying it. > > The MM part looks OK to me - it's largely moving code out of net/ into > mm/. It's a bit weird and it's unclear whether the code will gain > other callers, but putting it in mm/ increase the likelihood that some other > subsystem will use it. > > Please merge it via a net tree when ready. Ok, will do that now, thanks Andrew! -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 00/10] Refactor netdev page frags and move them into mm/ Date: Wed, 06 May 2015 21:11:34 -0700 Message-ID: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:35922 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750725AbbEGELo (ORCPT ); Thu, 7 May 2015 00:11:44 -0400 Sender: netdev-owner@vger.kernel.org List-ID: This patch series addresses several things. First I found an issue in the performance of the pfmemalloc check from build_skb. To work around it I have provided a cached copy of pfmemalloc to be used in __netdev_alloc_skb and __napi_alloc_skb. Second I moved the page fragment allocation logic into the mm tree and added functionality for freeing page fragments. I had to fix igb before I could do this as it was using a reference to NETDEV_FRAG_PAGE_MAX_SIZE incorrectly. Finally I went through and replaced all of the duplicate code that was calling put_page and replaced it with calls to skb_free_frag. With these changes in place a simple receive and drop test increased from a packet rate of 8.9Mpps to 9.8Mpps. The gains breakdown as follows: 8.9Mpps Before 9.8Mpps After ------------------------ ------------------------ 7.8% put_compound_page 9.1% __free_page_frag 3.9% skb_free_head 1.1% put_page 4.9% build_skb 3.8% __napi_alloc_skb 2.5% __alloc_rx_skb 1.9% __napi_alloc_skb --- Alexander Duyck (10): net: Use cached copy of pfmemalloc to avoid accessing page igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation net: Store virtual address instead of page in netdev_alloc_cache mm/net: Rename and move page fragment handling from net/ to mm/ net: Add skb_free_frag to replace use of put_page in freeing skb->head netcp: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag mvneta: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag e1000: Replace e1000_free_frag with skb_free_frag hisilicon: Replace put_page(virt_to_head_page()) with skb_free_frag() bnx2x, tg3: Replace put_page(virt_to_head_page()) with skb_free_frag() drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 drivers/net/ethernet/broadcom/tg3.c | 2 drivers/net/ethernet/hisilicon/hip04_eth.c | 2 drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +- drivers/net/ethernet/intel/igb/igb_main.c | 11 - drivers/net/ethernet/marvell/mvneta.c | 2 drivers/net/ethernet/ti/netcp_core.c | 2 include/linux/gfp.h | 5 + include/linux/mm_types.h | 18 ++ include/linux/skbuff.h | 9 + mm/page_alloc.c | 98 ++++++++++ net/core/skbuff.c | 224 ++++++++--------------- 12 files changed, 223 insertions(+), 171 deletions(-) -- From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page Date: Wed, 06 May 2015 21:11:40 -0700 Message-ID: <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:34529 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750841AbbEGELr (ORCPT ); Thu, 7 May 2015 00:11:47 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: While testing I found that the testing for pfmemalloc in build_skb was rather expensive. I found the issue to be two-fold. First we have to get from the virtual address to the head page and that comes at the cost of something like 11 cycles. Then there is the cost for reading pfmemalloc out of the head page which can be cache cold due to the fact that put_page_testzero is likely invalidating the cache-line on one or more CPUs as the fragments can be shared. To avoid this extra expense I have added a pfmemalloc member to the netdev_alloc_cache. I then pushed pieces of __alloc_rx_skb into __napi_alloc_skb and __netdev_alloc_skb so that I could rewrite them to make use of the cached pfmemalloc value. The result is that my perf traces show a reduction from 9.28% overhead to 3.7% for the code covered by build_skb, __alloc_rx_skb, and __napi_alloc_skb when performing a test with the packet being dropped instead of being handed to napi_gro_receive. Signed-off-by: Alexander Duyck --- net/core/skbuff.c | 141 ++++++++++++++++++++++++++++++----------------------- 1 file changed, 79 insertions(+), 62 deletions(-) diff --git a/net/core/skbuff.c b/net/core/skbuff.c index b9eb90b39ac7..d6851ca32598 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -353,6 +353,7 @@ struct netdev_alloc_cache { * containing page->_count every time we allocate a fragment. */ unsigned int pagecnt_bias; + bool pfmemalloc; }; static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache); static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); @@ -379,10 +380,9 @@ static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, return page; } -static void *__alloc_page_frag(struct netdev_alloc_cache __percpu *cache, +static void *__alloc_page_frag(struct netdev_alloc_cache *nc, unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(cache); struct page *page = nc->frag.page; unsigned int size; int offset; @@ -402,6 +402,7 @@ refill: atomic_add(size - 1, &page->_count); /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page->pfmemalloc; nc->pagecnt_bias = size; nc->frag.offset = size; } @@ -430,11 +431,13 @@ refill: static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { + struct netdev_alloc_cache *nc; unsigned long flags; void *data; local_irq_save(flags); - data = __alloc_page_frag(&netdev_alloc_cache, fragsz, gfp_mask); + nc = this_cpu_ptr(&netdev_alloc_cache); + data = __alloc_page_frag(nc, fragsz, gfp_mask); local_irq_restore(flags); return data; } @@ -454,7 +457,9 @@ EXPORT_SYMBOL(netdev_alloc_frag); static void *__napi_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - return __alloc_page_frag(&napi_alloc_cache, fragsz, gfp_mask); + struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + + return __alloc_page_frag(nc, fragsz, gfp_mask); } void *napi_alloc_frag(unsigned int fragsz) @@ -464,76 +469,64 @@ void *napi_alloc_frag(unsigned int fragsz) EXPORT_SYMBOL(napi_alloc_frag); /** - * __alloc_rx_skb - allocate an skbuff for rx + * __netdev_alloc_skb - allocate an skbuff for rx on a specific device + * @dev: network device to receive on * @length: length to allocate * @gfp_mask: get_free_pages mask, passed to alloc_skb - * @flags: If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for - * allocations in case we have to fallback to __alloc_skb() - * If SKB_ALLOC_NAPI is set, page fragment will be allocated - * from napi_cache instead of netdev_cache. * * Allocate a new &sk_buff and assign it a usage count of one. The - * buffer has unspecified headroom built in. Users should allocate + * buffer has NET_SKB_PAD headroom built in. Users should allocate * the headroom they think they need without accounting for the * built in space. The built in space is used for optimisations. * * %NULL is returned if there is no free memory. */ -static struct sk_buff *__alloc_rx_skb(unsigned int length, gfp_t gfp_mask, - int flags) +struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, + gfp_t gfp_mask) { - struct sk_buff *skb = NULL; - unsigned int fragsz = SKB_DATA_ALIGN(length) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + struct netdev_alloc_cache *nc; + unsigned long flags; + struct sk_buff *skb; + bool pfmemalloc; + void *data; - if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) { - void *data; + len += NET_SKB_PAD; - if (sk_memalloc_socks()) - gfp_mask |= __GFP_MEMALLOC; + if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) || + (gfp_mask & (__GFP_WAIT | GFP_DMA))) + return __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); - data = (flags & SKB_ALLOC_NAPI) ? - __napi_alloc_frag(fragsz, gfp_mask) : - __netdev_alloc_frag(fragsz, gfp_mask); + len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + len = SKB_DATA_ALIGN(len); - if (likely(data)) { - skb = build_skb(data, fragsz); - if (unlikely(!skb)) - put_page(virt_to_head_page(data)); - } - } else { - skb = __alloc_skb(length, gfp_mask, - SKB_ALLOC_RX, NUMA_NO_NODE); - } - return skb; -} + if (sk_memalloc_socks()) + gfp_mask |= __GFP_MEMALLOC; -/** - * __netdev_alloc_skb - allocate an skbuff for rx on a specific device - * @dev: network device to receive on - * @length: length to allocate - * @gfp_mask: get_free_pages mask, passed to alloc_skb - * - * Allocate a new &sk_buff and assign it a usage count of one. The - * buffer has NET_SKB_PAD headroom built in. Users should allocate - * the headroom they think they need without accounting for the - * built in space. The built in space is used for optimisations. - * - * %NULL is returned if there is no free memory. - */ -struct sk_buff *__netdev_alloc_skb(struct net_device *dev, - unsigned int length, gfp_t gfp_mask) -{ - struct sk_buff *skb; + local_irq_save(flags); - length += NET_SKB_PAD; - skb = __alloc_rx_skb(length, gfp_mask, 0); + nc = this_cpu_ptr(&netdev_alloc_cache); + data = __alloc_page_frag(nc, len, gfp_mask); + pfmemalloc = nc->pfmemalloc; - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD); - skb->dev = dev; + local_irq_restore(flags); + + if (unlikely(!data)) + return NULL; + + skb = __build_skb(data, len); + if (unlikely(!skb)) { + put_page(virt_to_head_page(data)); + return NULL; } + /* use OR instead of assignment to avoid clearing of bits in mask */ + if (pfmemalloc) + skb->pfmemalloc = 1; + skb->head_frag = 1; + + skb_reserve(skb, NET_SKB_PAD); + skb->dev = dev; + return skb; } EXPORT_SYMBOL(__netdev_alloc_skb); @@ -551,19 +544,43 @@ EXPORT_SYMBOL(__netdev_alloc_skb); * * %NULL is returned if there is no free memory. */ -struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, - unsigned int length, gfp_t gfp_mask) +struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, + gfp_t gfp_mask) { + struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); struct sk_buff *skb; + void *data; + + len += NET_SKB_PAD + NET_IP_ALIGN; + + if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) || + (gfp_mask & (__GFP_WAIT | GFP_DMA))) + return __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); + + len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); + len = SKB_DATA_ALIGN(len); - length += NET_SKB_PAD + NET_IP_ALIGN; - skb = __alloc_rx_skb(length, gfp_mask, SKB_ALLOC_NAPI); + if (sk_memalloc_socks()) + gfp_mask |= __GFP_MEMALLOC; + + data = __alloc_page_frag(nc, len, gfp_mask); + if (unlikely(!data)) + return NULL; - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); - skb->dev = napi->dev; + skb = __build_skb(data, len); + if (unlikely(!skb)) { + put_page(virt_to_head_page(data)); + return NULL; } + /* use OR instead of assignment to avoid clearing of bits in mask */ + if (nc->pfmemalloc) + skb->pfmemalloc = 1; + skb->head_frag = 1; + + skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); + skb->dev = napi->dev; + return skb; } EXPORT_SYMBOL(__napi_alloc_skb); From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 02/10] igb: Don't use NETDEV_FRAG_PAGE_MAX_SIZE in descriptor calculation Date: Wed, 06 May 2015 21:11:45 -0700 Message-ID: <20150507041145.1873.66455.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:52373 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750806AbbEGELq (ORCPT ); Thu, 7 May 2015 00:11:46 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: This change updates igb so that it will correctly perform the descriptor count calculation. Previously it was taking NETDEV_FRAG_PAGE_MAX_SIZE into account with isn't really correct since a different value is used to determine the size of the pages used for TCP. That is actually determined by SKB_FRAG_PAGE_ORDER. Signed-off-by: Alexander Duyck --- drivers/net/ethernet/intel/igb/igb_main.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-) diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index d3b4098fce48..1394642e4859 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -4974,6 +4974,7 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb, struct igb_tx_buffer *first; int tso; u32 tx_flags = 0; + unsigned short f; u16 count = TXD_USE_COUNT(skb_headlen(skb)); __be16 protocol = vlan_get_protocol(skb); u8 hdr_len = 0; @@ -4984,14 +4985,8 @@ netdev_tx_t igb_xmit_frame_ring(struct sk_buff *skb, * + 1 desc for context descriptor, * otherwise try next time */ - if (NETDEV_FRAG_PAGE_MAX_SIZE > IGB_MAX_DATA_PER_TXD) { - unsigned short f; - - for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) - count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); - } else { - count += skb_shinfo(skb)->nr_frags; - } + for (f = 0; f < skb_shinfo(skb)->nr_frags; f++) + count += TXD_USE_COUNT(skb_shinfo(skb)->frags[f].size); if (igb_maybe_stop_tx(tx_ring, count + 3)) { /* this is a hard error */ From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 04/10] mm/net: Rename and move page fragment handling from net/ to mm/ Date: Wed, 06 May 2015 21:11:57 -0700 Message-ID: <20150507041157.1873.63151.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:52395 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751220AbbEGEL6 (ORCPT ); Thu, 7 May 2015 00:11:58 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: This change moves the __alloc_page_frag functionality out of the networking stack and into the page allocation portion of mm. The idea it so help make this maintainable by placing it with other page allocation functions. Since we are moving it from skbuff.c to page_alloc.c I have also renamed the basic defines and structure from netdev_alloc_cache to page_frag_cache to reflect that this is now part of a different kernel subsystem. I have also added a simple __free_page_frag function which can handle freeing the frags based on the skb->head pointer. The model for this is based off of __free_pages since we don't actually need to deal with all of the cases that put_page handles. I incorporated the virt_to_head_page call and compound_order into the function as it actually allows for a signficant size reduction by reducing code duplication. Signed-off-by: Alexander Duyck --- include/linux/gfp.h | 5 ++ include/linux/mm_types.h | 18 ++++++++ include/linux/skbuff.h | 3 - mm/page_alloc.c | 98 +++++++++++++++++++++++++++++++++++++++++++++ net/core/skbuff.c | 100 +++------------------------------------------- 5 files changed, 127 insertions(+), 97 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 97a9373e61e8..70a7fee1efb3 100644 --- a/include/linux/gfp.h +++ b/include/linux/gfp.h @@ -366,6 +366,11 @@ extern void free_pages(unsigned long addr, unsigned int order); extern void free_hot_cold_page(struct page *page, bool cold); extern void free_hot_cold_page_list(struct list_head *list, bool cold); +struct page_frag_cache; +extern void *__alloc_page_frag(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask); +extern void __free_page_frag(void *addr); + extern void __free_kmem_pages(struct page *page, unsigned int order); extern void free_kmem_pages(unsigned long addr, unsigned int order); diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 8d37e26a1007..0038ac7466fd 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -226,6 +226,24 @@ struct page_frag { #endif }; +#define PAGE_FRAG_CACHE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) +#define PAGE_FRAG_CACHE_MAX_ORDER get_order(PAGE_FRAG_CACHE_MAX_SIZE) + +struct page_frag_cache { + void * va; +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + __u16 offset; + __u16 size; +#else + __u32 offset; +#endif + /* we maintain a pagecount bias, so that we dont dirty cache line + * containing page->_count every time we allocate a fragment. + */ + unsigned int pagecnt_bias; + bool pfmemalloc; +}; + typedef unsigned long __nocast vm_flags_t; /* diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 8b9a2c35a9d7..0039fcc45b3b 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2128,9 +2128,6 @@ static inline void __skb_queue_purge(struct sk_buff_head *list) kfree_skb(skb); } -#define NETDEV_FRAG_PAGE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(NETDEV_FRAG_PAGE_MAX_SIZE) - void *netdev_alloc_frag(unsigned int fragsz); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int length, diff --git a/mm/page_alloc.c b/mm/page_alloc.c index ebffa0e4a9c0..2fd31aebef30 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2967,6 +2967,104 @@ void free_pages(unsigned long addr, unsigned int order) EXPORT_SYMBOL(free_pages); /* + * Page Fragment: + * An arbitrary-length arbitrary-offset area of memory which resides + * within a 0 or higher order page. Multiple fragments within that page + * are individually refcounted, in the page's reference counter. + * + * The page_frag functions below provide a simple allocation framework for + * page fragments. This is used by the network stack and network device + * drivers to provide a backing region of memory for use as either an + * sk_buff->head, or to be used in the "frags" portion of skb_shared_info. + */ +static struct page *__page_frag_refill(struct page_frag_cache *nc, + gfp_t gfp_mask) +{ + struct page *page = NULL; + gfp_t gfp = gfp_mask; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + PAGE_FRAG_CACHE_MAX_ORDER); + nc->size = page ? PAGE_FRAG_CACHE_MAX_SIZE : PAGE_SIZE; +#endif + if (unlikely(!page)) + page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); + + nc->va = page ? page_address(page) : NULL; + + return page; +} + +void *__alloc_page_frag(struct page_frag_cache *nc, + unsigned int fragsz, gfp_t gfp_mask) +{ + unsigned int size = PAGE_SIZE; + struct page *page; + int offset; + + if (unlikely(!nc->va)) { +refill: + page = __page_frag_refill(nc, gfp_mask); + if (!page) + return NULL; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* Even if we own the page, we do not use atomic_set(). + * This would break get_page_unless_zero() users. + */ + atomic_add(size - 1, &page->_count); + + /* reset page count bias and offset to start of new frag */ + nc->pfmemalloc = page->pfmemalloc; + nc->pagecnt_bias = size; + nc->offset = size; + } + + offset = nc->offset - fragsz; + if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + + if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) + goto refill; + +#if (PAGE_SIZE < PAGE_FRAG_CACHE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif + /* OK, page count is 0, we can safely set it */ + atomic_set(&page->_count, size); + + /* reset page count bias and offset to start of new frag */ + nc->pagecnt_bias = size; + offset = size - fragsz; + } + + nc->pagecnt_bias--; + nc->offset = offset; + + return nc->va + offset; +} +EXPORT_SYMBOL(__alloc_page_frag); + +/* + * Frees a page fragment allocated out of either a compound or order 0 page. + */ +void __free_page_frag(void *addr) +{ + struct page *page = virt_to_head_page(addr); + + if (unlikely(put_page_testzero(page))) + __free_pages_ok(page, compound_order(page)); +} +EXPORT_SYMBOL(__free_page_frag); + +/* * alloc_kmem_pages charges newly allocated pages to the kmem resource counter * of the current memory cgroup. * diff --git a/net/core/skbuff.c b/net/core/skbuff.c index a3062ec341c3..dcc0e07abf47 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -347,100 +347,12 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size) } EXPORT_SYMBOL(build_skb); -struct netdev_alloc_cache { - void * va; -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - __u16 offset; - __u16 size; -#else - __u32 offset; -#endif - /* we maintain a pagecount bias, so that we dont dirty cache line - * containing page->_count every time we allocate a fragment. - */ - unsigned int pagecnt_bias; - bool pfmemalloc; -}; -static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache); -static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); - -static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, - gfp_t gfp_mask) -{ - struct page *page = NULL; - gfp_t gfp = gfp_mask; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, - NETDEV_FRAG_PAGE_MAX_ORDER); - nc->size = page ? NETDEV_FRAG_PAGE_MAX_SIZE : PAGE_SIZE; -#endif - if (unlikely(!page)) - page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - - nc->va = page ? page_address(page) : NULL; - - return page; -} - -static void *__alloc_page_frag(struct netdev_alloc_cache *nc, - unsigned int fragsz, gfp_t gfp_mask) -{ - unsigned int size = PAGE_SIZE; - struct page *page; - int offset; - - if (unlikely(!nc->va)) { -refill: - page = __page_frag_refill(nc, gfp_mask); - if (!page) - return NULL; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* Even if we own the page, we do not use atomic_set(). - * This would break get_page_unless_zero() users. - */ - atomic_add(size - 1, &page->_count); - - /* reset page count bias and offset to start of new frag */ - nc->pfmemalloc = page->pfmemalloc; - nc->pagecnt_bias = size; - nc->offset = size; - } - - offset = nc->offset - fragsz; - if (unlikely(offset < 0)) { - page = virt_to_page(nc->va); - - if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) - goto refill; - -#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) - /* if size can vary use size else just use PAGE_SIZE */ - size = nc->size; -#endif - /* OK, page count is 0, we can safely set it */ - atomic_set(&page->_count, size); - - /* reset page count bias and offset to start of new frag */ - nc->pagecnt_bias = size; - offset = size - fragsz; - } - - nc->pagecnt_bias--; - nc->offset = offset; - - return nc->va + offset; -} +static DEFINE_PER_CPU(struct page_frag_cache, netdev_alloc_cache); +static DEFINE_PER_CPU(struct page_frag_cache, napi_alloc_cache); static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc; + struct page_frag_cache *nc; unsigned long flags; void *data; @@ -466,7 +378,7 @@ EXPORT_SYMBOL(netdev_alloc_frag); static void *__napi_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + struct page_frag_cache *nc = this_cpu_ptr(&napi_alloc_cache); return __alloc_page_frag(nc, fragsz, gfp_mask); } @@ -493,7 +405,7 @@ EXPORT_SYMBOL(napi_alloc_frag); struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc; + struct page_frag_cache *nc; unsigned long flags; struct sk_buff *skb; bool pfmemalloc; @@ -556,7 +468,7 @@ EXPORT_SYMBOL(__netdev_alloc_skb); struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, gfp_t gfp_mask) { - struct netdev_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + struct page_frag_cache *nc = this_cpu_ptr(&napi_alloc_cache); struct sk_buff *skb; void *data; From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 03/10] net: Store virtual address instead of page in netdev_alloc_cache Date: Wed, 06 May 2015 21:11:51 -0700 Message-ID: <20150507041151.1873.2487.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:40388 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751156AbbEGELw (ORCPT ); Thu, 7 May 2015 00:11:52 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: This change makes it so that we store the virtual address of the page in the netdev_alloc_cache instead of the page pointer. The idea behind this is to avoid multiple calls to page_address since the virtual address is required for every access, but the page pointer is only needed at allocation or reset of the page. While I was at it I also reordered the netdev_alloc_cache structure a bit so that the size is always 16 bytes by dropping size in the case where PAGE_SIZE is greater than or equal to 32KB. Signed-off-by: Alexander Duyck --- include/linux/skbuff.h | 5 ++-- net/core/skbuff.c | 55 ++++++++++++++++++++++++++++-------------------- 2 files changed, 34 insertions(+), 26 deletions(-) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 9c2f793573fa..8b9a2c35a9d7 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2128,9 +2128,8 @@ static inline void __skb_queue_purge(struct sk_buff_head *list) kfree_skb(skb); } -#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768) -#define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER) -#define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE +#define NETDEV_FRAG_PAGE_MAX_SIZE __ALIGN_MASK(32768, ~PAGE_MASK) +#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(NETDEV_FRAG_PAGE_MAX_SIZE) void *netdev_alloc_frag(unsigned int fragsz); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index d6851ca32598..a3062ec341c3 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -348,7 +348,13 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size) EXPORT_SYMBOL(build_skb); struct netdev_alloc_cache { - struct page_frag frag; + void * va; +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + __u16 offset; + __u16 size; +#else + __u32 offset; +#endif /* we maintain a pagecount bias, so that we dont dirty cache line * containing page->_count every time we allocate a fragment. */ @@ -361,21 +367,20 @@ static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache); static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, gfp_t gfp_mask) { - const unsigned int order = NETDEV_FRAG_PAGE_MAX_ORDER; struct page *page = NULL; gfp_t gfp = gfp_mask; - if (order) { - gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | - __GFP_NOMEMALLOC; - page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order); - nc->frag.size = PAGE_SIZE << (page ? order : 0); - } - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY | + __GFP_NOMEMALLOC; + page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, + NETDEV_FRAG_PAGE_MAX_ORDER); + nc->size = page ? NETDEV_FRAG_PAGE_MAX_SIZE : PAGE_SIZE; +#endif if (unlikely(!page)) page = alloc_pages_node(NUMA_NO_NODE, gfp, 0); - nc->frag.page = page; + nc->va = page ? page_address(page) : NULL; return page; } @@ -383,19 +388,20 @@ static struct page *__page_frag_refill(struct netdev_alloc_cache *nc, static void *__alloc_page_frag(struct netdev_alloc_cache *nc, unsigned int fragsz, gfp_t gfp_mask) { - struct page *page = nc->frag.page; - unsigned int size; + unsigned int size = PAGE_SIZE; + struct page *page; int offset; - if (unlikely(!page)) { + if (unlikely(!nc->va)) { refill: page = __page_frag_refill(nc, gfp_mask); if (!page) return NULL; - /* if size can vary use frag.size else just use PAGE_SIZE */ - size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE; - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif /* Even if we own the page, we do not use atomic_set(). * This would break get_page_unless_zero() users. */ @@ -404,17 +410,20 @@ refill: /* reset page count bias and offset to start of new frag */ nc->pfmemalloc = page->pfmemalloc; nc->pagecnt_bias = size; - nc->frag.offset = size; + nc->offset = size; } - offset = nc->frag.offset - fragsz; + offset = nc->offset - fragsz; if (unlikely(offset < 0)) { + page = virt_to_page(nc->va); + if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count)) goto refill; - /* if size can vary use frag.size else just use PAGE_SIZE */ - size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE; - +#if (PAGE_SIZE < NETDEV_FRAG_PAGE_MAX_SIZE) + /* if size can vary use size else just use PAGE_SIZE */ + size = nc->size; +#endif /* OK, page count is 0, we can safely set it */ atomic_set(&page->_count, size); @@ -424,9 +433,9 @@ refill: } nc->pagecnt_bias--; - nc->frag.offset = offset; + nc->offset = offset; - return page_address(page) + offset; + return nc->va + offset; } static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask) From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 06/10] netcp: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag Date: Wed, 06 May 2015 21:12:08 -0700 Message-ID: <20150507041208.1873.92668.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:52419 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751292AbbEGEMJ (ORCPT ); Thu, 7 May 2015 00:12:09 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: Signed-off-by: Alexander Duyck --- drivers/net/ethernet/ti/netcp_core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c index 43efc3a0cda5..0a28c07361cf 100644 --- a/drivers/net/ethernet/ti/netcp_core.c +++ b/drivers/net/ethernet/ti/netcp_core.c @@ -537,7 +537,7 @@ int netcp_unregister_rxhook(struct netcp_intf *netcp_priv, int order, static void netcp_frag_free(bool is_frag, void *ptr) { if (is_frag) - put_page(virt_to_head_page(ptr)); + skb_free_frag(ptr); else kfree(ptr); } From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 07/10] mvneta: Replace put_page(virt_to_head_page(ptr)) w/ skb_free_frag Date: Wed, 06 May 2015 21:12:14 -0700 Message-ID: <20150507041214.1873.71461.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:40411 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751353AbbEGEMP (ORCPT ); Thu, 7 May 2015 00:12:15 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: Signed-off-by: Alexander Duyck --- drivers/net/ethernet/marvell/mvneta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index ce5f7f9cff06..ecce8261ce3b 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -1359,7 +1359,7 @@ static void *mvneta_frag_alloc(const struct mvneta_port *pp) static void mvneta_frag_free(const struct mvneta_port *pp, void *data) { if (likely(pp->frag_size <= PAGE_SIZE)) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 08/10] e1000: Replace e1000_free_frag with skb_free_frag Date: Wed, 06 May 2015 21:12:20 -0700 Message-ID: <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:54054 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751353AbbEGEMV (ORCPT ); Thu, 7 May 2015 00:12:21 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: Signed-off-by: Alexander Duyck --- drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +++++++------------ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c index 983eb4e6f7aa..74dc15055971 100644 --- a/drivers/net/ethernet/intel/e1000/e1000_main.c +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c @@ -2079,11 +2079,6 @@ static void *e1000_alloc_frag(const struct e1000_adapter *a) return data; } -static void e1000_free_frag(const void *data) -{ - put_page(virt_to_head_page(data)); -} - /** * e1000_clean_rx_ring - Free Rx Buffers per Queue * @adapter: board private structure @@ -2107,7 +2102,7 @@ static void e1000_clean_rx_ring(struct e1000_adapter *adapter, adapter->rx_buffer_len, DMA_FROM_DEVICE); if (buffer_info->rxbuf.data) { - e1000_free_frag(buffer_info->rxbuf.data); + skb_free_frag(buffer_info->rxbuf.data); buffer_info->rxbuf.data = NULL; } } else if (adapter->clean_rx == e1000_clean_jumbo_rx_irq) { @@ -4594,28 +4589,28 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter, data = e1000_alloc_frag(adapter); /* Failed allocation, critical failure */ if (!data) { - e1000_free_frag(olddata); + skb_free_frag(olddata); adapter->alloc_rx_buff_failed++; break; } if (!e1000_check_64k_bound(adapter, data, bufsz)) { /* give up */ - e1000_free_frag(data); - e1000_free_frag(olddata); + skb_free_frag(data); + skb_free_frag(olddata); adapter->alloc_rx_buff_failed++; break; } /* Use new allocation */ - e1000_free_frag(olddata); + skb_free_frag(olddata); } buffer_info->dma = dma_map_single(&pdev->dev, data, adapter->rx_buffer_len, DMA_FROM_DEVICE); if (dma_mapping_error(&pdev->dev, buffer_info->dma)) { - e1000_free_frag(data); + skb_free_frag(data); buffer_info->dma = 0; adapter->alloc_rx_buff_failed++; break; @@ -4637,7 +4632,7 @@ static void e1000_alloc_rx_buffers(struct e1000_adapter *adapter, adapter->rx_buffer_len, DMA_FROM_DEVICE); - e1000_free_frag(data); + skb_free_frag(data); buffer_info->rxbuf.data = NULL; buffer_info->dma = 0; From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 10/10] bnx2x, tg3: Replace put_page(virt_to_head_page()) with skb_free_frag() Date: Wed, 06 May 2015 21:12:31 -0700 Message-ID: <20150507041231.1873.76848.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:52468 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751356AbbEGEMc (ORCPT ); Thu, 7 May 2015 00:12:32 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: Signed-off-by: Alexander Duyck --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 2 +- drivers/net/ethernet/broadcom/tg3.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index a8bb8f664d3d..b10d1744e5ae 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -662,7 +662,7 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp, static void bnx2x_frag_free(const struct bnx2x_fastpath *fp, void *data) { if (fp->rx_frag_size) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index 069952fa5d64..73c934cf6c61 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -6618,7 +6618,7 @@ static void tg3_tx(struct tg3_napi *tnapi) static void tg3_frag_free(bool is_frag, void *data) { if (is_frag) - put_page(virt_to_head_page(data)); + skb_free_frag(data); else kfree(data); } From mboxrd@z Thu Jan 1 00:00:00 1970 From: Alexander Duyck Subject: [PATCH 09/10] hisilicon: Replace put_page(virt_to_head_page()) with skb_free_frag() Date: Wed, 06 May 2015 21:12:25 -0700 Message-ID: <20150507041225.1873.74253.stgit@ahduyck-vm-fedora22> References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Cc: akpm@linux-foundation.org, davem@davemloft.net, eric.dumazet@gmail.com To: netdev@vger.kernel.org, linux-mm@kvack.org Return-path: Received: from mx1.redhat.com ([209.132.183.28]:39629 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751356AbbEGEMd (ORCPT ); Thu, 7 May 2015 00:12:33 -0400 In-Reply-To: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: Signed-off-by: Alexander Duyck --- drivers/net/ethernet/hisilicon/hip04_eth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index 3b39fdddeb57..d49bee38cd31 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -798,7 +798,7 @@ static void hip04_free_ring(struct net_device *ndev, struct device *d) for (i = 0; i < RX_DESC_NUM; i++) if (priv->rx_buf[i]) - put_page(virt_to_head_page(priv->rx_buf[i])); + skb_free_frag(priv->rx_buf[i]); for (i = 0; i < TX_DESC_NUM; i++) if (priv->tx_skb[i]) From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Kirsher Subject: Re: [PATCH 08/10] e1000: Replace e1000_free_frag with skb_free_frag Date: Thu, 7 May 2015 01:46:29 -0700 Message-ID: References: <20150507035558.1873.52664.stgit@ahduyck-vm-fedora22> <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Cc: netdev , linux-mm@kvack.org, Andrew Morton , David Miller , Eric Dumazet To: Alexander Duyck Return-path: Received: from mail-ob0-f178.google.com ([209.85.214.178]:35021 "EHLO mail-ob0-f178.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750991AbbEGIq3 (ORCPT ); Thu, 7 May 2015 04:46:29 -0400 Received: by obcux3 with SMTP id ux3so26402134obc.2 for ; Thu, 07 May 2015 01:46:29 -0700 (PDT) In-Reply-To: <20150507041220.1873.80925.stgit@ahduyck-vm-fedora22> Sender: netdev-owner@vger.kernel.org List-ID: On Wed, May 6, 2015 at 9:12 PM, Alexander Duyck wrote: > Signed-off-by: Alexander Duyck > --- > drivers/net/ethernet/intel/e1000/e1000_main.c | 19 +++++++------------ > 1 file changed, 7 insertions(+), 12 deletions(-) Was my ACK of the first patch not good enough? :-) Acked-by: Jeff Kirsher From mboxrd@z Thu Jan 1 00:00:00 1970 From: David Miller Subject: Re: [PATCH 01/10] net: Use cached copy of pfmemalloc to avoid accessing page Date: Sun, 10 May 2015 20:52:13 -0400 (EDT) Message-ID: <20150510.205213.202914226916051307.davem@davemloft.net> References: <20150507041140.1873.58533.stgit@ahduyck-vm-fedora22> <20150510.191851.414324528131774160.davem@davemloft.net> <554FF14B.4050901@gmail.com> Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: alexander.h.duyck@redhat.com, netdev@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, eric.dumazet@gmail.com To: alexander.duyck@gmail.com Return-path: Received: from shards.monkeyblade.net ([149.20.54.216]:57185 "EHLO shards.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751705AbbEKAwS (ORCPT ); Sun, 10 May 2015 20:52:18 -0400 In-Reply-To: <554FF14B.4050901@gmail.com> Sender: netdev-owner@vger.kernel.org List-ID: From: Alexander Duyck Date: Sun, 10 May 2015 17:01:15 -0700 > The reason for the difference between the two is that in the case of > netdev_alloc_skb/frag the netdev_alloc_cache can only be accessed with > IRQs disabled, whereas in the napi_alloc_skb case we can access the > napi_alloc_cache at any point in the function. Either way I am going > to be stuck with differences because of the local_irq_save/restore > that must be called when accessing the page frag cache that doesn't > exist in the napi case. I see, thanks for explaining.