Netdev List

* Re: [PATCH iproute2 0/2] lib names: Refactoring and cleanups
From: Stephen Hemminger @ 2014-12-10  4:38 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1417831512-19452-1-git-send-email-vadim4j@gmail.com>

On Sat,  6 Dec 2014 04:05:10 +0200
Vadim Kochan <vadim4j@gmail.com> wrote:

> Some cleanups and refactoring in lib/rt_names.c:
> 
>     #1 Replaced the hard-coded /etc/iproute2 path with the CONFDIR define
>         when initializing the tables of group names.
> 
>     #2 Added a helper so that a single function parses ids and names
>         from the db files.
> 
> Vadim Kochan (2):
>   lib names: Use CONFDIR for specify 'group' file path
>   lib names: Add helper func for parse id and name from file
> 
>  lib/rt_names.c | 70 +++++++++++++++++++++++++++++++++-------------------------
>  1 file changed, 40 insertions(+), 30 deletions(-)
> 

Both applied


* Re: [PATCH] iproute2: Add support for CAN presume-ack feature
From: Stephen Hemminger @ 2014-12-10  4:35 UTC (permalink / raw)
  To: Nikita Edward Baruzdin; +Cc: linux-can, netdev
In-Reply-To: <1417768902-5404-2-git-send-email-nebaruzdin@gmail.com>

On Fri,  5 Dec 2014 11:41:42 +0300
Nikita Edward Baruzdin <nebaruzdin@gmail.com> wrote:

> This patch makes the CAN_CTRLMODE_PRESUME_ACK netlink feature configurable.
> When enabled, the feature puts the CAN controller into a mode in which a
> missing acknowledgement is ignored.
> 
> Signed-off-by: Nikita Edward Baruzdin <nebaruzdin@gmail.com>
> ---

Applied.


* Re: [PATCH iproute2 REGRESSIONS v3] ss: Fix layout/output issues introduced by regression
From: Stephen Hemminger @ 2014-12-10  4:32 UTC (permalink / raw)
  To: Vadim Kochan; +Cc: netdev
In-Reply-To: <1417799951-25143-1-git-send-email-vadim4j@gmail.com>

On Fri,  5 Dec 2014 19:19:11 +0200
Vadim Kochan <vadim4j@gmail.com> wrote:

> This patch fixes the following issues, which I introduced in these commits:
> 
>     #1 (2dc854854b7f1b) ss: Fixed broken output for Netlink 'Peer Address:Port' column
>     ISSUE: Broken layout when all sockets are printed out
> 
>     #2 (eef43b5052afb7) ss: Identify more netlink protocol names
>     ISSUE: Protocol id is not printed if 'numbers only' output was specified (-n)
> 
> Also widened the local/peer port columns.
> 
> I tested with a lot of option combinations (I may have missed some test cases),
> but the layout seems better to me than in the previously released version of iproute2/ss.
> 
> Signed-off-by: Vadim Kochan <vadim4j@gmail.com>

Applied.
Thanks for following through and dealing with this.


* [PATCH] Revert "bridge: include in6.h in if_bridge.h for struct in6_addr"
From: Stephen Hemminger @ 2014-12-10  4:30 UTC (permalink / raw)
  To: David Miller, Gregory Fong, Florian Fainelli, Cong Wang; +Cc: netdev

This reverts commit 66f1c44887ba4f47d617f8ae21cf8e04e1892bd7.

iproute2 uses sanitized kernel headers, and this change broke the build.
The problem is that iproute2's ip/iplink_bridge_slave.c includes if_bridge.h,
but this causes a redefinition error when Glibc's <netinet/in.h> is included.

Since iproute2 is arguably the most important user of network headers,
the uapi header change should be reverted.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

---
Please update stable as well.

 include/uapi/linux/if_bridge.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index b03ee8f..439fd0d 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -15,7 +15,6 @@

 #include <linux/types.h>
 #include <linux/if_ether.h>
-#include <linux/in6.h>

 #define SYSFS_BRIDGE_ATTR	"bridge"
 #define SYSFS_BRIDGE_FDB	"brforward"
-- 
2.1.3


* Re: [PATCH net-next] ipv6: fix sparse warning
From: David Miller @ 2014-12-10  4:18 UTC (permalink / raw)
  To: ying.xue; +Cc: netdev
In-Reply-To: <1418182101-7992-1-git-send-email-ying.xue@windriver.com>

From: Ying Xue <ying.xue@windriver.com>
Date: Wed, 10 Dec 2014 11:28:21 +0800

> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index 866d9c9..32bd4ad 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -524,11 +524,11 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,

Again, changes to this file should not go through my tree.


* Re: [PATCH net-next] tipc: fix RCU sparse error
From: David Miller @ 2014-12-10  3:20 UTC (permalink / raw)
  To: ying.xue; +Cc: jon.maloy, kbuild-all, erik.hugne, netdev, tipc-discussion
In-Reply-To: <1418180407-1847-1-git-send-email-ying.xue@windriver.com>

From: Ying Xue <ying.xue@windriver.com>
Date: Wed, 10 Dec 2014 11:00:07 +0800

> diff --git a/include/linux/rculist.h b/include/linux/rculist.h
> index aa79b3c..866d9c9 100644
> --- a/include/linux/rculist.h
> +++ b/include/linux/rculist.h
> @@ -549,8 +549,8 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
>   */
>  #define hlist_for_each_entry_from_rcu(pos, member)			\
>  	for (; pos;							\
> -	     pos = hlist_entry_safe(rcu_dereference((pos)->member.next),\
> -			typeof(*(pos)), member))
> +	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu( \
> +			&(pos)->member)), typeof(*(pos)), member))
>  
>  #endif	/* __KERNEL__ */
>  #endif

This needs to go via another route, perhaps the RCU maintainer.


* Re: [net-next PATCH 1/6] net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag
From: Alexei Starovoitov @ 2014-12-10  4:16 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Network Development, David S. Miller, Eric Dumazet,
	Jesper Dangaard Brouer
In-Reply-To: <20141210034042.2114.29360.stgit@ahduyck-vm-fedora20>

On Tue, Dec 9, 2014 at 7:40 PM, Alexander Duyck
<alexander.h.duyck@redhat.com> wrote:
> This patch splits the netdev_alloc_frag function up so that it can be used
> on one of two page frag pools instead of being fixed on the
> netdev_alloc_cache.  By doing this we can add a NAPI specific function
> __napi_alloc_frag that accesses a pool that is only used from softirq
> context.  The advantage to this is that we do not need to call
> local_irq_save/restore which can be a significant savings.
>
> I also took the opportunity to refactor the core bits that were placed in
> __alloc_page_frag.  First I updated the allocation to do either a 32K
> allocation or an order 0 page.  This is based on the changes in commit
> d9b2938aa where it was found that latencies could be reduced in case of

thanks for explaining that piece of it.

> +       struct page *page = NULL;
> +       gfp_t gfp = gfp_mask;
> +
> +       if (order) {
> +               gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
> +               page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);
> +               nc->frag.size = PAGE_SIZE << (page ? order : 0);
> +       }
>
> -       local_irq_save(flags);
> -       nc = this_cpu_ptr(&netdev_alloc_cache);
> -       if (unlikely(!nc->frag.page)) {
> +       if (unlikely(!page))
> +               page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);

I'm guessing you're not combining this 'if' with above one to
keep gfp untouched, so there is a 'warn' when it actually fails 2nd time.
Tricky :)
Anyway looks good to me and I think I understand it enough to say:
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
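
For readers following the thread, here is a minimal userspace sketch of the
two-step pattern discussed above: the first, larger attempt runs with extra
"quiet" flags, and the fallback reuses the caller's untouched flags so that a
second failure is still reported normally. Every sketch_* name, the flag
values, and the fail_big switch are hypothetical stand-ins, not kernel API;
malloc() merely takes the place of alloc_pages_node().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits standing in for gfp flags; not the kernel's values. */
#define SKETCH_NOWARN    0x1u
#define SKETCH_NORETRY   0x2u
#define SKETCH_PAGE_SIZE 4096u

struct sketch_frag {
	void *buf;
	size_t size;
	unsigned int flags_used;	/* flags of the attempt that produced buf */
};

/* Hypothetical backend: fail_big simulates pressure on large allocations. */
static void *sketch_alloc(size_t size, unsigned int flags, int fail_big)
{
	(void)flags;
	if (fail_big && size > SKETCH_PAGE_SIZE)
		return NULL;
	return malloc(size);
}

static int sketch_refill(struct sketch_frag *f, unsigned int flags,
			 unsigned int order, int fail_big)
{
	unsigned int quiet = flags;	/* modified copy, first attempt only */
	void *buf = NULL;

	assert(f != NULL);

	if (order) {
		/* Opportunistic attempt: do not warn, do not retry. */
		quiet |= SKETCH_NOWARN | SKETCH_NORETRY;
		buf = sketch_alloc((size_t)SKETCH_PAGE_SIZE << order,
				   quiet, fail_big);
		f->size = (size_t)SKETCH_PAGE_SIZE << (buf ? order : 0);
		f->flags_used = quiet;
	}

	if (!buf) {
		/* Fallback uses the caller's original flags, so a failure
		 * here would not be silenced. */
		buf = sketch_alloc(SKETCH_PAGE_SIZE, flags, fail_big);
		f->size = SKETCH_PAGE_SIZE;
		f->flags_used = flags;
	}

	f->buf = buf;
	return buf ? 0 : -1;
}
```

Keeping the two checks separate, rather than folding the fallback into the
first branch, is what lets the second attempt run with the unmodified flags.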


* Re: [PATCH v2 iproute2] ipaddress: enable -details option
From: Stephen Hemminger @ 2014-12-10  4:14 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: shemminger, netdev, mkubecek, jbenc
In-Reply-To: <1417711267-13334-1-git-send-email-nicolas.dichtel@6wind.com>

On Thu,  4 Dec 2014 17:41:07 +0100
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:

> This option was used only for 'ip link', but it can be useful to have it for
> 'ip address'. Thus it is possible to display link details and addresses with one
> command.
> 
> Example:
> $ ip -d a ls dev gre1
> 9: gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1468 qdisc noqueue state UNKNOWN group default
>     link/gre 10.16.0.249 peer 10.16.0.121 promiscuity 0
>     gre remote 10.16.0.121 local 10.16.0.249 ttl inherit ikey 0.0.0.10 okey 0.0.0.10 icsum ocsum
>     inet 192.168.0.249 peer 192.168.0.121/32 scope global gre1
>        valid_lft forever preferred_lft forever
>     inet6 fe80::5efe:a10:f9/64 scope link
>        valid_lft forever preferred_lft forever
> 
> Suggested-by: Christophe Gouault <christophe.gouault@6wind.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

Looks good, accepted.


* [net-next PATCH 6/6] ethernet/broadcom: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align
From: Alexander Duyck @ 2014-12-10  3:41 UTC (permalink / raw)
  To: netdev
  Cc: Florian Fainelli, eric.dumazet, Ariel Elior, brouer,
	Gary Zambrano, davem, ast
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

This patch replaces the calls to netdev_alloc_skb_ip_align in the
copybreak paths.

Cc: Gary Zambrano <zambrano@broadcom.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ariel Elior <ariel.elior@qlogic.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/broadcom/b44.c             |    2 +-
 drivers/net/ethernet/broadcom/bcm63xx_enet.c    |    2 +-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)
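
As background for the copybreak paths touched here, a minimal userspace sketch
of the idea: packets shorter than a threshold are copied into a small fresh
buffer so the large receive buffer can stay in the ring. The sketch_* names
and the threshold value are hypothetical, and malloc() stands in for the skb
allocation. On allocation failure this sketch hands up the original buffer,
as the cxgb get_packet() path in this series does; other drivers in the
series drop the packet instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_COPYBREAK 256u	/* hypothetical threshold in bytes */

struct sketch_rx {
	unsigned char *data;
	size_t len;
	int copied;	/* 1: data is a fresh small copy, 0: data is the ring buffer */
};

static void sketch_receive(unsigned char *ring_buf, size_t len,
			   struct sketch_rx *out)
{
	assert(ring_buf != NULL && out != NULL);

	out->len = len;

	if (len < SKETCH_COPYBREAK) {
		/* Small packet: copy it out and leave ring_buf for reuse. */
		unsigned char *copy = malloc(len ? len : 1);

		if (copy) {
			memcpy(copy, ring_buf, len);
			out->data = copy;
			out->copied = 1;
			return;
		}
		/* Allocation failed: fall through and use the original. */
	}

	/* Large packet (or failed copy): pass the ring buffer itself up. */
	out->data = ring_buf;
	out->copied = 0;
}
```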

diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index ffeaf47..d86d6ba 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -836,7 +836,7 @@ static int b44_rx(struct b44 *bp, int budget)
 			struct sk_buff *copy_skb;
 
 			b44_recycle_rx(bp, cons, bp->rx_prod);
-			copy_skb = netdev_alloc_skb_ip_align(bp->dev, len);
+			copy_skb = napi_alloc_skb(&bp->napi, len);
 			if (copy_skb == NULL)
 				goto drop_it_no_recycle;
 
diff --git a/drivers/net/ethernet/broadcom/bcm63xx_enet.c b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
index 3e8d1a8..21206d3 100644
--- a/drivers/net/ethernet/broadcom/bcm63xx_enet.c
+++ b/drivers/net/ethernet/broadcom/bcm63xx_enet.c
@@ -385,7 +385,7 @@ static int bcm_enet_receive_queue(struct net_device *dev, int budget)
 		if (len < copybreak) {
 			struct sk_buff *nskb;
 
-			nskb = netdev_alloc_skb_ip_align(dev, len);
+			nskb = napi_alloc_skb(&priv->napi, len);
 			if (!nskb) {
 				/* forget packet, just rearm desc */
 				dev->stats.rx_dropped++;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index b4d71fd..1d1147c 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -1015,7 +1015,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 		 */
 		if ((bp->dev->mtu > ETH_MAX_PACKET_SIZE) &&
 		    (len <= RX_COPY_THRESH)) {
-			skb = netdev_alloc_skb_ip_align(bp->dev, len);
+			skb = napi_alloc_skb(&fp->napi, len);
 			if (skb == NULL) {
 				DP(NETIF_MSG_RX_ERR | NETIF_MSG_RX_STATUS,
 				   "ERROR  packet dropped because of alloc failure\n");


* Re: [PATCH v8 3/3] net: hisilicon: new hip04 ethernet driver
From: Ding Tianhong @ 2014-12-10  3:51 UTC (permalink / raw)
  To: Arnd Bergmann, Alexander Graf
  Cc: Zhangfei Gao, davem, linux, f.fainelli, sergei.shtylyov,
	mark.rutland, David.Laight, eric.dumazet, xuwei5,
	linux-arm-kernel, netdev, devicetree
In-Reply-To: <2619239.NTtdNaZCJM@wuerfel>

On 2014/12/8 4:09, Arnd Bergmann wrote:
> On Sunday 07 December 2014 10:49:12 Alexander Graf wrote:
>> On 07.12.14 04:28, Ding Tianhong wrote:
>>> On 2014/12/7 8:42, Alexander Graf wrote:
>>>> On 19.04.14 03:13, Zhangfei Gao wrote:
>>>>> Support Hisilicon hip04 ethernet driver, including 100M / 1000M controller.
>>>>> The controller has no tx done interrupt, reclaim xmitted buffer in the poll.
>>>>>
>>>>> Signed-off-by: Zhangfei Gao <zhangfei.gao@linaro.org>
>>>>
>>>> Is this driver still supposed to go upstream? I presume this was the
>>>> last submission and it's been quite some time ago 
>>>>
>>>
>>> Yes, it has been a long time, but the hip04 does not support a tx irq, and
>>> we couldn't come up with a better way to work around this defect. Do you have any suggestions?
>>
>> Well, if hardware doesn't have a TX irq I don't see there's anything we
>> can do to fix that ;).
> 
> I don't know if it's related to the ethernet on hip01, but I would assume
> it is, and that platform is currently being submitted for inclusion, so
> I'd definitely hope to see this driver get merged too eventually.
> 
> IIRC, the last revision of the patch set had basically fixed the problem,
> except for a race that would still allow the napi poll function to exit
> with poll_complete() but a full queue of TX descriptors and no fallback
> to clean them up. There was also still an open question about whether or
> not the driver should use skb_orphan, but I may be misremembering that part.
>  

Hi Arnd:

What about using a state machine to check the tx queue and free the skbs, like this:

diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c
index 8593658..71faca8 100644
--- a/drivers/net/ethernet/hisilicon/hip04_eth.c
+++ b/drivers/net/ethernet/hisilicon/hip04_eth.c
@@ -396,9 +396,25 @@ static int hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev)
        stats->tx_packets++;
        priv->tx_count++;

+
+       queue_delayed_work(priv->wq, &priv->tx_queue, delay);
+
        return NETDEV_TX_OK;
 }

+static void hip04_tx_queue_monitor(struct work_struct *work)
+{
+       struct hip04_priv *priv = container_of(work, struct hip04_priv,
+                                              tx_queue.work);
+       struct net_device *ndev = priv->ndev;
+       hip04_tx_reclaim(ndev, false);
+
+       if (TX_QUEUE_IS_EMPTY(ndev))
+               return;
+
+       queue_delayed_work(priv->wq, &priv->tx_queue, delay);
+}
+
 static int hip04_rx_poll(struct napi_struct *napi, int budget)
 {
        struct hip04_priv *priv = container_of(napi, struct hip04_priv, napi);
@@ -736,6 +752,8 @@ static int hip04_mac_probe(struct platform_device *pdev)
                goto alloc_fail;
        }

+       INIT_DELAYED_WORK(&priv->tx_queue, hip04_tx_queue_monitor);
+
        return 0;



What do you think of this solution?

Regards
Ding
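
The re-arming logic in the draft above can be modelled in plain userspace C.
This is only a sketch under stated assumptions: the sketch_* names are
hypothetical, the armed flag stands in for queue_delayed_work(), and the done
counter stands in for whatever the hardware reports as completed.

```c
#include <assert.h>
#include <stdbool.h>

struct sketch_txq {
	unsigned int pending;	/* descriptors queued and not yet reclaimed */
	unsigned int done;	/* of those, how many the hardware has finished */
	bool armed;		/* monitor is scheduled to run again */
};

/* Free every descriptor the hardware has finished with. */
static unsigned int sketch_reclaim(struct sketch_txq *q)
{
	unsigned int n = q->done;

	assert(n <= q->pending);
	q->pending -= n;
	q->done = 0;
	return n;
}

/* Transmit path: queue one descriptor and make sure the monitor will run. */
static void sketch_xmit(struct sketch_txq *q)
{
	q->pending++;
	q->armed = true;
}

/* Monitor: reclaim, then re-arm only while descriptors remain outstanding. */
static void sketch_monitor(struct sketch_txq *q)
{
	q->armed = false;
	sketch_reclaim(q);

	if (q->pending == 0)
		return;

	q->armed = true;
}
```

The point of the shape is that the monitor stops rescheduling itself once the
queue drains, so an idle interface costs nothing, while any new transmit arms
it again.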


>> Dave, what's your take here? Should we keep a driver from going upstream
>> just because the hardware is partly broken? I'd really prefer to have an
>> upstream driver on that SoC rather than some random (eventually even
>> more broken) downstream code.
> 
> We can certainly have a slow driver for this hardware, and I'd much
> prefer slow over broken. I'd guess that some of the performance impact
> of the missing interrupts can now be offset with the xmit_more logic.
> 
> 	Arnd


* [net-next PATCH 5/6] ethernet/realtek: use napi_alloc_skb instead of netdev_alloc_skb_ip_align
From: Alexander Duyck @ 2014-12-10  3:41 UTC (permalink / raw)
  To: netdev; +Cc: Realtek linux nic maintainers, ast, davem, eric.dumazet, brouer
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

This replaces most of the calls to netdev_alloc_skb_ip_align in the Realtek
drivers.  The one instance I didn't replace in 8139cp.c is because it was
called as a part of init and as such is not always accessed from the
softirq context.

Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/realtek/8139cp.c  |    2 +-
 drivers/net/ethernet/realtek/8139too.c |    2 +-
 drivers/net/ethernet/realtek/r8169.c   |    2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/realtek/8139cp.c b/drivers/net/ethernet/realtek/8139cp.c
index 75b1693..9c31e46 100644
--- a/drivers/net/ethernet/realtek/8139cp.c
+++ b/drivers/net/ethernet/realtek/8139cp.c
@@ -507,7 +507,7 @@ rx_status_loop:
 		netif_dbg(cp, rx_status, dev, "rx slot %d status 0x%x len %d\n",
 			  rx_tail, status, len);
 
-		new_skb = netdev_alloc_skb_ip_align(dev, buflen);
+		new_skb = napi_alloc_skb(napi, buflen);
 		if (!new_skb) {
 			dev->stats.rx_dropped++;
 			goto rx_next;
diff --git a/drivers/net/ethernet/realtek/8139too.c b/drivers/net/ethernet/realtek/8139too.c
index 63dc0f9..6d0b9df 100644
--- a/drivers/net/ethernet/realtek/8139too.c
+++ b/drivers/net/ethernet/realtek/8139too.c
@@ -2037,7 +2037,7 @@ keep_pkt:
 		/* Malloc up new buffer, compatible with net-2e. */
 		/* Omit the four octet CRC from the length. */
 
-		skb = netdev_alloc_skb_ip_align(dev, pkt_size);
+		skb = napi_alloc_skb(&tp->napi, pkt_size);
 		if (likely(skb)) {
 #if RX_BUF_IDX == 3
 			wrap_copy(skb, rx_ring, ring_offset+4, pkt_size);
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 1620184..54a3ec1 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -7269,7 +7269,7 @@ static struct sk_buff *rtl8169_try_rx_copy(void *data,
 	data = rtl8169_align(data);
 	dma_sync_single_for_cpu(d, addr, pkt_size, DMA_FROM_DEVICE);
 	prefetch(data);
-	skb = netdev_alloc_skb_ip_align(tp->dev, pkt_size);
+	skb = napi_alloc_skb(&tp->napi, pkt_size);
 	if (skb)
 		memcpy(skb->data, data, pkt_size);
 	dma_sync_single_for_device(d, addr, pkt_size, DMA_FROM_DEVICE);


* [net-next PATCH 4/6] cxgb: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align
From: Alexander Duyck @ 2014-12-10  3:41 UTC (permalink / raw)
  To: netdev; +Cc: ast, davem, eric.dumazet, brouer
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

In order to use napi_alloc_skb I needed to pass a pointer to struct adapter
instead of struct pci_dev.  This allowed me to access &adapter->napi.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/chelsio/cxgb/sge.c |   11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb/sge.c b/drivers/net/ethernet/chelsio/cxgb/sge.c
index 86222a1..babe2a9 100644
--- a/drivers/net/ethernet/chelsio/cxgb/sge.c
+++ b/drivers/net/ethernet/chelsio/cxgb/sge.c
@@ -1025,7 +1025,7 @@ MODULE_PARM_DESC(copybreak, "Receive copy threshold");
 
 /**
  *	get_packet - return the next ingress packet buffer
- *	@pdev: the PCI device that received the packet
+ *	@adapter: the adapter that received the packet
  *	@fl: the SGE free list holding the packet
  *	@len: the actual packet length, excluding any SGE padding
  *
@@ -1037,14 +1037,15 @@ MODULE_PARM_DESC(copybreak, "Receive copy threshold");
  *	threshold and the packet is too big to copy, or (b) the packet should
  *	be copied but there is no memory for the copy.
  */
-static inline struct sk_buff *get_packet(struct pci_dev *pdev,
+static inline struct sk_buff *get_packet(struct adapter *adapter,
 					 struct freelQ *fl, unsigned int len)
 {
-	struct sk_buff *skb;
 	const struct freelQ_ce *ce = &fl->centries[fl->cidx];
+	struct pci_dev *pdev = adapter->pdev;
+	struct sk_buff *skb;
 
 	if (len < copybreak) {
-		skb = netdev_alloc_skb_ip_align(NULL, len);
+		skb = napi_alloc_skb(&adapter->napi, len);
 		if (!skb)
 			goto use_orig_buf;
 
@@ -1357,7 +1358,7 @@ static void sge_rx(struct sge *sge, struct freelQ *fl, unsigned int len)
 	struct sge_port_stats *st;
 	struct net_device *dev;
 
-	skb = get_packet(adapter->pdev, fl, len - sge->rx_pkt_pad);
+	skb = get_packet(adapter, fl, len - sge->rx_pkt_pad);
 	if (unlikely(!skb)) {
 		sge->stats.rx_drops++;
 		return;


* [net-next PATCH 3/6] ethernet/intel: Use napi_alloc_skb
From: Alexander Duyck @ 2014-12-10  3:40 UTC (permalink / raw)
  To: netdev; +Cc: brouer, ast, Jeff Kirsher, eric.dumazet, davem
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

This change replaces calls to netdev_alloc_skb_ip_align with
napi_alloc_skb.  The advantage of napi_alloc_skb is currently the fact that
the page allocation doesn't make use of any irq disable calls.

There are a few spots where I couldn't replace the calls as the buffer
allocation routine is called as a part of init which is outside of the
softirq context.

Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 drivers/net/ethernet/intel/e1000/e1000_main.c |    2 +-
 drivers/net/ethernet/intel/e1000e/netdev.c    |    2 +-
 drivers/net/ethernet/intel/fm10k/fm10k_main.c |    4 ++--
 drivers/net/ethernet/intel/igb/igb_main.c     |    3 +--
 drivers/net/ethernet/intel/ixgb/ixgb_main.c   |    6 +++---
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c |    4 ++--
 6 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 862d198..83140cb 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -4100,7 +4100,7 @@ static bool e1000_tbi_should_accept(struct e1000_adapter *adapter,
 static struct sk_buff *e1000_alloc_rx_skb(struct e1000_adapter *adapter,
 					  unsigned int bufsz)
 {
-	struct sk_buff *skb = netdev_alloc_skb_ip_align(adapter->netdev, bufsz);
+	struct sk_buff *skb = napi_alloc_skb(&adapter->napi, bufsz);
 
 	if (unlikely(!skb))
 		adapter->alloc_rx_buff_failed++;
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 88936aa..5c82c80 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -1016,7 +1016,7 @@ static bool e1000_clean_rx_irq(struct e1000_ring *rx_ring, int *work_done,
 		 */
 		if (length < copybreak) {
 			struct sk_buff *new_skb =
-			    netdev_alloc_skb_ip_align(netdev, length);
+				napi_alloc_skb(&adapter->napi, length);
 			if (new_skb) {
 				skb_copy_to_linear_data_offset(new_skb,
 							       -NET_IP_ALIGN,
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
index f0d9d6c..3418665 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c
@@ -308,8 +308,8 @@ static struct sk_buff *fm10k_fetch_rx_buffer(struct fm10k_ring *rx_ring,
 #endif
 
 		/* allocate a skb to store the frags */
-		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-						FM10K_RX_HDR_LEN);
+		skb = napi_alloc_skb(&rx_ring->q_vector->napi,
+				     FM10K_RX_HDR_LEN);
 		if (unlikely(!skb)) {
 			rx_ring->rx_stats.alloc_failed++;
 			return NULL;
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index bfd60577..675a330 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -6641,8 +6641,7 @@ static struct sk_buff *igb_fetch_rx_buffer(struct igb_ring *rx_ring,
 #endif
 
 		/* allocate a skb to store the frags */
-		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-						IGB_RX_HDR_LEN);
+		skb = napi_alloc_skb(&rx_ring->q_vector->napi, IGB_RX_HDR_LEN);
 		if (unlikely(!skb)) {
 			rx_ring->rx_stats.alloc_failed++;
 			return NULL;
diff --git a/drivers/net/ethernet/intel/ixgb/ixgb_main.c b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
index 055961b..aa87605 100644
--- a/drivers/net/ethernet/intel/ixgb/ixgb_main.c
+++ b/drivers/net/ethernet/intel/ixgb/ixgb_main.c
@@ -1963,7 +1963,7 @@ ixgb_rx_checksum(struct ixgb_adapter *adapter,
  * this should improve performance for small packets with large amounts
  * of reassembly being done in the stack
  */
-static void ixgb_check_copybreak(struct net_device *netdev,
+static void ixgb_check_copybreak(struct napi_struct *napi,
 				 struct ixgb_buffer *buffer_info,
 				 u32 length, struct sk_buff **skb)
 {
@@ -1972,7 +1972,7 @@ static void ixgb_check_copybreak(struct net_device *netdev,
 	if (length > copybreak)
 		return;
 
-	new_skb = netdev_alloc_skb_ip_align(netdev, length);
+	new_skb = napi_alloc_skb(napi, length);
 	if (!new_skb)
 		return;
 
@@ -2064,7 +2064,7 @@ ixgb_clean_rx_irq(struct ixgb_adapter *adapter, int *work_done, int work_to_do)
 			goto rxdesc_done;
 		}
 
-		ixgb_check_copybreak(netdev, buffer_info, length, &skb);
+		ixgb_check_copybreak(&adapter->napi, buffer_info, length, &skb);
 
 		/* Good Receive */
 		skb_put(skb, length);
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
index 94feddf..d75d01b 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
@@ -1909,8 +1909,8 @@ static struct sk_buff *ixgbe_fetch_rx_buffer(struct ixgbe_ring *rx_ring,
 #endif
 
 		/* allocate a skb to store the frags */
-		skb = netdev_alloc_skb_ip_align(rx_ring->netdev,
-						IXGBE_RX_HDR_SIZE);
+		skb = napi_alloc_skb(&rx_ring->q_vector->napi,
+				     IXGBE_RX_HDR_SIZE);
 		if (unlikely(!skb)) {
 			rx_ring->rx_stats.alloc_rx_buff_failed++;
 			return NULL;


* [net-next PATCH 2/6] net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb
From: Alexander Duyck @ 2014-12-10  3:40 UTC (permalink / raw)
  To: netdev; +Cc: ast, davem, eric.dumazet, brouer
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

This change pulls the core functionality out of __netdev_alloc_skb and
places it in a new function named __alloc_rx_skb.  The reason for doing
this is to make these bits accessible to a new function __napi_alloc_skb.
In addition __alloc_rx_skb now has a new flags value that is used to
determine which page frag pool to allocate from.  If the SKB_ALLOC_NAPI
flag is set then the NAPI pool is used.  The advantage of this is that we
do not have to use local_irq_save/restore when accessing the NAPI pool from
NAPI context.

In my test setup I saw at least 11ns of savings using the napi_alloc_skb
function versus the netdev_alloc_skb function, most of this being due to
the fact that we didn't have to call local_irq_save/restore.

The main use case for napi_alloc_skb would be for things such as copybreak
or page fragment based receive paths where an skb is allocated after the
data has been received instead of before.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 include/linux/skbuff.h |    9 ++++++
 net/core/dev.c         |    2 +
 net/core/skbuff.c      |   74 +++++++++++++++++++++++++++++++++++++++++++-----
 3 files changed, 77 insertions(+), 8 deletions(-)
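
To illustrate the split described above, a minimal userspace sketch: one
shared core allocator picks a pool based on a flag, and two thin wrappers add
their own padding to the length and then reserve it as headroom. The sketch_*
names and the pad sizes are hypothetical stand-ins (the real NET_SKB_PAD and
NET_IP_ALIGN values depend on the architecture), and counters take the place
of the two page frag pools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_ALLOC_NAPI 0x04	/* plays the role of SKB_ALLOC_NAPI */
#define SKETCH_PAD        64u	/* hypothetical stand-in for NET_SKB_PAD */
#define SKETCH_IP_ALIGN   2u	/* hypothetical stand-in for NET_IP_ALIGN */

struct sketch_buf {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of packet data, after the headroom */
	size_t size;		/* total bytes allocated */
};

/* Counters standing in for the two pools. */
static unsigned int sketch_napi_allocs, sketch_netdev_allocs;

/* Shared core: knows nothing about padding, only which pool to use. */
static int sketch_alloc_rx(struct sketch_buf *b, size_t length, int flags)
{
	assert(b != NULL);

	if (flags & SKETCH_ALLOC_NAPI)
		sketch_napi_allocs++;
	else
		sketch_netdev_allocs++;

	b->head = malloc(length);
	b->data = b->head;
	b->size = length;
	return b->head ? 0 : -1;
}

/* Wrapper 1: pad only, default pool. */
static int sketch_netdev_alloc(struct sketch_buf *b, size_t length)
{
	if (sketch_alloc_rx(b, length + SKETCH_PAD, 0))
		return -1;
	b->data += SKETCH_PAD;	/* reserve the headroom */
	return 0;
}

/* Wrapper 2: pad plus IP alignment, NAPI pool. */
static int sketch_napi_alloc(struct sketch_buf *b, size_t length)
{
	size_t pad = SKETCH_PAD + SKETCH_IP_ALIGN;

	if (sketch_alloc_rx(b, length + pad, SKETCH_ALLOC_NAPI))
		return -1;
	b->data += pad;		/* reserve the headroom */
	return 0;
}
```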

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index b2b53b0..af79302 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -151,6 +151,7 @@ struct net_device;
 struct scatterlist;
 struct pipe_inode_info;
 struct iov_iter;
+struct napi_struct;
 
 #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
 struct nf_conntrack {
@@ -673,6 +674,7 @@ struct sk_buff {
 
 #define SKB_ALLOC_FCLONE	0x01
 #define SKB_ALLOC_RX		0x02
+#define SKB_ALLOC_NAPI		0x04
 
 /* Returns true if the skb was allocated from PFMEMALLOC reserves */
 static inline bool skb_pfmemalloc(const struct sk_buff *skb)
@@ -2165,6 +2167,13 @@ static inline struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev,
 }
 
 void *napi_alloc_frag(unsigned int fragsz);
+struct sk_buff *__napi_alloc_skb(struct napi_struct *napi,
+				 unsigned int length, gfp_t gfp_mask);
+static inline struct sk_buff *napi_alloc_skb(struct napi_struct *napi,
+					     unsigned int length)
+{
+	return __napi_alloc_skb(napi, length, GFP_ATOMIC);
+}
 
 /**
  * __dev_alloc_pages - allocate page for network Rx
diff --git a/net/core/dev.c b/net/core/dev.c
index 3f191da..80f798d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4172,7 +4172,7 @@ struct sk_buff *napi_get_frags(struct napi_struct *napi)
 	struct sk_buff *skb = napi->skb;
 
 	if (!skb) {
-		skb = netdev_alloc_skb_ip_align(napi->dev, GRO_MAX_HEAD);
+		skb = napi_alloc_skb(napi, GRO_MAX_HEAD);
 		napi->skb = skb;
 	}
 	return skb;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 56ed17c..ae13ef6 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -444,10 +444,13 @@ void *napi_alloc_frag(unsigned int fragsz)
 EXPORT_SYMBOL(napi_alloc_frag);
 
 /**
- *	__netdev_alloc_skb - allocate an skbuff for rx on a specific device
- *	@dev: network device to receive on
+ *	__alloc_rx_skb - allocate an skbuff for rx
  *	@length: length to allocate
  *	@gfp_mask: get_free_pages mask, passed to alloc_skb
+ *	@flags:	If SKB_ALLOC_RX is set, __GFP_MEMALLOC will be used for
+ *		allocations in case we have to fallback to __alloc_skb()
+ *		If SKB_ALLOC_NAPI is set, page fragment will be allocated
+ *		from napi_cache instead of netdev_cache.
  *
  *	Allocate a new &sk_buff and assign it a usage count of one. The
  *	buffer has unspecified headroom built in. Users should allocate
@@ -456,11 +459,11 @@ EXPORT_SYMBOL(napi_alloc_frag);
  *
  *	%NULL is returned if there is no free memory.
  */
-struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
-				   unsigned int length, gfp_t gfp_mask)
+static struct sk_buff *__alloc_rx_skb(unsigned int length, gfp_t gfp_mask,
+				      int flags)
 {
 	struct sk_buff *skb = NULL;
-	unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
+	unsigned int fragsz = SKB_DATA_ALIGN(length) +
 			      SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	if (fragsz <= PAGE_SIZE && !(gfp_mask & (__GFP_WAIT | GFP_DMA))) {
@@ -469,7 +472,9 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 		if (sk_memalloc_socks())
 			gfp_mask |= __GFP_MEMALLOC;
 
-		data = __netdev_alloc_frag(fragsz, gfp_mask);
+		data = (flags & SKB_ALLOC_NAPI) ?
+			__napi_alloc_frag(fragsz, gfp_mask) :
+			__netdev_alloc_frag(fragsz, gfp_mask);
 
 		if (likely(data)) {
 			skb = build_skb(data, fragsz);
@@ -477,17 +482,72 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
 				put_page(virt_to_head_page(data));
 		}
 	} else {
-		skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask,
+		skb = __alloc_skb(length, gfp_mask,
 				  SKB_ALLOC_RX, NUMA_NO_NODE);
 	}
+	return skb;
+}
+
+/**
+ *	__netdev_alloc_skb - allocate an skbuff for rx on a specific device
+ *	@dev: network device to receive on
+ *	@length: length to allocate
+ *	@gfp_mask: get_free_pages mask, passed to alloc_skb
+ *
+ *	Allocate a new &sk_buff and assign it a usage count of one. The
+ *	buffer has NET_SKB_PAD headroom built in. Users should allocate
+ *	the headroom they think they need without accounting for the
+ *	built in space. The built in space is used for optimisations.
+ *
+ *	%NULL is returned if there is no free memory.
+ */
+struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
+				   unsigned int length, gfp_t gfp_mask)
+{
+	struct sk_buff *skb;
+
+	length += NET_SKB_PAD;
+	skb = __alloc_rx_skb(length, gfp_mask, 0);
+
 	if (likely(skb)) {
 		skb_reserve(skb, NET_SKB_PAD);
 		skb->dev = dev;
 	}
+
 	return skb;
 }
 EXPORT_SYMBOL(__netdev_alloc_skb);
 
+/**
+ *	__napi_alloc_skb - allocate skbuff for rx in a specific NAPI instance
+ *	@napi: napi instance this buffer was allocated for
+ *	@length: length to allocate
+ *	@gfp_mask: get_free_pages mask, passed to alloc_skb and alloc_pages
+ *
+ *	Allocate a new sk_buff for use in NAPI receive.  This buffer will
+ *	attempt to allocate the head from a special reserved region used
+ *	only for NAPI Rx allocation.  By doing this we can save several
+ *	CPU cycles by avoiding having to disable and re-enable IRQs.
+ *
+ *	%NULL is returned if there is no free memory.
+ */
+struct sk_buff *__napi_alloc_skb(struct napi_struct *napi,
+				 unsigned int length, gfp_t gfp_mask)
+{
+	struct sk_buff *skb;
+
+	length += NET_SKB_PAD + NET_IP_ALIGN;
+	skb = __alloc_rx_skb(length, gfp_mask, SKB_ALLOC_NAPI);
+
+	if (likely(skb)) {
+		skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN);
+		skb->dev = napi->dev;
+	}
+
+	return skb;
+}
+EXPORT_SYMBOL(__napi_alloc_skb);
+
 void skb_add_rx_frag(struct sk_buff *skb, int i, struct page *page, int off,
 		     int size, unsigned int truesize)
 {

^ permalink raw reply related

* [net-next PATCH 1/6] net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag
From: Alexander Duyck @ 2014-12-10  3:40 UTC (permalink / raw)
  To: netdev; +Cc: ast, davem, eric.dumazet, brouer
In-Reply-To: <20141210033902.2114.68658.stgit@ahduyck-vm-fedora20>

This patch splits the netdev_alloc_frag function up so that it can be used
on one of two page frag pools instead of being fixed on the
netdev_alloc_cache.  By doing this we can add a NAPI specific function
__napi_alloc_frag that accesses a pool that is only used from softirq
context.  The advantage to this is that we do not need to call
local_irq_save/restore which can be a significant savings.

I also took the opportunity to refactor the core bits that were placed in
__alloc_page_frag.  First I updated the allocation to do either a 32K
allocation or an order 0 page.  This is based on the changes in commit
d9b2938aa where it was found that latencies could be reduced in case of
failures.  Then I also rewrote the logic to work from the end of the page to
the start.  By doing this the size value doesn't have to be used unless we
have run out of space for page fragments.  Finally I cleaned up the atomic
bits so that we just do an atomic_sub_and_test and if that returns true then
we set the page->_count via an atomic_set.  This way we can remove the extra
conditional for the atomic_read since it would have led to an atomic_inc in
the case of success anyway.
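
As a rough illustration of the "end of the page to the start" logic described above, here is a minimal userspace sketch. It is not the kernel code: struct frag_pool, frag_alloc() and FRAG_POOL_SIZE are hypothetical names, the buffer is a plain array rather than a page, and the page->_count / pagecnt_bias accounting is left out entirely, so the pool is simply reused once it runs out of room.

```c
#include <assert.h>
#include <stddef.h>

#define FRAG_POOL_SIZE 4096u	/* stand-in for the page size (assumption) */

/* Hypothetical userspace stand-in for struct netdev_alloc_cache. */
struct frag_pool {
	unsigned char buf[FRAG_POOL_SIZE];
	int offset;		/* start of the last fragment; counts down */
	unsigned int refills;	/* number of times the pool was reset */
	int initialised;
};

/* Hand out fragsz bytes, carving from the end of the buffer downward. */
static void *frag_alloc(struct frag_pool *p, unsigned int fragsz)
{
	int offset;

	assert(fragsz <= FRAG_POOL_SIZE);

	if (!p->initialised) {
		p->offset = (int)FRAG_POOL_SIZE;
		p->initialised = 1;
	}

	/* Fast path: one subtraction and a sign test, no size lookup. */
	offset = p->offset - (int)fragsz;
	if (offset < 0) {
		/* Out of room: only here is the pool size needed again. */
		p->refills++;
		offset = (int)FRAG_POOL_SIZE - (int)fragsz;
	}

	p->offset = offset;
	return p->buf + offset;
}
```

With a 4096 byte pool, requests of 1024, 2048 and 2048 bytes would land at offsets 3072, 1024 and 2048, the last one after a single reset.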

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 include/linux/skbuff.h |    2 +
 net/core/skbuff.c      |  117 ++++++++++++++++++++++++++++++++----------------
 2 files changed, 79 insertions(+), 40 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ef64cec..b2b53b0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2164,6 +2164,8 @@ static inline struct sk_buff *netdev_alloc_skb_ip_align(struct net_device *dev,
 	return __netdev_alloc_skb_ip_align(dev, length, GFP_ATOMIC);
 }
 
+void *napi_alloc_frag(unsigned int fragsz);
+
 /**
  * __dev_alloc_pages - allocate page for network Rx
  * @gfp_mask: allocation priority. Set __GFP_NOMEMALLOC if not for network Rx
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7a338fb..56ed17c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -336,59 +336,85 @@ struct netdev_alloc_cache {
 	unsigned int		pagecnt_bias;
 };
 static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache);
+static DEFINE_PER_CPU(struct netdev_alloc_cache, napi_alloc_cache);
 
-static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask)
+static struct page *__page_frag_refill(struct netdev_alloc_cache *nc,
+				       gfp_t gfp_mask)
 {
-	struct netdev_alloc_cache *nc;
-	void *data = NULL;
-	int order;
-	unsigned long flags;
+	const unsigned int order = NETDEV_FRAG_PAGE_MAX_ORDER;
+	struct page *page = NULL;
+	gfp_t gfp = gfp_mask;
+
+	if (order) {
+		gfp_mask |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
+		page = alloc_pages_node(NUMA_NO_NODE, gfp_mask, order);
+		nc->frag.size = PAGE_SIZE << (page ? order : 0);
+	}
 
-	local_irq_save(flags);
-	nc = this_cpu_ptr(&netdev_alloc_cache);
-	if (unlikely(!nc->frag.page)) {
+	if (unlikely(!page))
+		page = alloc_pages_node(NUMA_NO_NODE, gfp, 0);
+
+	nc->frag.page = page;
+
+	return page;
+}
+
+static void *__alloc_page_frag(struct netdev_alloc_cache __percpu *cache,
+			       unsigned int fragsz, gfp_t gfp_mask)
+{
+	struct netdev_alloc_cache *nc = this_cpu_ptr(cache);
+	struct page *page = nc->frag.page;
+	unsigned int size;
+	int offset;
+
+	if (unlikely(!page)) {
 refill:
-		for (order = NETDEV_FRAG_PAGE_MAX_ORDER; ;) {
-			gfp_t gfp = gfp_mask;
+		page = __page_frag_refill(nc, gfp_mask);
+		if (!page)
+			return NULL;
+
+		/* if size can vary use frag.size else just use PAGE_SIZE */
+		size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE;
 
-			if (order)
-				gfp |= __GFP_COMP | __GFP_NOWARN;
-			nc->frag.page = alloc_pages(gfp, order);
-			if (likely(nc->frag.page))
-				break;
-			if (--order < 0)
-				goto end;
-		}
-		nc->frag.size = PAGE_SIZE << order;
 		/* Even if we own the page, we do not use atomic_set().
 		 * This would break get_page_unless_zero() users.
 		 */
-		atomic_add(NETDEV_PAGECNT_MAX_BIAS - 1,
-			   &nc->frag.page->_count);
-		nc->pagecnt_bias = NETDEV_PAGECNT_MAX_BIAS;
-		nc->frag.offset = 0;
+		atomic_add(size - 1, &page->_count);
+
+		/* reset page count bias and offset to start of new frag */
+		nc->pagecnt_bias = size;
+		nc->frag.offset = size;
 	}
 
-	if (nc->frag.offset + fragsz > nc->frag.size) {
-		if (atomic_read(&nc->frag.page->_count) != nc->pagecnt_bias) {
-			if (!atomic_sub_and_test(nc->pagecnt_bias,
-						 &nc->frag.page->_count))
-				goto refill;
-			/* OK, page count is 0, we can safely set it */
-			atomic_set(&nc->frag.page->_count,
-				   NETDEV_PAGECNT_MAX_BIAS);
-		} else {
-			atomic_add(NETDEV_PAGECNT_MAX_BIAS - nc->pagecnt_bias,
-				   &nc->frag.page->_count);
-		}
-		nc->pagecnt_bias = NETDEV_PAGECNT_MAX_BIAS;
-		nc->frag.offset = 0;
+	offset = nc->frag.offset - fragsz;
+	if (unlikely(offset < 0)) {
+		if (!atomic_sub_and_test(nc->pagecnt_bias, &page->_count))
+			goto refill;
+
+		/* if size can vary use frag.size else just use PAGE_SIZE */
+		size = NETDEV_FRAG_PAGE_MAX_ORDER ? nc->frag.size : PAGE_SIZE;
+
+		/* OK, page count is 0, we can safely set it */
+		atomic_set(&page->_count, size);
+
+		/* reset page count bias and offset to start of new frag */
+		nc->pagecnt_bias = size;
+		offset = size - fragsz;
 	}
 
-	data = page_address(nc->frag.page) + nc->frag.offset;
-	nc->frag.offset += fragsz;
 	nc->pagecnt_bias--;
-end:
+	nc->frag.offset = offset;
+
+	return page_address(page) + offset;
+}
+
+static void *__netdev_alloc_frag(unsigned int fragsz, gfp_t gfp_mask)
+{
+	unsigned long flags;
+	void *data;
+
+	local_irq_save(flags);
+	data = __alloc_page_frag(&netdev_alloc_cache, fragsz, gfp_mask);
 	local_irq_restore(flags);
 	return data;
 }
@@ -406,6 +432,17 @@ void *netdev_alloc_frag(unsigned int fragsz)
 }
 EXPORT_SYMBOL(netdev_alloc_frag);
 
+static void *__napi_alloc_frag(unsigned int fragsz, gfp_t gfp_mask)
+{
+	return __alloc_page_frag(&napi_alloc_cache, fragsz, gfp_mask);
+}
+
+void *napi_alloc_frag(unsigned int fragsz)
+{
+	return __napi_alloc_frag(fragsz, GFP_ATOMIC | __GFP_COLD);
+}
+EXPORT_SYMBOL(napi_alloc_frag);
+
 /**
  *	__netdev_alloc_skb - allocate an skbuff for rx on a specific device
  *	@dev: network device to receive on

^ permalink raw reply related

* [net-next PATCH 0/6] net: Alloc NAPI page frags from their own pool
From: Alexander Duyck @ 2014-12-10  3:40 UTC (permalink / raw)
  To: netdev; +Cc: ast, davem, eric.dumazet, brouer

This patch series implements a means of allocating page fragments without
the need for the local_irq_save/restore in __netdev_alloc_frag.  By doing
this I am able to decrease packet processing time by 11ns per packet in my
test environment.

---

Alexander Duyck (6):
      net: Split netdev_alloc_frag into __alloc_page_frag and add __napi_alloc_frag
      net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb
      ethernet/intel: Use napi_alloc_skb
      cxgb: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align
      ethernet/realtek: use napi_alloc_skb instead of netdev_alloc_skb_ip_align
      ethernet/broadcom: Use napi_alloc_skb instead of netdev_alloc_skb_ip_align


 drivers/net/ethernet/broadcom/b44.c             |    2 
 drivers/net/ethernet/broadcom/bcm63xx_enet.c    |    2 
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    2 
 drivers/net/ethernet/chelsio/cxgb/sge.c         |   11 +
 drivers/net/ethernet/intel/e1000/e1000_main.c   |    2 
 drivers/net/ethernet/intel/e1000e/netdev.c      |    2 
 drivers/net/ethernet/intel/fm10k/fm10k_main.c   |    4 
 drivers/net/ethernet/intel/igb/igb_main.c       |    3 
 drivers/net/ethernet/intel/ixgb/ixgb_main.c     |    6 -
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c   |    4 
 drivers/net/ethernet/realtek/8139cp.c           |    2 
 drivers/net/ethernet/realtek/8139too.c          |    2 
 drivers/net/ethernet/realtek/r8169.c            |    2 
 include/linux/skbuff.h                          |   11 +
 net/core/dev.c                                  |    2 
 net/core/skbuff.c                               |  191 +++++++++++++++++------
 16 files changed, 178 insertions(+), 70 deletions(-)

--

^ permalink raw reply

* [PATCH net-next] ipv6: fix sparse warning
From: Ying Xue @ 2014-12-10  3:28 UTC (permalink / raw)
  To: davem; +Cc: netdev

This fixes the following sparse warning when using

make C=1 CF=-D__CHECK_ENDIAN__ net/ipv6/addrconf.o
net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces)
net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces)
net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces)
net/ipv6/addrconf.c:3495:9: error: incompatible types in comparison expression (different address spaces)

To silence the above sparse complaint, an RCU annotation should be added
to next pointer of hlist_node structure through hlist_next_rcu() macro
when iterating over a hlist with hlist_for_each_entry_continue_rcu_bh().

By the way, this commit also resolves the same error appearing in
hlist_for_each_entry_continue_rcu().
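
To make the "continue" iteration concrete, here is a small userspace sketch. Only the shape of hlist_next_rcu() follows the macro used in the patch; __rcu is defined empty (as it is outside sparse), and demo_rcu_dereference(), struct item, item_of() and sum_after() are hypothetical stand-ins with no real RCU semantics.

```c
#include <assert.h>
#include <stddef.h>

#ifndef __rcu
#define __rcu	/* sparse address-space annotation; empty outside sparse */
#endif

struct hlist_node { struct hlist_node *next; };

/* Same shape as the kernel macro: view ->next as an __rcu pointer. */
#define hlist_next_rcu(node) \
	(*((struct hlist_node __rcu **)(&(node)->next)))

/* Stand-in for rcu_dereference_raw(): a plain load, no barriers. */
#define demo_rcu_dereference(p) (p)

struct item {
	int val;
	struct hlist_node node;
};

/* Like hlist_entry_safe(): map a node to its container, NULL stays NULL. */
static struct item *item_of(struct hlist_node *n)
{
	return n ? (struct item *)((char *)n - offsetof(struct item, node))
		 : NULL;
}

/* Sum every entry after pos, i.e. continue past the current point. */
static int sum_after(struct item *pos)
{
	int sum = 0;

	assert(pos != NULL);

	for (pos = item_of(demo_rcu_dereference(hlist_next_rcu(&pos->node)));
	     pos;
	     pos = item_of(demo_rcu_dereference(hlist_next_rcu(&pos->node))))
		sum += pos->val;

	return sum;
}
```

The point of routing the access through hlist_next_rcu() is that, under sparse, the pointer handed to the dereference helper carries the __rcu address space, which is what the warning was asking for.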

Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 include/linux/rculist.h |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index 866d9c9..32bd4ad 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -524,11 +524,11 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
  * @member:	the name of the hlist_node within the struct.
  */
 #define hlist_for_each_entry_continue_rcu(pos, member)			\
-	for (pos = hlist_entry_safe(rcu_dereference((pos)->member.next),\
-			typeof(*(pos)), member);			\
+	for (pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu( \
+			&(pos)->member)), typeof(*(pos)), member);	\
 	     pos;							\
-	     pos = hlist_entry_safe(rcu_dereference((pos)->member.next),\
-			typeof(*(pos)), member))
+	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(	\
+			&(pos)->member)), typeof(*(pos)), member))
 
 /**
  * hlist_for_each_entry_continue_rcu_bh - iterate over a hlist continuing after current point
@@ -536,11 +536,11 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
  * @member:	the name of the hlist_node within the struct.
  */
 #define hlist_for_each_entry_continue_rcu_bh(pos, member)		\
-	for (pos = hlist_entry_safe(rcu_dereference_bh((pos)->member.next),\
-			typeof(*(pos)), member);			\
+	for (pos = hlist_entry_safe(rcu_dereference_bh(hlist_next_rcu(  \
+			&(pos)->member)), typeof(*(pos)), member);	\
 	     pos;							\
-	     pos = hlist_entry_safe(rcu_dereference_bh((pos)->member.next),\
-			typeof(*(pos)), member))
+	     pos = hlist_entry_safe(rcu_dereference_bh(hlist_next_rcu(	\
+			&(pos)->member)), typeof(*(pos)), member))
 
 /**
  * hlist_for_each_entry_from_rcu - iterate over a hlist continuing from current point
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH 1/1] net: dsa: Fix of kernel panic in case of missing PHY.
From: Florian Fainelli @ 2014-12-10  3:23 UTC (permalink / raw)
  To: Andrey Volkov, netdev, Brian Norris
In-Reply-To: <54873205.30401@nexvision.fr>

On 09/12/14 09:31, Andrey Volkov wrote:
> Fix of kernel panic in case of missing PHY.
> 
> Signed-off-by: Andrey Volkov <andrey.volkov@nexvision.fr>

Brian has actually been able to reproduce such a crash in this code-path
today:

if (!p->phy) {
	p->phy = ds->slave_mii_bus->phy_map[p->port];
	phy_connect_direct(slave_dev, p->phy,
			   dsa_slave_adjust_link,
			   p->phy_interface);
}

We basically assume here that we have a valid phy pointer out of
ds->slave_mii_bus->phy_map[p->port] which is not true in all cases,
especially not if the device is not there.
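
A minimal userspace sketch of the missing check, under stated assumptions: struct demo_phy, struct demo_bus, struct demo_port and demo_phy_setup() are hypothetical stand-ins for the DSA structures, and setting a flag stands in for phy_connect_direct(). It only illustrates the pattern of testing the looked-up pointer before use and is not the eventual fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_PORTS 4

struct demo_phy { int addr; };
struct demo_bus { struct demo_phy *phy_map[DEMO_MAX_PORTS]; };

struct demo_port {
	int port;
	struct demo_phy *phy;
	int connected;
};

/* Return 0 on success, -ENODEV when no PHY sits at the port's address. */
static int demo_phy_setup(struct demo_port *p, struct demo_bus *bus)
{
	assert(p->port >= 0 && p->port < DEMO_MAX_PORTS);

	if (!p->phy) {
		p->phy = bus->phy_map[p->port];
		if (!p->phy)
			return -ENODEV;	/* nothing to connect to */

		p->connected = 1;	/* stands in for phy_connect_direct() */
	}

	return 0;
}
```

A caller can then treat a non-zero return as "no PHY at this port" instead of dereferencing a NULL pointer later on.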

I will come up with a fix for that. As for propagating the error code
down to the caller, that can be a separate patch.

Thanks!

> ---
>  net/dsa/slave.c |   19 +++++++++++++++----
>  1 file changed, 15 insertions(+), 4 deletions(-)
> 
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 528380a..6f89caa 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -512,7 +512,7 @@ static int dsa_slave_fixed_link_update(struct net_device *dev,
>  }
>  
>  /* slave device setup *******************************************************/
> -static void dsa_slave_phy_setup(struct dsa_slave_priv *p,
> +static int dsa_slave_phy_setup(struct dsa_slave_priv *p,
>  				struct net_device *slave_dev)
>  {
>  	struct dsa_switch *ds = p->parent;
> @@ -533,7 +533,7 @@ static void dsa_slave_phy_setup(struct dsa_slave_priv *p,
>  		ret = of_phy_register_fixed_link(port_dn);
>  		if (ret) {
>  			netdev_err(slave_dev, "failed to register fixed PHY\n");
> -			return;
> +			return ret;
>  		}
>  		phy_is_fixed = true;
>  		phy_dn = port_dn;
> @@ -555,12 +555,17 @@ static void dsa_slave_phy_setup(struct dsa_slave_priv *p,
>  	 */
>  	if (!p->phy) {
>  		p->phy = ds->slave_mii_bus->phy_map[p->port];
> -		phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
> +		if(p->phy)
> +			phy_connect_direct(slave_dev, p->phy, dsa_slave_adjust_link,
>  				   p->phy_interface);
> +		else
> +			return -ENODEV;
> +
>  	} else {
>  		netdev_info(slave_dev, "attached PHY at address %d [%s]\n",
>  			    p->phy->addr, p->phy->drv->name);
>  	}
> +	return 0;
>  }
>  
>  int dsa_slave_suspend(struct net_device *slave_dev)
> @@ -653,7 +658,13 @@ dsa_slave_create(struct dsa_switch *ds, struct device *parent,
>  	p->old_link = -1;
>  	p->old_duplex = -1;
>  
> -	dsa_slave_phy_setup(p, slave_dev);
> +	ret = dsa_slave_phy_setup(p, slave_dev);
> +	if (ret) {
> +		netdev_err(master, "error %d registering interface %s\n",
> +			   ret, slave_dev->name);
> +		free_netdev(slave_dev);
> +		return NULL;
> +	}
>  
>  	ret = register_netdev(slave_dev);
>  	if (ret) {
> 

^ permalink raw reply

* [PATCH net-next] tipc: fix RCU sparse error
From: Ying Xue @ 2014-12-10  3:00 UTC (permalink / raw)
  To: davem; +Cc: jon.maloy, kbuild-all, erik.hugne, netdev, tipc-discussion

The commit 97ede29e80ee ("tipc: convert name table read-write lock to
RCU") introduces the following sparse warnings when using

  make ARCH=x86_64 allmodconfig
  make C=1 CF=-D__CHECK_ENDIAN__

net/tipc/name_table.c:1136:17: sparse: incompatible types in comparison expression (different address spaces)
net/tipc/name_table.c:1136:17: sparse: incompatible types in comparison expression (different address spaces)

To silence the above sparse complaint, an RCU annotation should be added
to next pointer of hlist_node structure through hlist_next_rcu() macro
when iterating over a hlist with hlist_for_each_entry_from_rcu().

By the way, this commit also simplifies the way of dereferencing
the first element of a hlist_head list by replacing
hlist_for_each_entry_rcu() with hlist_entry_safe().
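
The first-element simplification can be pictured with the following userspace sketch. struct name_entry and first_entry() are hypothetical, the list types are reduced to what the example needs, and the RCU dereference is omitted, so this shows only the "container of head->first, or NULL" idea behind hlist_entry_safe().

```c
#include <assert.h>
#include <stddef.h>

struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };

/* Hypothetical stand-in for struct name_seq. */
struct name_entry {
	int type;
	struct hlist_node list;
};

/* First entry of the list, or NULL when the list is empty. */
static struct name_entry *first_entry(struct hlist_head *head)
{
	struct hlist_node *n;

	assert(head != NULL);

	n = head->first;
	return n ? (struct name_entry *)((char *)n -
					 offsetof(struct name_entry, list))
		 : NULL;
}
```

Because the helper yields NULL for an empty list, the caller no longer needs a pre-initialised pointer and a loop that breaks on its first pass.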

Reported-by: Kbuild test robot <kbuild-all@01.org>
Cc: Kbuild test robot <kbuild-all@01.org>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
---
 include/linux/rculist.h |    4 ++--
 net/tipc/name_table.c   |    6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/rculist.h b/include/linux/rculist.h
index aa79b3c..866d9c9 100644
--- a/include/linux/rculist.h
+++ b/include/linux/rculist.h
@@ -549,8 +549,8 @@ static inline void hlist_add_behind_rcu(struct hlist_node *n,
  */
 #define hlist_for_each_entry_from_rcu(pos, member)			\
 	for (; pos;							\
-	     pos = hlist_entry_safe(rcu_dereference((pos)->member.next),\
-			typeof(*(pos)), member))
+	     pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu( \
+			&(pos)->member)), typeof(*(pos)), member))
 
 #endif	/* __KERNEL__ */
 #endif
diff --git a/net/tipc/name_table.c b/net/tipc/name_table.c
index c8df022..fa4341f 100644
--- a/net/tipc/name_table.c
+++ b/net/tipc/name_table.c
@@ -1110,7 +1110,7 @@ static int __tipc_nl_seq_list(struct tipc_nl_msg *msg, u32 *last_type,
 			      u32 *last_lower, u32 *last_publ)
 {
 	struct hlist_head *seq_head;
-	struct name_seq *seq = NULL;
+	struct name_seq *seq;
 	int err;
 	int i;
 
@@ -1127,8 +1127,8 @@ static int __tipc_nl_seq_list(struct tipc_nl_msg *msg, u32 *last_type,
 			if (!seq)
 				return -EPIPE;
 		} else {
-			hlist_for_each_entry_rcu(seq, seq_head, ns_list)
-				break;
+			seq = hlist_entry_safe(rcu_dereference_raw(
+			hlist_first_rcu(seq_head)), struct name_seq, ns_list);
 			if (!seq)
 				continue;
 		}
-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH 0/20] fix misspelling of current function in string
From: Julian Calaby @ 2014-12-10  2:56 UTC (permalink / raw)
  To: Julia Lawall
  Cc: devel@driverdev.osuosl.org, linux-samsung-soc, linux-scsi,
	linux-pci, linux-wireless, intel-gfx, linux-usb, kernel-janitors,
	linux-kernel@vger.kernel.org, dri-devel, netdev, linux-mtd, linux,
	Joe Perches, Mailing List, Arm
In-Reply-To: <alpine.DEB.2.02.1412080738210.2100@localhost6.localdomain6>

Hi Julia,

On Mon, Dec 8, 2014 at 5:43 PM, Julia Lawall <julia.lawall@lip6.fr> wrote:
> On Mon, 8 Dec 2014, Julian Calaby wrote:
>
>> Hi Julia,
>>
>> On Mon, Dec 8, 2014 at 6:20 AM, Julia Lawall <Julia.Lawall@lip6.fr> wrote:
>> > These patches replace what appears to be a reference to the name of the
>> > current function but is misspelled in some way by either the name of the
>> > function itself, or by %s and then __func__ in an argument list.
>>
>> Would there be any value in doing this for _all_ cases where the
>> function name is written in a format string?
>
> Probably.  But there are a lot of them.  Even for the misspellings, I have
> only done about 1/3 of the cases.
>
> On the other hand, the misspellings have to be checked carefully, because a
> misspelling of one thing could be the correct spelling of the thing that
> was actually intended.
>
> Joe, however, points out that a lot of these prints are just for function
> tracing, and could be removed.  I worked on another semantic patch that
> tries to do that.  It might be better to remove those prints completely,
> rather than sending one patch to transform them and then one patch to
> remove them after that.  That is why for this series I did only the ones
> where there was actually a problem.

Ok, that makes sense.

Either way though, this is a really interesting application of the
semantic patching. Nice work!
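
For readers skimming the thread, the kind of rewrite being discussed looks roughly like this. The function demo_probe_device() and its message are invented for illustration; only the use of "%s" with __func__ comes from the series description.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format a failure message that names the current function. */
static int demo_probe_device(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	/*
	 * Before: snprintf(buf, len, "demo_prob_device: failed");
	 * The hard-coded (and misspelled) name is replaced by __func__,
	 * which the compiler keeps in sync with the real function name.
	 */
	return snprintf(buf, len, "%s: failed", __func__);
}
```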

Thanks,

-- 
Julian Calaby

Email: julian.calaby@gmail.com
Profile: http://www.google.com/profiles/julian.calaby/

^ permalink raw reply

* Re: [net-next] dummy: use MODULE_VERSION
From: David Miller @ 2014-12-10  2:52 UTC (permalink / raw)
  To: fbl; +Cc: netdev
In-Reply-To: <1418172108-10522-1-git-send-email-fbl@redhat.com>

From: Flavio Leitner <fbl@redhat.com>
Date: Tue,  9 Dec 2014 22:41:48 -0200

> Use MODULE_VERSION() now that dummy driver has a version.
> 
> Signed-off-by: Flavio Leitner <fbl@redhat.com>

Applied, thanks a lot Flavio.

^ permalink raw reply

* Re: [patch net-next v2] net: sched: cls: use nla_nest_cancel instead of nlmsg_trim
From: David Miller @ 2014-12-10  2:51 UTC (permalink / raw)
  To: jiri; +Cc: netdev, jhs
In-Reply-To: <1418160209-20001-1-git-send-email-jiri@resnulli.us>

From: Jiri Pirko <jiri@resnulli.us>
Date: Tue,  9 Dec 2014 22:23:29 +0100

> To cancel nesting, this function is more convenient.
> 
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
> v1->v2:
> - removed remaining use of b in tcindex pointed out by DaveM

This one compiles! :-)

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] fix suspicious rcu_dereference_check in net/sched/sch_fq_codel.c
From: David Miller @ 2014-12-10  2:49 UTC (permalink / raw)
  To: Valdis.Kletnieks; +Cc: edumazet, john.r.fastabend, linux-kernel, netdev
In-Reply-To: <58905.1418159750@turing-police.cc.vt.edu>

From: Valdis Kletnieks <Valdis.Kletnieks@vt.edu>
Date: Tue, 09 Dec 2014 16:15:50 -0500

> commit 46e5da40ae (net: qdisc: use rcu prefix and silence
>  sparse warnings) triggers a spurious warning:
> 
> net/sched/sch_fq_codel.c:97 suspicious rcu_dereference_check() usage!
> 
> The code should be using the _bh variant of rcu_dereference.
> 
> Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net-next] amd-xgbe: Use disable_irq_nosync when in IRQ context
From: David Miller @ 2014-12-10  2:48 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev
In-Reply-To: <20141209205408.26097.70441.stgit@tlendack-t1.amdoffice.net>

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Tue, 9 Dec 2014 14:54:08 -0600

> The disable_irq_nosync function, not the disable_irq function, must be
> used to disable the DMA channel interrupt from within the interrupt
> service routine. Change the disable_irq call to disable_irq_nosync.
> 
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

Applied, thanks Tom.

^ permalink raw reply

* Re: [RFC PATCH net-next 00/11] net: remove disable_irq() from ->ndo_poll_controller
From: David Miller @ 2014-12-10  2:44 UTC (permalink / raw)
  To: sd; +Cc: netdev, peterz, tglx
In-Reply-To: <1418135842-21389-1-git-send-email-sd@queasysnail.net>

Adding a new spinlock to every interrupt service routine is
simply a non-starter.

You will certainly have to find a way to fix this in a way
that doesn't involve adding any new overhead to the normal
operational paths of these drivers.

Thanks.

^ permalink raw reply
