Netdev List

Netdev List
 help / color / mirror / Atom feed

* [net-next] e1000e: Fix merge conflict (net->net-next)
From: Jeff Kirsher @ 2012-05-09  9:23 UTC (permalink / raw)
  To: davem; +Cc: Jeff Kirsher, netdev, gospo, sassmann

During merge of net to net-next the changes in patch:

e1000e: Fix default interrupt throttle rate not set in NIC HW

got munged in param.c of the e1000e driver.  This rectifies the
merge issues.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
---

This patch is also available on my net-next tree on kernel.org if you
want to pull this patch.

---
 drivers/net/ethernet/intel/e1000e/param.c |   56 ++++++-----------------------
 1 files changed, 11 insertions(+), 45 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/param.c b/drivers/net/ethernet/intel/e1000e/param.c
index 42444d1..55cc156 100644
--- a/drivers/net/ethernet/intel/e1000e/param.c
+++ b/drivers/net/ethernet/intel/e1000e/param.c
@@ -344,50 +344,16 @@ void __devinit e1000e_check_options(struct e1000_adapter *adapter)
 
 		if (num_InterruptThrottleRate > bd) {
 			adapter->itr = InterruptThrottleRate[bd];
-			switch (adapter->itr) {
-			case 0:
-				e_info("%s turned off\n", opt.name);
-				break;
-			case 1:
-				e_info("%s set to dynamic mode\n", opt.name);
-				adapter->itr_setting = adapter->itr;
-				adapter->itr = 20000;
-				break;
-			case 3:
-				e_info("%s set to dynamic conservative mode\n",
-					opt.name);
-				adapter->itr_setting = adapter->itr;
-				adapter->itr = 20000;
-				break;
-			case 4:
-				e_info("%s set to simplified (2000-8000 ints) mode\n",
-				       opt.name);
-				adapter->itr_setting = 4;
-				break;
-			default:
-				/*
-				 * Save the setting, because the dynamic bits
-				 * change itr.
-				 */
-				if (e1000_validate_option(&adapter->itr, &opt,
-							  adapter) &&
-				    (adapter->itr == 3)) {
-					/*
-					 * In case of invalid user value,
-					 * default to conservative mode.
-					 */
-					adapter->itr_setting = adapter->itr;
-					adapter->itr = 20000;
-				} else {
-					/*
-					 * Clear the lower two bits because
-					 * they are used as control.
-					 */
-					adapter->itr_setting =
-						adapter->itr & ~3;
-				}
-				break;
-			}
+
+			/*
+			 * Make sure a message is printed for non-special
+			 * values. And in case of an invalid option, display
+			 * warning, use default and go through itr/itr_setting
+			 * adjustment logic below
+			 */
+			if ((adapter->itr > 4) &&
+			    e1000_validate_option(&adapter->itr, &opt, adapter))
+				adapter->itr = opt.def;
 		} else {
 			/*
 			 * If no option specified, use default value and go
@@ -399,7 +365,7 @@ void __devinit e1000e_check_options(struct e1000_adapter *adapter)
 			 * Make sure a message is printed for non-special
 			 * default values
 			 */
-			if (adapter->itr > 40)
+			if (adapter->itr > 4)
 				e_info("%s set to default %d\n", opt.name,
 				       adapter->itr);
 		}
-- 
1.7.7.6

^ permalink raw reply related

* Re: IP_MULTICAST_IF getsockopt man part
From: Jiri Pirko @ 2012-05-09  9:34 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: linux-man-u79uwXL29TY76Z2rM5mHXA, netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CAKgNAkgBz_dRV97JVV1Np0YuEd0EVqoLPZkajiE_KJ2Lk20YRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

Sat, May 05, 2012 at 01:57:35PM CEST, mtk.manpages-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org wrote:
>On Fri, May 4, 2012 at 9:00 PM, Jiri Pirko <jpirko-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote:
>> Hi.
>>
>> <quote>
>> IP_MULTICAST_IF (since Linux 1.2)
>>                      Set the local device for a multicast socket.
>>                      Argument is an ip_mreqn or ip_mreq structure
>>                      similar to IP_ADD_MEMBERSHIP.
>> </quote>
>>
>> That is not true. Setsockopt recognizes only ip_mreqn and in_addr. I
>> made patch which makes it recognize ip_mreq as well. So that would be
>> probably ok.
>> http://patchwork.ozlabs.org/patch/156815/
>>
>> On the other hand, getsockopt works only with in_addr. That I think is
>> good behaviour but manpages here needs to be corrected in this way (read
>> part needs to be added here)
>
>Jirka,
>
>I'm having trouble to understand what you mean. Perhaps it would be
>simplest if you showed your proposed replacement text for the text
>quoted above.

<orig>
IP_MULTICAST_IF (since Linux 1.2)
Set the local device for a multicast socket. Argument is an ip_mreqn
or ip_mreq structure similar to IP_ADD_MEMBERSHIP.
</orig>
<new>
IP_MULTICAST_IF (since Linux 1.2)
Set or read the local device for a multicast socket.
Argument for set is an ip_mreqn or ip_mreq or addr_in structure.
Argument for read is an addr_in structure.
</new>


Jirka



>
>Thanks,
>
>Michael
>
>
>
>-- 
>Michael Kerrisk
>Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
>Author of "The Linux Programming Interface"; http://man7.org/tlpi/
--
To unsubscribe from this list: send the line "unsubscribe linux-man" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 7/9] net: add skb_orphan_frags to copy aside frags with destructors
From: Ian Campbell @ 2012-05-09  9:36 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev@vger.kernel.org, David Miller, Eric Dumazet
In-Reply-To: <20120507102441.GA18591@redhat.com>


> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 945b807..f009abb 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -697,31 +697,25 @@ struct sk_buff *skb_morph(struct sk_buff *dst, struct sk_buff *src)
> >  }
> >  EXPORT_SYMBOL_GPL(skb_morph);
> >  
> > -/*	skb_copy_ubufs	-	copy userspace skb frags buffers to kernel
> > - *	@skb: the skb to modify
> > - *	@gfp_mask: allocation priority
> > - *
> > - *	This must be called on SKBTX_DEV_ZEROCOPY skb.
> > - *	It will copy all frags into kernel and drop the reference
> > - *	to userspace pages.
> > - *
> > - *	If this function is called from an interrupt gfp_mask() must be
> > - *	%GFP_ATOMIC.
> > - *
> > - *	Returns 0 on success or a negative error code on failure
> > - *	to allocate kernel memory to copy to.
> > +/*
> > + * If uarg != NULL copy and replace all frags.
> > + * If uarg == NULL then only copy and replace those which have a destructor
> > + * pointer.
> >   */
> > -int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
> > +static int skb_copy_frags(struct sk_buff *skb, gfp_t gfp_mask,
> > +			  struct ubuf_info *uarg)
> >  {
> >  	int i;
> >  	int num_frags = skb_shinfo(skb)->nr_frags;
> >  	struct page *page, *head = NULL;
> > -	struct ubuf_info *uarg = skb_shinfo(skb)->destructor_arg;
> >  
> >  	for (i = 0; i < num_frags; i++) {
> >  		u8 *vaddr;
> >  		skb_frag_t *f = &skb_shinfo(skb)->frags[i];
> >  
> > +		if (!uarg && !f->page.destructor)
> > +			continue;
> > +
> >  		page = alloc_page(GFP_ATOMIC);
> >  		if (!page) {
> >  			while (head) {
> > @@ -739,11 +733,16 @@ int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
> >  		head = page;
> >  	}
> >  
> > -	/* skb frags release userspace buffers */
> > -	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
> > +	/* skb frags release buffers */
> > +	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> > +		skb_frag_t *f = &skb_shinfo(skb)->frags[i];
> > +		if (!uarg && !f->page.destructor)
> > +			continue;
> >  		skb_frag_unref(skb, i);
> > +	}
> >  
> > -	uarg->callback(uarg);
> > +	if (uarg)
> > +		uarg->callback(uarg);
> >  
> 
> So above we only linked up copied pages, but below we
> try to use the list for all frags. Looks like a bug,
> I think it needs to check destructor and uarg too.

You are absolutely correct. We need the same check in all three loops.

> 
> 
> >  	/* skb frags point to kernel buffers */
> >  	for (i = skb_shinfo(skb)->nr_frags; i > 0; i--) {
> > @@ -752,10 +751,41 @@ int skb_copy_ubufs(struct sk_buff *skb, gfp_t gfp_mask)
> >  		head = (struct page *)head->private;
> >  	}
> >  
> > -	skb_shinfo(skb)->tx_flags &= ~SKBTX_DEV_ZEROCOPY;
> >  	return 0;
> >  }

^ permalink raw reply

* Re: [v12 PATCH 2/3] NETFILTER module xt_hmark, new target for HASH based fwmark
From: Pablo Neira Ayuso @ 2012-05-09 10:38 UTC (permalink / raw)
  To: Hans Schillstrom
  Cc: kaber@trash.net, jengelh@medozas.de,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	hans@schillstrom.com
In-Reply-To: <201205080937.36853.hans.schillstrom@ericsson.com>

On Tue, May 08, 2012 at 09:37:35AM +0200, Hans Schillstrom wrote:
> From d5065af3988cc7561a02f30bae8342e1a89126a4 Mon Sep 17 00:00:00 2001
> From: Hans Schillstrom <hans.schillstrom@ericsson.com>
> Date: Wed, 2 May 2012 07:49:47 +0000
> Subject: netfilter: add xt_hmark target for hash-based skb
>  marking
> 
> The target allows you to create rules in the "raw" and "mangle" tables
> which set the skbuff mark by means of hash calculation within a given
> range. The nfmark can influence the routing method (see "Use netfilter
> MARK value as routing key") and can also be used by other subsystems to
> change their behaviour.
> 
> Some examples:
> 
> * Default rule handles all TCP, UDP, SCTP, ESP & AH
> 
>  iptables -t mangle -A PREROUTING -m state --state NEW,ESTABLISHED,RELATED \
> 	-j HMARK --hmark-offset 10000 --hmark-mod 10
> 
> * Handle SCTP and hash dest port only and produce a nfmark between 100-119.
> 
>  iptables -t mangle -A PREROUTING -p SCTP -j HMARK --src-mask 0 --dst-mask 0 \
> 	--sp-mask 0 --offset 100 --mod 20
> 
> * Fragment safe Layer 3 only, that keep a class C network flow together
> 
>  iptables -t mangle -A PREROUTING -j HMARK --method L3 \
> 	--src-mask 24 --mod 20 --offset 100

I have removed these examples. Just in case we make changes to the
user-space part. We'll have the time for this (the entire 3.5 cycle).

Some minor glitches I made on this patch:

>  include/linux/netfilter/xt_HMARK.h |   48 +++++
>  net/netfilter/Kconfig              |   15 ++
>  net/netfilter/Makefile             |    1 +
>  net/netfilter/xt_HMARK.c           |  358 ++++++++++++++++++++++++++++++++++++
>  4 files changed, 422 insertions(+)
>  create mode 100644 include/linux/netfilter/xt_HMARK.h
>  create mode 100644 net/netfilter/xt_HMARK.c
> 
> diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
> new file mode 100644
> index 0000000..05e43ba
> --- /dev/null
> +++ b/include/linux/netfilter/xt_HMARK.h
> @@ -0,0 +1,46 @@
> +#ifndef XT_HMARK_H_
> +#define XT_HMARK_H_
> +
> +#include <linux/types.h>
> +
> +enum {
> +	XT_HMARK_NONE,

this means (1 << 0) is unused. I have removed this _NONE.

> +	XT_HMARK_SADR_MASK,
> +	XT_HMARK_DADR_MASK,
> +	XT_HMARK_SPI_MASK,
> +	XT_HMARK_SPI,
> +	XT_HMARK_SPORT_MASK,
> +	XT_HMARK_DPORT_MASK,
> +	XT_HMARK_SPORT,
> +	XT_HMARK_DPORT,
> +	XT_HMARK_PROTO_MASK,
> +	XT_HMARK_RND,
> +	XT_HMARK_MODULUS,
> +	XT_HMARK_OFFSET,
> +	XT_HMARK_CT,
> +	XT_HMARK_METHOD_L3,
> +	XT_HMARK_METHOD_L3_4,

I have also rearrange the order of the flags:

enum {
        XT_HMARK_SADDR_MASK,
        XT_HMARK_DADDR_MASK,
        XT_HMARK_SPI,       
        XT_HMARK_SPI_MASK,  
        XT_HMARK_SPORT,     
        XT_HMARK_DPORT,     
        XT_HMARK_SPORT_MASK,
        XT_HMARK_DPORT_MASK,
        XT_HMARK_PROTO_MASK,
        XT_HMARK_RND,       
        XT_HMARK_MODULUS,   
        XT_HMARK_OFFSET,    
        XT_HMARK_CT,        
        XT_HMARK_METHOD_L3, 
        XT_HMARK_METHOD_L3_4,
};

I don't want people to ask me why we where using some strange order in
the flag definition in the future (yes, you'll have to recompile your
iptables HMARK support in your setups, sorry)

> +};
> +#define XT_HMARK_FLAG(flag)	(1 << flag)
> +
> +union hmark_ports {
> +	struct {
> +		__u16	src;
> +		__u16	dst;
> +	} p16;
> +	__u32	v32;
> +};
> +
> +struct xt_hmark_info {
> +	union nf_inet_addr	src_mask;	/* Source address mask */
> +	union nf_inet_addr	dst_mask;	/* Dest address mask */
> +	union hmark_ports	port_mask;
> +	union hmark_ports	port_set;
> +	__u32			flags;		/* Print out only */
> +	__u16			proto_mask;	/* L4 Proto mask */
> +	__u32			hashrnd;
> +	__u32			hmodulus;	/* Modulus */
> +	__u32			hoffset;	/* Offset */

I've removed these comments, they provide no extra information. Still
I left the one that described hoffset, that may seem not obvious.

> +#endif /* XT_HMARK_H_ */

I have applied this, I'm going to pass it to davem.

^ permalink raw reply

* [PATCH v5] tilegx network driver: initial support
From: Chris Metcalf @ 2012-05-09 10:42 UTC (permalink / raw)
  To: David Miller, arnd, linux-kernel, netdev
In-Reply-To: <20120504.024216.1730395431103337434.davem@davemloft.net>

This change adds support for the tilegx network driver based on the
GXIO IORPC support in the tilegx software stack, using the on-chip
mPIPE packet processing engine.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
---
This version removes some "generic" comments on network driver
infrastructure, removes runs of multiple blank lines, and eliminates
the hand-rolled ROUND_UP() macro in favor of ALIGN().  There are also
a number of additional cleanups.

 drivers/net/ethernet/tile/Kconfig  |    1 +
 drivers/net/ethernet/tile/Makefile |    4 +-
 drivers/net/ethernet/tile/tilegx.c | 1928 ++++++++++++++++++++++++++++++++++++
 3 files changed, 1931 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/ethernet/tile/tilegx.c

diff --git a/drivers/net/ethernet/tile/Kconfig b/drivers/net/ethernet/tile/Kconfig
index 2d9218f..9184b61 100644
--- a/drivers/net/ethernet/tile/Kconfig
+++ b/drivers/net/ethernet/tile/Kconfig
@@ -7,6 +7,7 @@ config TILE_NET
 	depends on TILE
 	default y
 	select CRC32
+	select TILE_GXIO_MPIPE if TILEGX
 	---help---
 	  This is a standard Linux network device driver for the
 	  on-chip Tilera Gigabit Ethernet and XAUI interfaces.
diff --git a/drivers/net/ethernet/tile/Makefile b/drivers/net/ethernet/tile/Makefile
index f634f14..0ef9eef 100644
--- a/drivers/net/ethernet/tile/Makefile
+++ b/drivers/net/ethernet/tile/Makefile
@@ -4,7 +4,7 @@
 
 obj-$(CONFIG_TILE_NET) += tile_net.o
 ifdef CONFIG_TILEGX
-tile_net-objs := tilegx.o mpipe.o iorpc_mpipe.o dma_queue.o
+tile_net-y := tilegx.o
 else
-tile_net-objs := tilepro.o
+tile_net-y := tilepro.o
 endif
diff --git a/drivers/net/ethernet/tile/tilegx.c b/drivers/net/ethernet/tile/tilegx.c
new file mode 100644
index 0000000..1ba52a7
--- /dev/null
+++ b/drivers/net/ethernet/tile/tilegx.c
@@ -0,0 +1,1928 @@
+/*
+ * Copyright 2012 Tilera Corporation. All Rights Reserved.
+ *
+ *   This program is free software; you can redistribute it and/or
+ *   modify it under the terms of the GNU General Public License
+ *   as published by the Free Software Foundation, version 2.
+ *
+ *   This program is distributed in the hope that it will be useful, but
+ *   WITHOUT ANY WARRANTY; without even the implied warranty of
+ *   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ *   NON INFRINGEMENT.  See the GNU General Public License for
+ *   more details.
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/moduleparam.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>      /* printk() */
+#include <linux/slab.h>        /* kmalloc() */
+#include <linux/errno.h>       /* error codes */
+#include <linux/types.h>       /* size_t */
+#include <linux/interrupt.h>
+#include <linux/in.h>
+#include <linux/irq.h>
+#include <linux/netdevice.h>   /* struct device, and other headers */
+#include <linux/etherdevice.h> /* eth_type_trans */
+#include <linux/skbuff.h>
+#include <linux/ioctl.h>
+#include <linux/cdev.h>
+#include <linux/hugetlb.h>
+#include <linux/in6.h>
+#include <linux/timer.h>
+#include <linux/io.h>
+#include <linux/ctype.h>
+#include <linux/ip.h>
+#include <linux/tcp.h>
+
+#include <asm/checksum.h>
+#include <asm/homecache.h>
+#include <gxio/mpipe.h>
+#include <arch/sim.h>
+
+/* Define to support GSO. */
+#undef TILE_NET_GSO
+
+/* Define to support TSO. */
+#define TILE_NET_TSO
+
+/* Use 3000 to enable the Linux Traffic Control (QoS) layer, else 0. */
+#define TILE_NET_TX_QUEUE_LEN 0
+
+/* Define to dump packets (prints out the whole packet on tx and rx). */
+#undef TILE_NET_DUMP_PACKETS
+
+/* Default transmit lockup timeout period, in jiffies. */
+#define TILE_NET_TIMEOUT (5 * HZ)
+
+/* The maximum number of distinct channels (idesc.channel is 5 bits). */
+#define TILE_NET_CHANNELS 32
+
+/* Maximum number of idescs to handle per "poll". */
+#define TILE_NET_BATCH 128
+
+/* Maximum number of packets to handle per "poll". */
+#define TILE_NET_WEIGHT 64
+
+/* Number of entries in each iqueue. */
+#define IQUEUE_ENTRIES 512
+
+/* Number of entries in each equeue. */
+#define EQUEUE_ENTRIES 2048
+
+/* Total header bytes per equeue slot.  Must be big enough for 2 bytes
+ * of NET_IP_ALIGN alignment, plus 14 bytes (?) of L2 header, plus up to
+ * 60 bytes of actual TCP header.  We round up to align to cache lines.
+ */
+#define HEADER_BYTES 128
+
+/* Maximum completions per cpu per device (must be a power of two).
+ * ISSUE: What is the right number here?
+ */
+#define TILE_NET_MAX_COMPS 64
+
+#define MAX_FRAGS (65536 / PAGE_SIZE + 2 + 1)
+
+MODULE_AUTHOR("Tilera Corporation");
+MODULE_LICENSE("GPL");
+
+/* A "packet fragment" (a chunk of memory). */
+struct frag {
+	void *buf;
+	size_t length;
+};
+
+/* A single completion. */
+struct tile_net_comp {
+	/* The "complete_count" when the completion will be complete. */
+	s64 when;
+	/* The buffer to be freed when the completion is complete. */
+	struct sk_buff *skb;
+};
+
+/* The completions for a given cpu and device. */
+struct tile_net_comps {
+	/* The completions. */
+	struct tile_net_comp comp_queue[TILE_NET_MAX_COMPS];
+	/* The number of completions used. */
+	unsigned long comp_next;
+	/* The number of completions freed. */
+	unsigned long comp_last;
+};
+
+/* Info for a specific cpu. */
+struct tile_net_info {
+	/* The NAPI struct. */
+	struct napi_struct napi;
+	/* Packet queue. */
+	gxio_mpipe_iqueue_t iqueue;
+	/* Our cpu. */
+	int my_cpu;
+	/* True if iqueue is valid. */
+	bool has_iqueue;
+	/* NAPI flags. */
+	bool napi_added;
+	bool napi_enabled;
+	/* Number of small sk_buffs which must still be provided. */
+	unsigned int num_needed_small_buffers;
+	/* Number of large sk_buffs which must still be provided. */
+	unsigned int num_needed_large_buffers;
+	/* A timer for handling egress completions. */
+	struct timer_list egress_timer;
+	/* True if "egress_timer" is scheduled. */
+	bool egress_timer_scheduled;
+	/* Comps for each egress channel. */
+	struct tile_net_comps *comps_for_echannel[TILE_NET_CHANNELS];
+};
+
+/* Info for egress on a particular egress channel. */
+struct tile_net_egress {
+	/* The "equeue". */
+	gxio_mpipe_equeue_t *equeue;
+	/* The headers for TSO. */
+	unsigned char *headers;
+};
+
+/* Info for a specific device. */
+struct tile_net_priv {
+	/* Our network device. */
+	struct net_device *dev;
+	/* The primary link. */
+	gxio_mpipe_link_t link;
+	/* The primary channel, if open, else -1. */
+	int channel;
+	/* The "loopify" egress link, if needed. */
+	gxio_mpipe_link_t loopify_link;
+	/* The "loopify" egress channel, if open, else -1. */
+	int loopify_channel;
+	/* The egress channel (channel or loopify_channel). */
+	int echannel;
+	/* Total stats. */
+	struct net_device_stats stats;
+};
+
+/* Egress info, indexed by "priv->echannel" (lazily created as needed). */
+static struct tile_net_egress egress_for_echannel[TILE_NET_CHANNELS];
+
+/* Devices currently associated with each channel.
+ * NOTE: The array entry can become NULL after ifconfig down, but
+ * we do not free the underlying net_device structures, so it is
+ * safe to use a pointer after reading it from this array.
+ */
+static struct net_device *tile_net_devs_for_channel[TILE_NET_CHANNELS];
+
+/* A mutex for "tile_net_devs_for_channel". */
+static DEFINE_MUTEX(tile_net_devs_for_channel_mutex);
+
+/* The per-cpu info. */
+static DEFINE_PER_CPU(struct tile_net_info, per_cpu_info);
+
+/* The "context" for all devices. */
+static gxio_mpipe_context_t context;
+
+/* The small/large "buffer stacks". */
+static int small_buffer_stack = -1;
+static int large_buffer_stack = -1;
+
+/* The buckets. */
+static int first_bucket = -1;
+static int num_buckets = 1;
+
+/* The ingress irq. */
+static int ingress_irq = -1;
+
+/* Text value of tile_net.cpus if passed as a module parameter. */
+static char *network_cpus_string;
+
+/* The actual cpus in "network_cpus". */
+static struct cpumask network_cpus_map;
+
+/* If "loopify=LINK" was specified, this is "LINK". */
+static char *loopify_link_name;
+
+/* The "tile_net.cpus" argument specifies the cpus that are dedicated
+ * to handle ingress packets.
+ *
+ * The parameter should be in the form "tile_net.cpus=m-n[,x-y]", where
+ * m, n, x, y are integer numbers that represent the cpus that can be
+ * neither a dedicated cpu nor a dataplane cpu.
+ */
+static bool network_cpus_init(void)
+{
+	char buf[1024];
+	int rc;
+
+	if (network_cpus_string == NULL)
+		return false;
+
+	rc = cpulist_parse_crop(network_cpus_string, &network_cpus_map);
+	if (rc != 0) {
+		pr_warn("tile_net.cpus=%s: malformed cpu list\n",
+			network_cpus_string);
+		return false;
+	}
+
+	/* Remove dedicated cpus. */
+	cpumask_and(&network_cpus_map, &network_cpus_map, cpu_possible_mask);
+
+	if (cpumask_empty(&network_cpus_map)) {
+		pr_warn("Ignoring empty tile_net.cpus='%s'.\n",
+			network_cpus_string);
+		return false;
+	}
+
+	cpulist_scnprintf(buf, sizeof(buf), &network_cpus_map);
+	pr_info("Linux network CPUs: %s\n", buf);
+	return true;
+}
+
+module_param_named(cpus, network_cpus_string, charp, 0444);
+MODULE_PARM_DESC(cpus, "cpulist of cores that handle network interrupts");
+
+/* The "tile_net.loopify=LINK" argument causes the named device to
+ * actually use "loop0" for ingress, and "loop1" for egress.  This
+ * allows an app to sit between the actual link and linux, passing
+ * (some) packets along to linux, and forwarding (some) packets sent
+ * out by linux.
+ */
+module_param_named(loopify, loopify_link_name, charp, 0444);
+MODULE_PARM_DESC(loopify, "name the device to use loop0/1 for ingress/egress");
+
+#ifdef TILE_NET_DUMP_PACKETS
+/* Dump a packet. */
+static void dump_packet(unsigned char *data, unsigned long length, char *s)
+{
+	unsigned long i;
+	static unsigned int count;
+	char buf[128];
+
+	pr_info("Dumping %s packet of 0x%lx bytes at %p [%d]\n",
+		s, length, data, count++);
+
+	pr_info("\n");
+
+	for (i = 0; i < length; i++) {
+		if ((i & 0xf) == 0)
+			sprintf(buf, "%8.8lx:", i);
+		sprintf(buf + strlen(buf), " %02x", data[i]);
+		if ((i & 0xf) == 0xf || i == length - 1)
+			pr_info("%s\n", buf);
+	}
+
+	pr_info("\n");
+}
+#endif
+
+/* Allocate and push a buffer. */
+static bool tile_net_provide_buffer(bool small)
+{
+	int stack = small ? small_buffer_stack : large_buffer_stack;
+
+	/* Buffers must be aligned. */
+	const unsigned long align = 128;
+
+	/* Note that "dev_alloc_skb()" adds NET_SKB_PAD more bytes,
+	 * and also "reserves" that many bytes.
+	 */
+	int len = sizeof(struct sk_buff **) + align + (small ? 128 : 1664);
+
+	/* Allocate (or fail). */
+	struct sk_buff *skb = dev_alloc_skb(len);
+	if (skb == NULL)
+		return false;
+
+	/* Make room for a back-pointer to 'skb'. */
+	skb_reserve(skb, sizeof(struct sk_buff **));
+
+	/* Make sure we are aligned. */
+	skb_reserve(skb, -(long)skb->data & (align - 1));
+
+	/* Save a back-pointer to 'skb'. */
+	*(struct sk_buff **)(skb->data - sizeof(struct sk_buff **)) = skb;
+
+	/* Make sure "skb" and the back-pointer have been flushed. */
+	wmb();
+
+	gxio_mpipe_push_buffer(&context, stack,
+			       (void *)va_to_tile_io_addr(skb->data));
+
+	return true;
+}
+
+static void tile_net_pop_all_buffers(int stack)
+{
+	void *va;
+	while ((va = gxio_mpipe_pop_buffer(&context, stack)) != NULL) {
+		struct sk_buff **skb_ptr = va - sizeof(*skb_ptr);
+		struct sk_buff *skb = *skb_ptr;
+		dev_kfree_skb_irq(skb);
+	}
+}
+
+/* Provide linux buffers to mPIPE. */
+static void tile_net_provide_needed_buffers(struct tile_net_info *info)
+{
+	while (info->num_needed_small_buffers != 0) {
+		if (!tile_net_provide_buffer(true))
+			goto oops;
+		info->num_needed_small_buffers--;
+	}
+
+	while (info->num_needed_large_buffers != 0) {
+		if (!tile_net_provide_buffer(false))
+			goto oops;
+		info->num_needed_large_buffers--;
+	}
+
+	return;
+
+oops:
+	/* Add a description to the page allocation failure dump. */
+	pr_notice("Tile %d still needs some buffers\n", info->my_cpu);
+}
+
+/* Handle a packet.  Return true if "processed", false if "filtered". */
+static bool tile_net_handle_packet(struct tile_net_info *info,
+				   gxio_mpipe_idesc_t *idesc)
+{
+	struct net_device *dev = tile_net_devs_for_channel[idesc->channel];
+	uint8_t l2_offset = gxio_mpipe_idesc_get_l2_offset(idesc);
+	void *va;
+	void *buf;
+	unsigned long len;
+	int filter = 0;
+
+	/* Drop packets for which no buffer was available.
+	 * NOTE: This happens under heavy load.
+	 */
+	if (idesc->be) {
+		gxio_mpipe_iqueue_consume(&info->iqueue, idesc);
+		if (net_ratelimit())
+			pr_info("Dropping packet (insufficient buffers).\n");
+		return false;
+	}
+
+	/* Get the raw buffer VA. */
+	va = tile_io_addr_to_va((unsigned long)gxio_mpipe_idesc_get_va(idesc));
+
+	/* Get the actual packet start/length. */
+	buf = va + l2_offset;
+	len = gxio_mpipe_idesc_get_l2_length(idesc);
+
+	/* Point "va" at the raw buffer. */
+	va -= NET_IP_ALIGN;
+
+#ifdef TILE_NET_DUMP_PACKETS
+	dump_packet(buf, len, "rx");
+#endif
+
+	if (dev != NULL) {
+		/* ISSUE: Is this needed? */
+		dev->last_rx = jiffies;
+	}
+
+	if (dev == NULL || !(dev->flags & IFF_UP)) {
+		/* Filter packets received before we're up. */
+		filter = 1;
+	} else if (!(dev->flags & IFF_PROMISC)) {
+		/* ISSUE: "eth_type_trans()" implies that "IFF_PROMISC"
+		 * is set for "all silly devices", however, it appears
+		 * to NOT be set for us, so this code here DOES run.
+		 */
+		if (!is_multicast_ether_addr(buf)) {
+			/* Filter packets not for our address. */
+			const u8 *mine = dev->dev_addr;
+			filter = compare_ether_addr(mine, buf);
+		}
+	}
+
+	if (filter) {
+
+		/* ISSUE: Update "drop" statistics? */
+
+		gxio_mpipe_iqueue_drop(&info->iqueue, idesc);
+
+	} else {
+
+		struct tile_net_priv *priv = netdev_priv(dev);
+
+		/* Acquire the associated "skb". */
+		struct sk_buff **skb_ptr = va - sizeof(*skb_ptr);
+		struct sk_buff *skb = *skb_ptr;
+
+		/* Paranoia. */
+		if (skb->data != va) {
+			/* Panic here since there's a reasonable chance
+			 * that corrupt buffers means generic memory
+			 * corruption, with unpredictable system effects.
+			 */
+			panic("Corrupt linux buffer! "
+			      "buf=%p, skb=%p, skb->data=%p",
+			      buf, skb, skb->data);
+		}
+
+		/* Skip headroom, and any custom header. */
+		skb_reserve(skb, NET_IP_ALIGN + l2_offset);
+
+		/* Encode the actual packet length. */
+		skb_put(skb, len);
+
+		/* NOTE: This call also sets "skb->dev = dev".
+		 * ISSUE: The classifier provides us with "eth_type"
+		 * (aka "eth->h_proto"), which is basically the value
+		 * returned by "eth_type_trans()".
+		 * Note that "eth_type_trans()" computes "skb->pkt_type",
+		 * which would be useful for the "filter" check above,
+		 * if we had a (modifiable) "skb" to work with.
+		 */
+		skb->protocol = eth_type_trans(skb, dev);
+
+		/* Acknowledge "good" hardware checksums. */
+		if (idesc->cs && idesc->csum_seed_val == 0xFFFF)
+			skb->ip_summed = CHECKSUM_UNNECESSARY;
+
+		netif_receive_skb(skb);
+
+		/* Update stats. */
+		atomic_add(1, (atomic_t *)&priv->stats.rx_packets);
+		atomic_add(len, (atomic_t *)&priv->stats.rx_bytes);
+
+		/* Need a new buffer. */
+		if (idesc->size == GXIO_MPIPE_BUFFER_SIZE_128)
+			info->num_needed_small_buffers++;
+		else
+			info->num_needed_large_buffers++;
+	}
+
+	gxio_mpipe_iqueue_consume(&info->iqueue, idesc);
+
+	return !filter;
+}
+
+/* Handle some packets for the current CPU.
+ *
+ * This function handles up to TILE_NET_BATCH idescs per call.
+ *
+ * ISSUE: Since we do not provide new buffers until this function is
+ * complete, we must initially provide enough buffers for each network
+ * cpu to fill its iqueue and also its batched idescs.
+ *
+ * ISSUE: The "rotting packet" race condition occurs if a packet
+ * arrives after the queue appears to be empty, and before the
+ * hypervisor interrupt is re-enabled.
+ */
+static int tile_net_poll(struct napi_struct *napi, int budget)
+{
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	unsigned int work = 0;
+	gxio_mpipe_idesc_t *idesc;
+	int i, n;
+
+	/* Process packets. */
+	while ((n = gxio_mpipe_iqueue_try_peek(&info->iqueue, &idesc)) > 0) {
+		for (i = 0; i < n; i++) {
+			if (i == TILE_NET_BATCH)
+				goto done;
+			if (tile_net_handle_packet(info, idesc + i)) {
+				if (++work >= budget)
+					goto done;
+			}
+		}
+	}
+
+	/* There are no packets left. */
+	napi_complete(&info->napi);
+
+	/* Re-enable hypervisor interrupts. */
+	gxio_mpipe_enable_notif_ring_interrupt(&context, info->iqueue.ring);
+
+	/* HACK: Avoid the "rotting packet" problem. */
+	if (gxio_mpipe_iqueue_try_peek(&info->iqueue, &idesc) > 0)
+		napi_schedule(&info->napi);
+
+	/* ISSUE: Handle completions? */
+
+done:
+	tile_net_provide_needed_buffers(info);
+
+	return work;
+}
+
+/* Handle an ingress interrupt on the current cpu. */
+static irqreturn_t tile_net_handle_ingress_irq(int irq, void *unused)
+{
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	napi_schedule(&info->napi);
+	return IRQ_HANDLED;
+}
+
+/* Free some completions.  This must be called with interrupts blocked. */
+static void tile_net_free_comps(gxio_mpipe_equeue_t *equeue,
+				struct tile_net_comps *comps,
+				int limit, bool force_update)
+{
+	int n = 0;
+	while (comps->comp_last < comps->comp_next) {
+		unsigned int cid = comps->comp_last % TILE_NET_MAX_COMPS;
+		struct tile_net_comp *comp = &comps->comp_queue[cid];
+		if (!gxio_mpipe_equeue_is_complete(equeue, comp->when,
+						   force_update || n == 0))
+			return;
+		dev_kfree_skb_irq(comp->skb);
+		comps->comp_last++;
+		if (++n == limit)
+			return;
+	}
+}
+
+/* Make sure the egress timer is scheduled.
+ *
+ * Note that we use "schedule if not scheduled" logic instead of the more
+ * obvious "reschedule" logic, because "reschedule" is fairly expensive.
+ */
+static void tile_net_schedule_egress_timer(struct tile_net_info *info)
+{
+	if (!info->egress_timer_scheduled) {
+		mod_timer_pinned(&info->egress_timer, jiffies + 1);
+		info->egress_timer_scheduled = true;
+	}
+}
+
+/* The "function" for "info->egress_timer".
+ *
+ * This timer will reschedule itself as long as there are any pending
+ * completions expected for this tile.
+ */
+static void tile_net_handle_egress_timer(unsigned long arg)
+{
+	struct tile_net_info *info = (struct tile_net_info *)arg;
+	unsigned long irqflags;
+	bool pending = false;
+	int i;
+
+	local_irq_save(irqflags);
+
+	/* The timer is no longer scheduled. */
+	info->egress_timer_scheduled = false;
+
+	/* Free all possible comps for this tile. */
+	for (i = 0; i < TILE_NET_CHANNELS; i++) {
+		struct tile_net_egress *egress = &egress_for_echannel[i];
+		struct tile_net_comps *comps = info->comps_for_echannel[i];
+		if (comps->comp_last >= comps->comp_next)
+			continue;
+		tile_net_free_comps(egress->equeue, comps, -1, true);
+		pending = pending || (comps->comp_last < comps->comp_next);
+	}
+
+	/* Reschedule timer if needed. */
+	if (pending)
+		tile_net_schedule_egress_timer(info);
+
+	local_irq_restore(irqflags);
+}
+
+/* Prepare each CPU. */
+static void tile_net_prepare_cpu(void *unused)
+{
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+	int my_cpu = smp_processor_id();
+
+	info->has_iqueue = false;
+
+	info->my_cpu = my_cpu;
+
+	/* Initialize the egress timer. */
+	init_timer(&info->egress_timer);
+	info->egress_timer.data = (long)info;
+	info->egress_timer.function = tile_net_handle_egress_timer;
+}
+
+/* Helper function for "tile_net_update()". */
+static void tile_net_update_cpu(void *arg)
+{
+	struct net_device *dev = arg;
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+
+	if (info->has_iqueue) {
+		if (dev != NULL) {
+			if (!info->napi_added) {
+				netif_napi_add(dev, &info->napi,
+					       tile_net_poll, TILE_NET_WEIGHT);
+				info->napi_added = true;
+			}
+			if (!info->napi_enabled) {
+				napi_enable(&info->napi);
+				info->napi_enabled = true;
+			}
+			enable_percpu_irq(ingress_irq, 0);
+		} else {
+			disable_percpu_irq(ingress_irq);
+			if (info->napi_enabled) {
+				napi_disable(&info->napi);
+				info->napi_enabled = false;
+			}
+			/* FIXME: Drain the iqueue. */
+		}
+	}
+}
+
+/* Helper function for tile_net_open() and tile_net_stop().
+ * Always called under tile_net_devs_for_channel_mutex.
+ */
+static int tile_net_update(struct net_device *dev)
+{
+	int channel;
+	long count = 0;
+	int cpu;
+	int rc;
+	bool saw_channel;
+	static gxio_mpipe_rules_t rules;
+
+	gxio_mpipe_rules_init(&rules, &context);
+
+	saw_channel = false;
+	for (channel = 0; channel < TILE_NET_CHANNELS; channel++) {
+		if (tile_net_devs_for_channel[channel] == NULL)
+			continue;
+		if (!saw_channel) {
+			saw_channel = true;
+			gxio_mpipe_rules_begin(&rules, first_bucket,
+					       num_buckets, NULL);
+			gxio_mpipe_rules_set_headroom(&rules, NET_IP_ALIGN);
+		}
+		gxio_mpipe_rules_add_channel(&rules, channel);
+	}
+
+	/* NOTE: This can fail if there is no classifier.
+	 * ISSUE: Can anything else cause it to fail?
+	 */
+	rc = gxio_mpipe_rules_commit(&rules);
+	if (rc != 0) {
+		netdev_warn(dev, "gxio_mpipe_rules_commit failed: %d\n", rc);
+		return -EIO;
+	}
+
+	/* Update all cpus, sequentially (to protect "netif_napi_add()"). */
+	for_each_online_cpu(cpu)
+		smp_call_function_single(cpu, tile_net_update_cpu,
+					 (saw_channel ? dev : NULL), 1);
+
+	/* HACK: Allow packets to flow in the simulator. */
+	if (count != 0)
+		sim_enable_mpipe_links(0, -1);
+
+	return 0;
+}
+
+/* The first time any tilegx network device is opened, we initialize
+ * the global mpipe state.  If this step fails, we fail to open the
+ * device, but if it succeeds, we never need to do it again, and since
+ * tile_net can't be unloaded, we never undo it.
+ *
+ * Note that some resources in this path (buffer stack indices,
+ * bindings from init_buffer_stack, etc.) are hypervisor resources
+ * that are freed simply via gxio_mpipe_destroy().
+ */
+static int tile_net_init_mpipe(struct net_device *dev)
+{
+	gxio_mpipe_buffer_size_enum_t small_buf_size =
+		GXIO_MPIPE_BUFFER_SIZE_128;
+	gxio_mpipe_buffer_size_enum_t large_buf_size =
+		GXIO_MPIPE_BUFFER_SIZE_1664;
+	size_t stack_bytes;
+	pte_t pte = { 0 };
+	void *small = NULL;
+	void *large = NULL;
+	int i, num_buffers, rc;
+	int network_cpus_count, cpu;
+	int ring, group, next_ring;
+	size_t comps_size = 0;
+	size_t notif_ring_size = 0;
+
+	if (!hash_default) {
+		netdev_err(dev, "Networking requires hash_default!\n");
+		return -EIO;
+	}
+
+	rc =  gxio_mpipe_init(&context, 0);
+	if (rc != 0) {
+		netdev_err(dev, "gxio_mpipe_init failed: %d\n", rc);
+		return -EIO;
+	}
+
+	network_cpus_count = cpus_weight(network_cpus_map);
+
+	num_buffers =
+		network_cpus_count * (IQUEUE_ENTRIES + TILE_NET_BATCH);
+
+	/* Compute stack bytes; we round up to 64KB and then use
+	 * alloc_pages() so we get the required 64KB alignment as well.
+	 */
+	stack_bytes = ALIGN(gxio_mpipe_calc_buffer_stack_bytes(num_buffers),
+			    64 * 1024);
+
+	/* Allocate two buffer stack indices. */
+	rc = gxio_mpipe_alloc_buffer_stacks(&context, 2, 0, 0);
+	if (rc < 0) {
+		netdev_err(dev, "gxio_mpipe_alloc_buffer_stacks failed: %d\n",
+			   rc);
+		goto fail;
+	}
+	small_buffer_stack = rc;
+	large_buffer_stack = rc + 1;
+
+	/* Allocate the small memory stack. */
+	small = alloc_pages_exact(stack_bytes, GFP_KERNEL);
+	if (small == NULL) {
+		netdev_err(dev,
+			   "Could not alloc %zd bytes for buffer stacks\n",
+			   stack_bytes);
+		rc = -ENOMEM;
+		goto fail;
+	}
+	rc = gxio_mpipe_init_buffer_stack(&context, small_buffer_stack,
+					  small_buf_size,
+					  small, stack_bytes, 0);
+	if (rc != 0) {
+		netdev_err(dev, "gxio_mpipe_init_buffer_stack: %d\n", rc);
+		goto fail;
+	}
+
+	/* Allocate the large buffer stack. */
+	large = alloc_pages_exact(stack_bytes, GFP_KERNEL);
+	if (large == NULL) {
+		netdev_err(dev,
+			   "Could not alloc %zd bytes for buffer stacks\n",
+			   stack_bytes);
+		rc = -ENOMEM;
+		goto fail;
+	}
+	rc = gxio_mpipe_init_buffer_stack(&context, large_buffer_stack,
+					  large_buf_size,
+					  large, stack_bytes, 0);
+	if (rc != 0) {
+		netdev_err(dev, "gxio_mpipe_init_buffer_stack failed: %d\n",
+			   rc);
+		goto fail;
+	}
+
+	/* Register all the client memory in mpipe TLBs. */
+	pte = pte_set_home(pte, PAGE_HOME_HASH);
+	rc = gxio_mpipe_register_client_memory(&context, small_buffer_stack,
+					       pte, 0);
+	if (rc != 0) {
+		netdev_err(dev,
+			   "gxio_mpipe_register_buffer_memory failed: %d\n",
+			   rc);
+		goto fail;
+	}
+	rc = gxio_mpipe_register_client_memory(&context, large_buffer_stack,
+					       pte, 0);
+	if (rc != 0) {
+		netdev_err(dev,
+			   "gxio_mpipe_register_buffer_memory failed: %d\n",
+			   rc);
+		goto fail;
+	}
+
+	/* Provide initial buffers. */
+	rc = -ENOMEM;
+	for (i = 0; i < num_buffers; i++) {
+		if (!tile_net_provide_buffer(true)) {
+			netdev_err(dev, "Cannot allocate initial sk_bufs!\n");
+			goto fail_pop;
+		}
+	}
+	for (i = 0; i < num_buffers; i++) {
+		if (!tile_net_provide_buffer(false)) {
+			netdev_err(dev, "Cannot allocate initial sk_bufs!\n");
+			goto fail_pop;
+		}
+	}
+
+	/* Allocate one NotifRing for each network cpu. */
+	rc = gxio_mpipe_alloc_notif_rings(&context, network_cpus_count,
+					  0, 0);
+	if (rc < 0) {
+		netdev_err(dev, "gxio_mpipe_alloc_notif_rings failed %d\n",
+			   rc);
+		goto fail_pop;
+	}
+
+	/* Init NotifRings. */
+	ring = rc;
+	next_ring = rc;
+
+	/* ISSUE: This is more than strictly necessary. */
+	comps_size = TILE_NET_CHANNELS * sizeof(struct tile_net_comps);
+
+	notif_ring_size = IQUEUE_ENTRIES * sizeof(gxio_mpipe_idesc_t);
+
+	for_each_online_cpu(cpu) {
+
+		int order;
+		struct page *page;
+		void *addr;
+
+		struct tile_net_info *info = &per_cpu(per_cpu_info, cpu);
+
+		/* Allocate the "comps". */
+		order = get_order(comps_size);
+		page = homecache_alloc_pages(GFP_KERNEL, order, cpu);
+		if (page == NULL) {
+			netdev_err(dev,
+				   "Failed to alloc %zd bytes comps memory\n",
+				   comps_size);
+			rc = -ENOMEM;
+			goto fail_pop;
+		}
+
+		addr = pfn_to_kaddr(page_to_pfn(page));
+		memset(addr, 0, comps_size);
+		for (i = 0; i < TILE_NET_CHANNELS; i++)
+			info->comps_for_echannel[i] =
+				addr + i * sizeof(struct tile_net_comps);
+
+		/* Only network cpus can receive packets. */
+		if (!cpu_isset(cpu, network_cpus_map))
+			continue;
+
+		/* Allocate the actual idescs array. */
+		order = get_order(notif_ring_size);
+		page = homecache_alloc_pages(GFP_KERNEL, order, cpu);
+		if (page == NULL) {
+			netdev_err(dev,
+				   "Failed to alloc %zd bytes iqueue memory\n",
+				   notif_ring_size);
+			rc = -ENOMEM;
+			goto fail_pop;
+		}
+		addr = pfn_to_kaddr(page_to_pfn(page));
+		rc = gxio_mpipe_iqueue_init(&info->iqueue, &context, next_ring,
+					    addr, notif_ring_size, 0);
+		if (rc != 0) {
+			netdev_err(dev,
+				   "gxio_mpipe_iqueue_init failed: %d\n", rc);
+			goto fail_pop;
+		}
+
+		info->has_iqueue = true;
+
+		next_ring++;
+	}
+
+	/* Allocate one NotifGroup. */
+	rc = gxio_mpipe_alloc_notif_groups(&context, 1, 0, 0);
+	if (rc < 0) {
+		netdev_err(dev, "gxio_mpipe_alloc_notif_groups failed: %d\n",
+			   rc);
+		goto fail_pop;
+	}
+	group = rc;
+
+	if (network_cpus_count > 4)
+		num_buckets = 256;
+	else if (network_cpus_count > 1)
+		num_buckets = 16;
+
+	/* Allocate some buckets. */
+	rc = gxio_mpipe_alloc_buckets(&context, num_buckets, 0, 0);
+	if (rc < 0) {
+		netdev_err(dev, "gxio_mpipe_alloc_buckets failed: %d\n", rc);
+		goto fail_pop;
+	}
+	first_bucket = rc;
+
+	/* Init group and buckets. */
+	rc = gxio_mpipe_init_notif_group_and_buckets(
+		&context, group, ring, network_cpus_count,
+		first_bucket, num_buckets,
+		GXIO_MPIPE_BUCKET_STICKY_FLOW_LOCALITY);
+
+	if (rc != 0) {
+		netdev_err(
+			dev,
+			"gxio_mpipe_init_notif_group_and_buckets failed: %d\n",
+			rc);
+		goto fail_pop;
+	}
+
+	/* Create an irq and register it. Note that "ingress_irq" being
+	 * initialized is how we know not to call this function again.
+	 */
+	rc = create_irq();
+	if (rc < 0) {
+		netdev_err(dev, "create_irq failed: %d\n", rc);
+		goto fail_pop;
+
+	}
+	ingress_irq = rc;
+	tile_irq_activate(ingress_irq, TILE_IRQ_PERCPU);
+	rc = request_irq(ingress_irq, tile_net_handle_ingress_irq,
+			 0, NULL, NULL);
+	if (rc != 0) {
+		netdev_err(dev, "request_irq failed: %d\n", rc);
+		destroy_irq(ingress_irq);
+		ingress_irq = -1;
+		goto fail_pop;
+	}
+
+	for_each_online_cpu(cpu) {
+		struct tile_net_info *info = &per_cpu(per_cpu_info, cpu);
+		if (info->has_iqueue) {
+			gxio_mpipe_request_notif_ring_interrupt(
+				&context, cpu_x(cpu), cpu_y(cpu),
+				1, ingress_irq, info->iqueue.ring);
+		}
+	}
+
+	return 0;
+
+fail_pop:
+	/* Do cleanups that require the mpipe context first. */
+	tile_net_pop_all_buffers(small_buffer_stack);
+	tile_net_pop_all_buffers(large_buffer_stack);
+
+fail:
+	/* Destroy mpipe context so the hardware no longer owns any memory. */
+	gxio_mpipe_destroy(&context);
+
+	for_each_online_cpu(cpu) {
+		struct tile_net_info *info = &per_cpu(per_cpu_info, cpu);
+		free_pages((unsigned long)(info->comps_for_echannel[0]),
+			   get_order(comps_size));
+		info->comps_for_echannel[0] = NULL;
+		free_pages((unsigned long)(info->iqueue.idescs),
+			   get_order(notif_ring_size));
+		info->iqueue.idescs = NULL;
+	}
+
+	if (small)
+		free_pages_exact(small, stack_bytes);
+	if (large)
+		free_pages_exact(large, stack_bytes);
+
+	large_buffer_stack = -1;
+	small_buffer_stack = -1;
+	first_bucket = -1;
+
+	return rc;
+}
+
+/* Create persistent egress info for a given egress channel.
+ *
+ * Note that this may be shared between, say, "gbe0" and "xgbe0".
+ *
+ * ISSUE: Defer header allocation until TSO is actually needed?
+ */
+static int tile_net_init_egress(struct net_device *dev, int echannel)
+{
+	size_t headers_order;
+	struct page *headers_page;
+	unsigned char *headers;
+
+	size_t edescs_size;
+	int edescs_order;
+	struct page *edescs_page;
+	gxio_mpipe_edesc_t *edescs;
+
+	int equeue_order;
+	struct page *equeue_page;
+	gxio_mpipe_equeue_t *equeue;
+	int edma;
+
+	int rc = -ENOMEM;
+
+	/* Only initialize once. */
+	if (egress_for_echannel[echannel].equeue != NULL)
+		return 0;
+
+	/* Allocate memory for the "headers". */
+	headers_order = get_order(EQUEUE_ENTRIES * HEADER_BYTES);
+	headers_page = alloc_pages(GFP_KERNEL, headers_order);
+	if (headers_page == NULL) {
+		netdev_warn(dev,
+			    "Could not alloc %zd bytes for TSO headers.\n",
+			    PAGE_SIZE << headers_order);
+		goto fail;
+	}
+	headers = pfn_to_kaddr(page_to_pfn(headers_page));
+
+	/* Allocate memory for the "edescs". */
+	edescs_size = EQUEUE_ENTRIES * sizeof(*edescs);
+	edescs_order = get_order(edescs_size);
+	edescs_page = alloc_pages(GFP_KERNEL, edescs_order);
+	if (edescs_page == NULL) {
+		netdev_warn(dev,
+			    "Could not alloc %zd bytes for eDMA ring.\n",
+			    edescs_size);
+		goto fail_headers;
+	}
+	edescs = pfn_to_kaddr(page_to_pfn(edescs_page));
+
+	/* Allocate memory for the "equeue". */
+	equeue_order = get_order(sizeof(*equeue));
+	equeue_page = alloc_pages(GFP_KERNEL, equeue_order);
+	if (equeue_page == NULL) {
+		netdev_warn(dev,
+			    "Could not alloc %zd bytes for equeue info.\n",
+			    PAGE_SIZE << equeue_order);
+		goto fail_edescs;
+	}
+	equeue = pfn_to_kaddr(page_to_pfn(equeue_page));
+
+	/* Allocate an edma ring.  Note that in practice this can't
+	 * fail, which is good, because we will leak an edma ring if so.
+	 */
+	rc = gxio_mpipe_alloc_edma_rings(&context, 1, 0, 0);
+	if (rc < 0) {
+		netdev_warn(dev, "gxio_mpipe_alloc_edma_rings failed: %d\n",
+			    rc);
+		goto fail_equeue;
+	}
+	edma = rc;
+
+	/* Initialize the equeue. */
+	rc = gxio_mpipe_equeue_init(equeue, &context, edma, echannel,
+				    edescs, edescs_size, 0);
+	if (rc != 0) {
+		netdev_err(dev, "gxio_mpipe_equeue_init failed: %d\n", rc);
+		goto fail_equeue;
+	}
+
+	/* Done. */
+	egress_for_echannel[echannel].equeue = equeue;
+	egress_for_echannel[echannel].headers = headers;
+	return 0;
+
+fail_equeue:
+	__free_pages(equeue_page, equeue_order);
+
+fail_edescs:
+	__free_pages(edescs_page, edescs_order);
+
+fail_headers:
+	__free_pages(headers_page, headers_order);
+
+fail:
+	return rc;
+}
+
+/* Return channel number for a newly-opened link. */
+static int tile_net_link_open(struct net_device *dev, gxio_mpipe_link_t *link,
+			      const char *link_name)
+{
+	int rc = gxio_mpipe_link_open(link, &context, link_name, 0);
+	if (rc < 0) {
+		netdev_err(dev, "Failed to open '%s'\n", link_name);
+		return rc;
+	}
+	rc = gxio_mpipe_link_channel(link);
+	if (rc < 0 || rc >= TILE_NET_CHANNELS) {
+		netdev_err(dev, "gxio_mpipe_link_channel bad value: %d\n", rc);
+		gxio_mpipe_link_close(link);
+		return -EINVAL;
+	}
+	return rc;
+}
+
+/* Help the kernel activate the given network interface. */
+static int tile_net_open(struct net_device *dev)
+{
+	struct tile_net_priv *priv = netdev_priv(dev);
+	int rc;
+
+	mutex_lock(&tile_net_devs_for_channel_mutex);
+
+	/* Do one-time initialization the first time any device is opened. */
+	if (ingress_irq < 0) {
+		rc = tile_net_init_mpipe(dev);
+		if (rc != 0)
+			goto fail;
+	}
+
+	/* Determine if this is the "loopify" device. */
+	if (unlikely((loopify_link_name != NULL) &&
+		     !strcmp(dev->name, loopify_link_name))) {
+		rc = tile_net_link_open(dev, &priv->link, "loop0");
+		if (rc < 0)
+			goto fail;
+		priv->channel = rc;
+		rc = tile_net_link_open(dev, &priv->loopify_link, "loop1");
+		if (rc < 0)
+			goto fail;
+		priv->loopify_channel = rc;
+		priv->echannel = rc;
+	} else {
+		rc = tile_net_link_open(dev, &priv->link, dev->name);
+		if (rc < 0)
+			goto fail;
+		priv->channel = rc;
+		priv->echannel = rc;
+	}
+
+	/* Initialize egress info (if needed).  Once ever, per echannel. */
+	rc = tile_net_init_egress(dev, priv->echannel);
+	if (rc != 0)
+		goto fail;
+
+	tile_net_devs_for_channel[priv->channel] = dev;
+
+	rc = tile_net_update(dev);
+	if (rc != 0)
+		goto fail;
+
+	mutex_unlock(&tile_net_devs_for_channel_mutex);
+
+	/* Start our transmit queue. */
+	netif_start_queue(dev);
+
+	netif_carrier_on(dev);
+
+	return 0;
+
+fail:
+	if (priv->loopify_channel >= 0) {
+		if (gxio_mpipe_link_close(&priv->loopify_link) != 0)
+			netdev_warn(dev, "Failed to close loopify link!\n");
+		priv->loopify_channel = -1;
+	}
+	if (priv->channel >= 0) {
+		if (gxio_mpipe_link_close(&priv->link) != 0)
+			netdev_warn(dev, "Failed to close link!\n");
+		priv->channel = -1;
+	}
+
+	priv->echannel = -1;
+
+	tile_net_devs_for_channel[priv->channel] = NULL;
+
+	mutex_unlock(&tile_net_devs_for_channel_mutex);
+
+	/* Don't return raw gxio error codes to generic Linux. */
+	return (rc > -512) ? rc : -EIO;
+}
+
+/* Help the kernel deactivate the given network interface. */
+static int tile_net_stop(struct net_device *dev)
+{
+	struct tile_net_priv *priv = netdev_priv(dev);
+
+	/* Stop our transmit queue. */
+	netif_stop_queue(dev);
+
+	mutex_lock(&tile_net_devs_for_channel_mutex);
+
+	tile_net_devs_for_channel[priv->channel] = NULL;
+
+	(void)tile_net_update(dev);
+
+	if (priv->loopify_channel >= 0) {
+		if (gxio_mpipe_link_close(&priv->loopify_link) != 0)
+			netdev_warn(dev, "Failed to close loopify link!\n");
+		priv->loopify_channel = -1;
+	}
+
+	if (priv->channel >= 0) {
+		if (gxio_mpipe_link_close(&priv->link) != 0)
+			netdev_warn(dev, "Failed to close link!\n");
+		priv->channel = -1;
+	}
+
+	priv->echannel = -1;
+
+	mutex_unlock(&tile_net_devs_for_channel_mutex);
+
+	return 0;
+}
+
+/* Determine the VA for a fragment. */
+static inline void *tile_net_frag_buf(skb_frag_t *f)
+{
+	unsigned long pfn = page_to_pfn(skb_frag_page(f));
+	return pfn_to_kaddr(pfn) + f->page_offset;
+}
+
+/* Used for paranoia to make sure we handle no ill-formed packets. */
+#define TSO_DROP_IF(cond) \
+	do { if (WARN_ON(cond)) return NETDEV_TX_OK; } while (0)
+
+
+/* This function takes "skb", consisting of a header template and a
+ * (presumably) huge payload, and egresses it as one or more segments
+ * (aka packets), each consisting of a (possibly modified) copy of the
+ * header plus a piece of the payload, via "tcp segmentation offload".
+ *
+ * Usually, "data" will contain the header template, of size "sh_len",
+ * and "sh->frags" will contain "skb->data_len" bytes of payload, and
+ * there will be "sh->gso_segs" segments.
+ *
+ * Sometimes, if "sendfile()" requires copying, we will be called with
+ * "data" containing the header and payload, with "frags" being empty.
+ *
+ * Sometimes, for example when using NFS over TCP, a single segment can
+ * span 3 fragments.  This requires special care below.
+ *
+ * See "emulate_large_send_offload()" for some reference code, which
+ * does not handle checksumming.
+ */
+static int tile_net_tx_tso(struct sk_buff *skb, struct net_device *dev)
+{
+	struct tile_net_priv *priv = netdev_priv(dev);
+
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+
+	struct tile_net_egress *egress = &egress_for_echannel[priv->echannel];
+	gxio_mpipe_equeue_t *equeue = egress->equeue;
+
+	struct tile_net_comps *comps =
+		info->comps_for_echannel[priv->echannel];
+
+	unsigned int len = skb->len;
+	unsigned char *data = skb->data;
+
+	/* The ip header follows the ethernet header. */
+	struct iphdr *ih = ip_hdr(skb);
+	unsigned int ih_len = ih->ihl * 4;
+
+	/* Note that "nh == iph", by definition. */
+	unsigned char *nh = skb_network_header(skb);
+	unsigned int eh_len = nh - data;
+
+	/* The tcp header follows the ip header. */
+	struct tcphdr *th = (struct tcphdr *)(nh + ih_len);
+	unsigned int th_len = th->doff * 4;
+
+	/* The total number of header bytes. */
+	unsigned int sh_len = eh_len + ih_len + th_len;
+
+	/* Help compute "jh->check". */
+	unsigned int isum_hack =
+		((0xFFFF - ih->check) +
+		 (0xFFFF - ih->tot_len) +
+		 (0xFFFF - ih->id));
+
+	/* Help compute "uh->check". */
+	unsigned int tsum_hack = th->check + (0xFFFF ^ htons(len));
+
+	struct skb_shared_info *sh = skb_shinfo(skb);
+
+	/* The maximum payload size. */
+	unsigned int gso_size = sh->gso_size;
+
+	/* The size of the initial segments (including header). */
+	unsigned int mtu = sh_len + gso_size;
+
+	/* The size of the final segment (including header). */
+	unsigned int mtu2 = len - ((sh->gso_segs - 1) * gso_size);
+
+	/* Track tx stats. */
+	unsigned int tx_packets = 0;
+	unsigned int tx_bytes = 0;
+
+	/* Which segment are we on. */
+	unsigned int segment;
+
+	/* Get the initial ip "id". */
+	u16 id = ntohs(ih->id);
+
+	/* Get the initial tcp "seq". */
+	u32 seq = ntohl(th->seq);
+
+	/* The id of the current fragment (or -1). */
+	long f_id;
+
+	/* The size of the current fragment (or -1). */
+	long f_size;
+
+	/* The bytes used from the current fragment (or -1). */
+	long f_used;
+
+	/* The size of the current piece of payload. */
+	long n;
+
+	/* Prepare checksum info. */
+	unsigned int csum_start = skb_checksum_start_offset(skb);
+
+	/* The header/payload edesc's. */
+	gxio_mpipe_edesc_t edesc_head = { { 0 } };
+	gxio_mpipe_edesc_t edesc_body = { { 0 } };
+
+	/* Total number of edescs needed. */
+	unsigned int num_edescs = 0;
+
+	unsigned long irqflags;
+
+	/* First reserved egress slot. */
+	s64 slot;
+
+	int cid;
+
+	/* Empty packets (etc) would cause trouble below. */
+	TSO_DROP_IF(skb->data_len == 0);
+	TSO_DROP_IF(sh->nr_frags == 0);
+	TSO_DROP_IF(sh->gso_segs == 0);
+
+	/* We assume the frags contain the entire payload. */
+	TSO_DROP_IF(skb_headlen(skb) != sh_len);
+	TSO_DROP_IF(len != sh_len + skb->data_len);
+
+	/* Implicitly verify "gso_segs" and "gso_size". */
+	TSO_DROP_IF(mtu2 > mtu);
+
+	/* We only have HEADER_BYTES for each header. */
+	TSO_DROP_IF(NET_IP_ALIGN + sh_len > HEADER_BYTES);
+
+	/* Paranoia. */
+	TSO_DROP_IF(skb->protocol != htons(ETH_P_IP));
+	TSO_DROP_IF(ih->protocol != IPPROTO_TCP);
+	TSO_DROP_IF(skb->ip_summed != CHECKSUM_PARTIAL);
+	TSO_DROP_IF(csum_start != eh_len + ih_len);
+
+	/* NOTE: ".hwb = 0", so ".size" is unused.
+	 * NOTE: ".stack_idx" determines the TLB.
+	 */
+
+	/* Prepare to egress the headers. */
+	edesc_head.csum = 1;
+	edesc_head.csum_start = csum_start;
+	edesc_head.csum_dest = csum_start + skb->csum_offset;
+	edesc_head.xfer_size = sh_len;
+	edesc_head.stack_idx = large_buffer_stack;
+
+	/* Prepare to egress the body. */
+	edesc_body.stack_idx = large_buffer_stack;
+
+	/* Reset. */
+	f_id = -1;
+	f_size = -1;
+	f_used = -1;
+
+	/* Determine how many edesc's are needed. */
+	for (segment = 0; segment < sh->gso_segs; segment++) {
+
+		/* Detect the final segment. */
+		bool final = (segment == sh->gso_segs - 1);
+
+		/* The segment size (including header). */
+		unsigned int s_len = final ? mtu2 : mtu;
+
+		/* The size of the payload. */
+		unsigned int p_len = s_len - sh_len;
+
+		/* The bytes used from the payload. */
+		unsigned int p_used = 0;
+
+		/* One edesc for the header. */
+		num_edescs++;
+
+		/* One edesc for each piece of the payload. */
+		while (p_used < p_len) {
+
+			/* Advance as needed. */
+			while (f_used >= f_size) {
+				f_id++;
+				f_size = sh->frags[f_id].size;
+				f_used = 0;
+			}
+
+			/* Use bytes from the current fragment. */
+			n = p_len - p_used;
+			if (n > f_size - f_used)
+				n = f_size - f_used;
+			f_used += n;
+			p_used += n;
+
+			num_edescs++;
+		}
+	}
+
+	/* Verify all fragments consumed. */
+	TSO_DROP_IF(f_id + 1 != sh->nr_frags);
+	TSO_DROP_IF(f_used != f_size);
+
+	local_irq_save(irqflags);
+
+	/* Reserve slots, or return NETDEV_TX_BUSY if "full". */
+	slot = gxio_mpipe_equeue_try_reserve(equeue, num_edescs);
+	if (slot < 0) {
+		local_irq_restore(irqflags);
+		/* ISSUE: "Virtual device xxx asks to queue packet". */
+		return NETDEV_TX_BUSY;
+	}
+
+	/* Reset. */
+	f_id = -1;
+	f_size = -1;
+	f_used = -1;
+
+	/* Prepare all the headers. */
+	for (segment = 0; segment < sh->gso_segs; segment++) {
+
+		/* Detect the final segment. */
+		bool final = (segment == sh->gso_segs - 1);
+
+		/* The segment size (including header). */
+		unsigned int s_len = final ? mtu2 : mtu;
+
+		/* The size of the payload. */
+		unsigned int p_len = s_len - sh_len;
+
+		/* The bytes used from the payload. */
+		unsigned int p_used = 0;
+
+		/* Access the header memory for this segment. */
+		unsigned int bn = slot % EQUEUE_ENTRIES;
+		unsigned char *buf =
+			egress->headers + bn * HEADER_BYTES + NET_IP_ALIGN;
+
+		/* The soon-to-be copied "ip" header. */
+		struct iphdr *jh = (struct iphdr *)(buf + eh_len);
+
+		/* The soon-to-be copied "tcp" header. */
+		struct tcphdr *uh = (struct tcphdr *)(buf + eh_len + ih_len);
+
+		unsigned int jsum;
+
+		/* Copy the header. */
+		memcpy(buf, data, sh_len);
+
+		/* The packet size, not including ethernet header. */
+		jh->tot_len = htons(s_len - eh_len);
+
+		/* Update the ip "id". */
+		jh->id = htons(id);
+
+		/* Compute the "ip checksum". */
+		jsum = isum_hack + htons(s_len - eh_len) + htons(id);
+		jh->check = csum_long(jsum) ^ 0xffff;
+
+		/* Update the tcp "seq". */
+		uh->seq = htonl(seq);
+
+		/* Update some flags. */
+		if (!final) {
+			uh->fin = 0;
+			uh->psh = 0;
+		}
+
+		/* Compute the tcp pseudo-header checksum. */
+		uh->check = csum_long(tsum_hack + htons(s_len));
+
+		/* Skip past the header. */
+		slot++;
+
+		/* Skip past the payload. */
+		while (p_used < p_len) {
+
+			/* Advance as needed. */
+			while (f_used >= f_size) {
+				f_id++;
+				f_size = sh->frags[f_id].size;
+				f_used = 0;
+			}
+
+			/* Use bytes from the current fragment. */
+			n = p_len - p_used;
+			if (n > f_size - f_used)
+				n = f_size - f_used;
+			f_used += n;
+			p_used += n;
+
+			slot++;
+		}
+
+		id++;
+		seq += p_len;
+	}
+
+	/* Reset "slot". */
+	slot -= num_edescs;
+
+	/* Flush the headers. */
+	wmb();
+
+	/* Reset. */
+	f_id = -1;
+	f_size = -1;
+	f_used = -1;
+
+	/* Egress all the edescs. */
+	for (segment = 0; segment < sh->gso_segs; segment++) {
+
+		/* Detect the final segment. */
+		bool final = (segment == sh->gso_segs - 1);
+
+		/* The segment size (including header). */
+		unsigned int s_len = final ? mtu2 : mtu;
+
+		/* The size of the payload. */
+		unsigned int p_len = s_len - sh_len;
+
+		/* The bytes used from the payload. */
+		unsigned int p_used = 0;
+
+		/* Access the header memory for this segment. */
+		unsigned int bn = slot % EQUEUE_ENTRIES;
+		unsigned char *buf =
+			egress->headers + bn * HEADER_BYTES + NET_IP_ALIGN;
+
+		void *va;
+
+		/* Egress the header. */
+		edesc_head.va = va_to_tile_io_addr(buf);
+		gxio_mpipe_equeue_put_at(equeue, edesc_head, slot);
+		slot++;
+
+		/* Egress the payload. */
+		while (p_used < p_len) {
+
+			/* Advance as needed. */
+			while (f_used >= f_size) {
+				f_id++;
+				f_size = sh->frags[f_id].size;
+				f_used = 0;
+			}
+
+			va = tile_net_frag_buf(&sh->frags[f_id]) + f_used;
+
+			/* Use bytes from the current fragment. */
+			n = p_len - p_used;
+			if (n > f_size - f_used)
+				n = f_size - f_used;
+			f_used += n;
+			p_used += n;
+
+			/* Egress a piece of the payload. */
+			edesc_body.va = va_to_tile_io_addr(va);
+			edesc_body.xfer_size = n;
+			edesc_body.bound = !(p_used < p_len);
+			gxio_mpipe_equeue_put_at(equeue, edesc_body, slot);
+			slot++;
+		}
+
+		tx_packets++;
+		tx_bytes += s_len;
+	}
+
+	/* Wait for a free completion entry.
+	 * ISSUE: Is this the best logic?
+	 * ISSUE: Can this cause undesirable "blocking"?
+	 */
+	while (comps->comp_next - comps->comp_last >= TILE_NET_MAX_COMPS - 1)
+		tile_net_free_comps(equeue, comps, 32, false);
+
+	/* Update the completions array. */
+	cid = comps->comp_next % TILE_NET_MAX_COMPS;
+	comps->comp_queue[cid].when = slot;
+	comps->comp_queue[cid].skb = skb;
+	comps->comp_next++;
+
+	/* Update stats. */
+	atomic_add(tx_packets, (atomic_t *)&priv->stats.tx_packets);
+	atomic_add(tx_bytes, (atomic_t *)&priv->stats.tx_bytes);
+
+	local_irq_restore(irqflags);
+
+	/* Make sure the egress timer is scheduled. */
+	tile_net_schedule_egress_timer(info);
+
+	return NETDEV_TX_OK;
+}
+
+/* Analyze the body and frags for a transmit request. */
+static unsigned int tile_net_tx_frags(struct frag *frags,
+				       struct sk_buff *skb,
+				       void *b_data, unsigned int b_len)
+{
+	unsigned int i, n = 0;
+
+	struct skb_shared_info *sh = skb_shinfo(skb);
+
+	if (b_len != 0) {
+		frags[n].buf = b_data;
+		frags[n++].length = b_len;
+	}
+
+	for (i = 0; i < sh->nr_frags; i++) {
+		skb_frag_t *f = &sh->frags[i];
+		frags[n].buf = tile_net_frag_buf(f);
+		frags[n++].length = skb_frag_size(f);
+	}
+
+	return n;
+}
+
+/* Help the kernel transmit a packet. */
+static int tile_net_tx(struct sk_buff *skb, struct net_device *dev)
+{
+	struct tile_net_priv *priv = netdev_priv(dev);
+
+	struct tile_net_info *info = &__get_cpu_var(per_cpu_info);
+
+	struct tile_net_egress *egress = &egress_for_echannel[priv->echannel];
+	gxio_mpipe_equeue_t *equeue = egress->equeue;
+
+	struct tile_net_comps *comps =
+		info->comps_for_echannel[priv->echannel];
+
+	struct skb_shared_info *sh = skb_shinfo(skb);
+
+	unsigned int len = skb->len;
+	unsigned char *data = skb->data;
+
+	unsigned int num_frags;
+	struct frag frags[MAX_FRAGS];
+	gxio_mpipe_edesc_t edescs[MAX_FRAGS];
+
+	unsigned int i;
+
+	int cid;
+
+	s64 slot;
+
+	unsigned long irqflags;
+
+	/* Save the timestamp. */
+	dev->trans_start = jiffies;
+
+#ifdef TILE_NET_DUMP_PACKETS
+	/* ISSUE: Does not dump the "frags". */
+	dump_packet(data, skb_headlen(skb), "tx");
+#endif
+
+	if (sh->gso_size != 0)
+		return tile_net_tx_tso(skb, dev);
+
+	/* NOTE: This is usually 2, sometimes 3, for big writes. */
+	num_frags = tile_net_tx_frags(frags, skb, data, skb_headlen(skb));
+
+	/* Prepare the edescs. */
+	for (i = 0; i < num_frags; i++) {
+
+		/* NOTE: ".hwb = 0", so ".size" is unused.
+		 * NOTE: ".stack_idx" determines the TLB.
+		 */
+
+		gxio_mpipe_edesc_t edesc = { { 0 } };
+
+		/* Prepare the basic command. */
+		edesc.bound = (i == num_frags - 1);
+		edesc.xfer_size = frags[i].length;
+		edesc.va = va_to_tile_io_addr(frags[i].buf);
+		edesc.stack_idx = large_buffer_stack;
+
+		edescs[i] = edesc;
+	}
+
+	/* Add checksum info if needed. */
+	if (skb->ip_summed == CHECKSUM_PARTIAL) {
+		unsigned int csum_start = skb->csum_start - skb_headroom(skb);
+		edescs[0].csum = 1;
+		edescs[0].csum_start = csum_start;
+		edescs[0].csum_dest = csum_start + skb->csum_offset;
+	}
+
+	local_irq_save(irqflags);
+
+	/* Reserve slots, or return NETDEV_TX_BUSY if "full". */
+	slot = gxio_mpipe_equeue_try_reserve(equeue, num_frags);
+	if (slot < 0) {
+		local_irq_restore(irqflags);
+		/* ISSUE: "Virtual device xxx asks to queue packet". */
+		return NETDEV_TX_BUSY;
+	}
+
+	for (i = 0; i < num_frags; i++)
+		gxio_mpipe_equeue_put_at(equeue, edescs[i], slot + i);
+
+	/* Wait for a free completion entry.
+	 * ISSUE: Is this the best logic?
+	 * ISSUE: Can this cause undesirable "blocking"?
+	 */
+	while (comps->comp_next - comps->comp_last >= TILE_NET_MAX_COMPS - 1)
+		tile_net_free_comps(equeue, comps, 32, false);
+
+	/* Update the completions array. */
+	cid = comps->comp_next % TILE_NET_MAX_COMPS;
+	comps->comp_queue[cid].when = slot + num_frags;
+	comps->comp_queue[cid].skb = skb;
+	comps->comp_next++;
+
+	/* HACK: Track "expanded" size for short packets (e.g. 42 < 60). */
+	atomic_add(1, (atomic_t *)&priv->stats.tx_packets);
+	atomic_add((len >= ETH_ZLEN) ? len : ETH_ZLEN,
+		   (atomic_t *)&priv->stats.tx_bytes);
+
+	local_irq_restore(irqflags);
+
+	/* Make sure the egress timer is scheduled. */
+	tile_net_schedule_egress_timer(info);
+
+	return NETDEV_TX_OK;
+}
+
+/* Deal with a transmit timeout. */
+static void tile_net_tx_timeout(struct net_device *dev)
+{
+	/* ISSUE: This doesn't seem useful for us. */
+	netif_wake_queue(dev);
+}
+
+/* Ioctl commands. */
+static int tile_net_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
+{
+	return -EOPNOTSUPP;
+}
+
+/* Get system network statistics for device. */
+static struct net_device_stats *tile_net_get_stats(struct net_device *dev)
+{
+	struct tile_net_priv *priv = netdev_priv(dev);
+	return &priv->stats;
+}
+
+/* Change the MTU. */
+static int tile_net_change_mtu(struct net_device *dev, int new_mtu)
+{
+	/* Check ranges. */
+	if ((new_mtu < 68) || (new_mtu > 1500))
+		return -EINVAL;
+
+	/* Accept the value. */
+	dev->mtu = new_mtu;
+
+	return 0;
+}
+
+/* Change the Ethernet address of the NIC.
+ *
+ * The hypervisor driver does not support changing MAC address.  However,
+ * the hardware does not do anything with the MAC address, so the address
+ * which gets used on outgoing packets, and which is accepted on incoming
+ * packets, is completely up to us.
+ *
+ * Returns 0 on success, negative on failure.
+ */
+static int tile_net_set_mac_address(struct net_device *dev, void *p)
+{
+	struct sockaddr *addr = p;
+
+	if (!is_valid_ether_addr(addr->sa_data))
+		return -EINVAL;
+
+	memcpy(dev->dev_addr, addr->sa_data, dev->addr_len);
+
+	return 0;
+}
+
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/* Polling 'interrupt' - used by things like netconsole to send skbs
+ * without having to re-enable interrupts. It's not called while
+ * the interrupt routine is executing.
+ */
+static void tile_net_netpoll(struct net_device *dev)
+{
+	disable_percpu_irq(ingress_irq);
+	tile_net_handle_ingress_irq(ingress_irq, NULL);
+	enable_percpu_irq(ingress_irq, 0);
+}
+#endif
+
+static const struct net_device_ops tile_net_ops = {
+	.ndo_open = tile_net_open,
+	.ndo_stop = tile_net_stop,
+	.ndo_start_xmit = tile_net_tx,
+	.ndo_do_ioctl = tile_net_ioctl,
+	.ndo_get_stats = tile_net_get_stats,
+	.ndo_change_mtu = tile_net_change_mtu,
+	.ndo_tx_timeout = tile_net_tx_timeout,
+	.ndo_set_mac_address = tile_net_set_mac_address,
+#ifdef CONFIG_NET_POLL_CONTROLLER
+	.ndo_poll_controller = tile_net_netpoll,
+#endif
+};
+
+/* The setup function.
+ *
+ * This uses ether_setup() to assign various fields in dev, including
+ * setting IFF_BROADCAST and IFF_MULTICAST, then sets some extra fields.
+ */
+static void tile_net_setup(struct net_device *dev)
+{
+	ether_setup(dev);
+
+	dev->netdev_ops = &tile_net_ops;
+	dev->watchdog_timeo = TILE_NET_TIMEOUT;
+
+	/* We want lockless xmit. */
+	dev->features |= NETIF_F_LLTX;
+
+	/* We support hardware tx checksums. */
+	dev->features |= NETIF_F_HW_CSUM;
+
+	/* We support scatter/gather. */
+	dev->features |= NETIF_F_SG;
+
+#ifdef TILE_NET_GSO
+	/* We support GSO. */
+	dev->features |= NETIF_F_GSO;
+#endif
+
+#ifdef TILE_NET_TSO
+	/* We support TSO. */
+	dev->features |= NETIF_F_TSO;
+#endif
+
+	dev->tx_queue_len = TILE_NET_TX_QUEUE_LEN;
+
+	dev->mtu = 1500;
+}
+
+/* Allocate the device structure, register the device, and obtain the
+ * MAC address from the hypervisor.
+ */
+static void tile_net_dev_init(const char *name, const uint8_t *mac)
+{
+	int ret;
+	int i;
+	int nz_addr = 0;
+	struct net_device *dev;
+	struct tile_net_priv *priv;
+
+	/* HACK: Ignore "loop" links. */
+	if (strncmp(name, "loop", 4) == 0)
+		return;
+
+	/* Allocate the device structure.  This allocates "priv", calls
+	 * tile_net_setup(), and saves "name".  Normally, "name" is a
+	 * template, instantiated by register_netdev(), but not for us.
+	 */
+	dev = alloc_netdev(sizeof(*priv), name, tile_net_setup);
+	if (!dev) {
+		pr_err("alloc_netdev(%s) failed\n", name);
+		return;
+	}
+
+	priv = netdev_priv(dev);
+
+	/* Initialize "priv". */
+
+	memset(priv, 0, sizeof(*priv));
+
+	priv->dev = dev;
+
+	priv->channel = -1;
+	priv->loopify_channel = -1;
+	priv->echannel = -1;
+
+	/* Register the network device. */
+	ret = register_netdev(dev);
+	if (ret) {
+		netdev_err(dev, "register_netdev failed %d\n", ret);
+		free_netdev(dev);
+		return;
+	}
+
+	/* Get the MAC address and set it in the device struct; this must
+	 * be done before the device is opened.  If the MAC is all zeroes,
+	 * we use a random address, since we're probably on the simulator.
+	 */
+	for (i = 0; i < 6; i++)
+		nz_addr |= mac[i];
+
+	if (nz_addr) {
+		memcpy(dev->dev_addr, mac, 6);
+		dev->addr_len = 6;
+	} else {
+		random_ether_addr(dev->dev_addr);
+	}
+}
+
+/* Module initialization. */
+static int __init tile_net_init_module(void)
+{
+	int i;
+	char name[GXIO_MPIPE_LINK_NAME_LEN];
+	uint8_t mac[6];
+
+	pr_info("Tilera Network Driver\n");
+
+	mutex_init(&tile_net_devs_for_channel_mutex);
+
+	/* Initialize each CPU. */
+	on_each_cpu(tile_net_prepare_cpu, NULL, 1);
+
+	/* Find out what devices we have, and initialize them. */
+	for (i = 0; gxio_mpipe_link_enumerate_mac(i, name, mac) >= 0; i++)
+		tile_net_dev_init(name, mac);
+
+	if (!network_cpus_init())
+		network_cpus_map = *cpu_online_mask;
+
+	return 0;
+}
+
+module_init(tile_net_init_module);
-- 
1.6.5.2

^ permalink raw reply related

* Re: [PATCH] net: skb_set_dev do not unconditionally drop ref to dst
From: Frank Blaschka @ 2012-05-09 10:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, arnd, linux-s390, eric.dumazet
In-Reply-To: <20120503.022837.2021054007853235758.davem@davemloft.net>

On Thu, May 03, 2012 at 02:28:37AM -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 02 May 2012 08:59:21 +0200
> 
> > If Arnd doesnt care to even reply, we can just revert his buggy
> > patch, instead of trying to understand and fix all issues.
> 
> Agreed, it's the best sounding solution by far.
Hi Dave,

looks like discussion is done and everybody is ok with reverting
Arnds patch. Can you revert commit 8a83a00b0735190384a348156837918271034144 ?
Thx

Fran ?
Thx

Frank

> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply

* RE: [PATCH] pch_gbe: Adding read memory barriers
From: David Laight @ 2012-05-09 10:47 UTC (permalink / raw)
  To: Erwan Velu, netdev, linux-kernel; +Cc: tshimizu818
In-Reply-To: <4FA822C4.60809@gmail.com>

 
> Under a strong incoming packet stream and/or high cpu usage,
> the pch_gbe driver reports "Receive CRC Error" and discards packet.
> 
> It occurred on an Intel ATOM E620T while running a 
> 300mbit/sec multicast
> network stream leading to a ~100% cpu usage.
> 
> Adding rmb() calls before considering the network card's status solve
> this issue.
> 
> Getting it into stable would be perfect as it solves 
> reliability issues.
> 
> Signed-off-by: Erwan Velu <erwan.velu@zodiacaerospace.com>
> ---
>   .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   |    3 +++
>   1 files changed, 3 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c 
> b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
> index 8035e5f..ace3654 100644
> --- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
> +++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
> @@ -1413,6 +1413,7 @@ static irqreturn_t pch_gbe_intr(int 
> irq, void *data)
>   			pch_gbe_mac_set_pause_packet(hw);
>   		}
>   	}
> +	smp_rmb(); /* prevent any other reads before*/

Under the assumption that your memory references are uncached,
you only need to stop gcc reordering the object code,
Rather than actually adding one of the 'fence' instructions.

So you should only need: asm volatile(:::"memory")
NFI which define generates that, the defines in the copy of
sysdep.h I just looked at always include one of the fences.

	David

^ permalink raw reply

* Re: [v12 PATCH 1/3] NETFILTER added flags to ipv6_find_hdr()
From: Pablo Neira Ayuso @ 2012-05-09 11:01 UTC (permalink / raw)
  To: Hans Schillstrom; +Cc: kaber, jengelh, netfilter-devel, netdev, hans
In-Reply-To: <1335188128-23645-2-git-send-email-hans.schillstrom@ericsson.com>

I have applied this with minor changes.

BTW, please use the following patch tagging next time, I'll save time:

netfilter: ip6_tables: add flags parameter to ipv6_find_hdr()

note the initial netfilter, then ip6_tables, then the description.

This is useful for grepping.

More minor glitches:

On Mon, Apr 23, 2012 at 03:35:26PM +0200, Hans Schillstrom wrote:
> Two new flags to ipv6_find_hdr,
> One that tells us that this is a fragment.
> One that stops at AH if any i.e. treat it like a transport header.
> i.e. make handling of ESP and AH the same.
> Param offset can now point to an inner icmp ipv5 header.
> 
> Version 3:
>     offset param into ipv6_find_hdr set to zero.
> 
> Version 2:
>     wrapper removed and changes made at every call.
> 
> Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
> ---
>  include/linux/netfilter_ipv6/ip6_tables.h |   12 +++++++++-
>  net/ipv6/netfilter/ip6_tables.c           |   35 ++++++++++++++++++++++++----
>  net/ipv6/netfilter/ip6t_ah.c              |    4 +-
>  net/ipv6/netfilter/ip6t_frag.c            |    4 +-
>  net/ipv6/netfilter/ip6t_hbh.c             |    4 +-
>  net/ipv6/netfilter/ip6t_rt.c              |    4 +-
>  net/netfilter/xt_TPROXY.c                 |    4 +-
>  net/netfilter/xt_socket.c                 |    4 +-
>  8 files changed, 53 insertions(+), 18 deletions(-)
> 
> diff --git a/include/linux/netfilter_ipv6/ip6_tables.h b/include/linux/netfilter_ipv6/ip6_tables.h
> index 1bc898b..d96a39d 100644
> --- a/include/linux/netfilter_ipv6/ip6_tables.h
> +++ b/include/linux/netfilter_ipv6/ip6_tables.h
> @@ -287,6 +287,7 @@ extern unsigned int ip6t_do_table(struct sk_buff *skb,
>  				  struct xt_table *table);
>  
>  /* Check for an extension */
> +

removed this extra line.

>  static inline int
>  ip6t_ext_hdr(u8 nexthdr)
>  {	return (nexthdr == IPPROTO_HOPOPTS) ||
> @@ -298,9 +299,18 @@ ip6t_ext_hdr(u8 nexthdr)
>  	       (nexthdr == IPPROTO_DSTOPTS);
>  }
>  
> +

removed double extra line.

> +extern int ip6t_ext_hdr(u8 nexthdr);
> +enum {
> +	IP6T_FH_FRAG,
> +	IP6T_FH_AUTH,

removed these two above, the are not used anywhere in the code.

> +	IP6T_FH_F_FRAG = 1 << IP6T_FH_FRAG,
> +	IP6T_FH_F_AUTH = 1 << IP6T_FH_AUTH,
> +};
> +
>  /* find specified header and get offset to it */
>  extern int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
> -			 int target, unsigned short *fragoff);
> +			 int target, unsigned short *fragoff, int *fragflg);
>  
>  #ifdef CONFIG_COMPAT
>  #include <net/compat.h>
> diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
> index d4e350f..1f18662 100644
> --- a/net/ipv6/netfilter/ip6_tables.c
> +++ b/net/ipv6/netfilter/ip6_tables.c
> @@ -133,7 +133,7 @@ ip6_packet_match(const struct sk_buff *skb,
>  		int protohdr;
>  		unsigned short _frag_off;
>  
> -		protohdr = ipv6_find_hdr(skb, protoff, -1, &_frag_off);
> +		protohdr = ipv6_find_hdr(skb, protoff, -1, &_frag_off, NULL);
>  		if (protohdr < 0) {
>  			if (_frag_off == 0)
>  				*hotdrop = true;
> @@ -362,6 +362,7 @@ ip6t_do_table(struct sk_buff *skb,
>  		const struct xt_entry_match *ematch;
>  
>  		IP_NF_ASSERT(e);
> +		acpar.thoff = 0;
>  		if (!ip6_packet_match(skb, indev, outdev, &e->ipv6,
>  		    &acpar.thoff, &acpar.fragoff, &acpar.hotdrop)) {
>   no_match:
> @@ -2277,6 +2278,8 @@ static void __exit ip6_tables_fini(void)
>   * find the offset to specified header or the protocol number of last header
>   * if target < 0. "last header" is transport protocol header, ESP, or
>   * "No next header".
> + * Note, *offset is used as input param. an if != 0
> + * it must be an offset to an inner ipv6 header ex. icmp error
>   *
>   * If target header is found, its offset is set in *offset and return protocol
>   * number. Otherwise, return -1.
> @@ -2289,17 +2292,34 @@ static void __exit ip6_tables_fini(void)
>   * *offset is meaningless and fragment offset is stored in *fragoff if fragoff
>   * isn't NULL.
>   *
> + * if flags != NULL AND
> + *    it's a fragment the frag flag "IP6T_FH_F_FRAG" will be set
> + *    it's an AH header and IP6T_FH_F_AUTH is set and target < 0
> + *      stop at AH (i.e. treat is as a transport header)

I've cleaned up these comments. The format does not look very orthodox
(I'm not blaming your English, but the way the text is organized).

^ permalink raw reply

* [PATCH v3][resend] bonding: don't increase rx_dropped after processing LACPDUs
From: Jiri Bohac @ 2012-05-09 11:01 UTC (permalink / raw)
  To: David Miller; +Cc: andy, netdev, fubar

Since commit 3aba891d, bonding processes LACP frames (802.3ad
mode) with bond_handle_frame(). Currently a copy of the skb is
made and the original is left to be processed by other
rx_handlers and the rest of the network stack by returning
RX_HANDLER_ANOTHER.  As there is no protocol handler for
PKT_TYPE_LACPDU, the frame is dropped and dev->rx_dropped
increased.

Fix this by making bond_handle_frame() return RX_HANDLER_CONSUMED
if bonding has processed the LACP frame.

Signed-off-by: Jiri Bohac <jbohac@suse.cz>

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2173,9 +2173,10 @@ re_arm:
  * received frames (loopback). Since only the payload is given to this
  * function, it check for loopback.
  */
-static void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 length)
+static int bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u16 length)
 {
 	struct port *port;
+	int ret = RX_HANDLER_ANOTHER;
 
 	if (length >= sizeof(struct lacpdu)) {
 
@@ -2184,11 +2185,12 @@ static void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u
 		if (!port->slave) {
 			pr_warning("%s: Warning: port of slave %s is uninitialized\n",
 				   slave->dev->name, slave->dev->master->name);
-			return;
+			return ret;
 		}
 
 		switch (lacpdu->subtype) {
 		case AD_TYPE_LACPDU:
+			ret = RX_HANDLER_CONSUMED;
 			pr_debug("Received LACPDU on port %d\n",
 				 port->actor_port_number);
 			/* Protect against concurrent state machines */
@@ -2198,6 +2200,7 @@ static void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u
 			break;
 
 		case AD_TYPE_MARKER:
+			ret = RX_HANDLER_CONSUMED;
 			// No need to convert fields to Little Endian since we don't use the marker's fields.
 
 			switch (((struct bond_marker *)lacpdu)->tlv_type) {
@@ -2219,6 +2222,7 @@ static void bond_3ad_rx_indication(struct lacpdu *lacpdu, struct slave *slave, u
 			}
 		}
 	}
+	return ret;
 }
 
 /**
@@ -2456,18 +2460,20 @@ out:
 	return NETDEV_TX_OK;
 }
 
-void bond_3ad_lacpdu_recv(struct sk_buff *skb, struct bonding *bond,
+int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct bonding *bond,
 			  struct slave *slave)
 {
+	int ret = RX_HANDLER_ANOTHER;
 	if (skb->protocol != PKT_TYPE_LACPDU)
-		return;
+		return ret;
 
 	if (!pskb_may_pull(skb, sizeof(struct lacpdu)))
-		return;
+		return ret;
 
 	read_lock(&bond->lock);
-	bond_3ad_rx_indication((struct lacpdu *) skb->data, slave, skb->len);
+	ret = bond_3ad_rx_indication((struct lacpdu *) skb->data, slave, skb->len);
 	read_unlock(&bond->lock);
+	return ret;
 }
 
 /*
diff --git a/drivers/net/bonding/bond_3ad.h b/drivers/net/bonding/bond_3ad.h
index 235b2cc..5ee7e3c 100644
--- a/drivers/net/bonding/bond_3ad.h
+++ b/drivers/net/bonding/bond_3ad.h
@@ -274,7 +274,7 @@ void bond_3ad_adapter_duplex_changed(struct slave *slave);
 void bond_3ad_handle_link_change(struct slave *slave, char link);
 int  bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info);
 int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev);
-void bond_3ad_lacpdu_recv(struct sk_buff *skb, struct bonding *bond,
+int bond_3ad_lacpdu_recv(struct sk_buff *skb, struct bonding *bond,
 			  struct slave *slave);
 int bond_3ad_set_carrier(struct bonding *bond);
 void bond_3ad_update_lacp_rate(struct bonding *bond);
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 62d2409..0a0f4a6 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1444,8 +1444,9 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 	struct sk_buff *skb = *pskb;
 	struct slave *slave;
 	struct bonding *bond;
-	void (*recv_probe)(struct sk_buff *, struct bonding *,
+	int (*recv_probe)(struct sk_buff *, struct bonding *,
 				struct slave *);
+	int ret = RX_HANDLER_ANOTHER;
 
 	skb = skb_share_check(skb, GFP_ATOMIC);
 	if (unlikely(!skb))
@@ -1464,8 +1465,12 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 		struct sk_buff *nskb = skb_clone(skb, GFP_ATOMIC);
 
 		if (likely(nskb)) {
-			recv_probe(nskb, bond, slave);
+			ret = recv_probe(nskb, bond, slave);
 			dev_kfree_skb(nskb);
+			if (ret == RX_HANDLER_CONSUMED) {
+				consume_skb(skb);
+				return ret;
+			}
 		}
 	}
 
@@ -1487,7 +1492,7 @@ static rx_handler_result_t bond_handle_frame(struct sk_buff **pskb)
 		memcpy(eth_hdr(skb)->h_dest, bond->dev->dev_addr, ETH_ALEN);
 	}
 
-	return RX_HANDLER_ANOTHER;
+	return ret;
 }
 
 /* enslave device <slave> to bond device <master> */
@@ -2723,7 +2728,7 @@ static void bond_validate_arp(struct bonding *bond, struct slave *slave, __be32
 	}
 }
 
-static void bond_arp_rcv(struct sk_buff *skb, struct bonding *bond,
+static int bond_arp_rcv(struct sk_buff *skb, struct bonding *bond,
 			 struct slave *slave)
 {
 	struct arphdr *arp;
@@ -2731,7 +2736,7 @@ static void bond_arp_rcv(struct sk_buff *skb, struct bonding *bond,
 	__be32 sip, tip;
 
 	if (skb->protocol != __cpu_to_be16(ETH_P_ARP))
-		return;
+		return RX_HANDLER_ANOTHER;
 
 	read_lock(&bond->lock);
 
@@ -2776,6 +2781,7 @@ static void bond_arp_rcv(struct sk_buff *skb, struct bonding *bond,
 
 out_unlock:
 	read_unlock(&bond->lock);
+	return RX_HANDLER_ANOTHER;
 }
 
 /*

-- 
Jiri Bohac <jbohac@suse.cz>
SUSE Labs, SUSE CZ

^ permalink raw reply related

* [PATCH 02/14] batman-adv: introduce is_single_hop_neigh variable to increase readability
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |   16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 8b2db2e..cd8f473 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -480,7 +480,8 @@ static void bat_iv_ogm_queue_add(struct bat_priv *bat_priv,
 static void bat_iv_ogm_forward(struct orig_node *orig_node,
 			       const struct ethhdr *ethhdr,
 			       struct batman_ogm_packet *batman_ogm_packet,
-			       int directlink, struct hard_iface *if_incoming)
+			       bool is_single_hop_neigh,
+			       struct hard_iface *if_incoming)
 {
 	struct bat_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
 	struct neigh_node *router;
@@ -533,7 +534,7 @@ static void bat_iv_ogm_forward(struct orig_node *orig_node,
 
 	/* switch of primaries first hop flag when forwarding */
 	batman_ogm_packet->flags &= ~PRIMARIES_FIRST_HOP;
-	if (directlink)
+	if (is_single_hop_neigh)
 		batman_ogm_packet->flags |= DIRECTLINK;
 	else
 		batman_ogm_packet->flags &= ~DIRECTLINK;
@@ -918,7 +919,8 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 	struct neigh_node *orig_neigh_router = NULL;
 	int has_directlink_flag;
 	int is_my_addr = 0, is_my_orig = 0, is_my_oldorig = 0;
-	int is_broadcast = 0, is_bidirectional, is_single_hop_neigh;
+	int is_broadcast = 0, is_bidirectional;
+	bool is_single_hop_neigh = false;
 	int is_duplicate;
 	uint32_t if_incoming_seqno;
 
@@ -942,8 +944,8 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 
 	has_directlink_flag = (batman_ogm_packet->flags & DIRECTLINK ? 1 : 0);
 
-	is_single_hop_neigh = (compare_eth(ethhdr->h_source,
-					   batman_ogm_packet->orig) ? 1 : 0);
+	if (compare_eth(ethhdr->h_source, batman_ogm_packet->orig))
+		is_single_hop_neigh = true;
 
 	bat_dbg(DBG_BATMAN, bat_priv,
 		"Received BATMAN packet via NB: %pM, IF: %s [%pM] (from OG: %pM, via prev OG: %pM, seqno %u, ttvn %u, crc %u, changes %u, td %d, TTL %d, V %d, IDF %d)\n",
@@ -1114,7 +1116,7 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 
 		/* mark direct link on incoming interface */
 		bat_iv_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-				   1, if_incoming);
+				   is_single_hop_neigh, if_incoming);
 
 		bat_dbg(DBG_BATMAN, bat_priv,
 			"Forwarding packet: rebroadcast neighbor packet with direct link flag\n");
@@ -1137,7 +1139,7 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 	bat_dbg(DBG_BATMAN, bat_priv,
 		"Forwarding packet: rebroadcast originator packet\n");
 	bat_iv_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-			   0, if_incoming);
+			   is_single_hop_neigh, if_incoming);
 
 out_neigh:
 	if ((orig_neigh_node) && (!is_single_hop_neigh))
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 04/14] batman-adv: register batman ogm receive function during protocol init
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

The B.A.T.M.A.N. IV OGM receive function still was hard-coded although
it is a routing protocol specific function. This patch takes advantage
of the dynamic packet handler registration to remove the hard-coded
function calls.

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |   31 +++++++++++++++++++++++++++----
 net/batman-adv/main.c       |    5 +----
 net/batman-adv/routing.c    |   22 ++++++++++------------
 net/batman-adv/routing.h    |    4 +++-
 net/batman-adv/types.h      |    3 ---
 5 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index cd8f473..e0aaf8c 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1155,13 +1155,18 @@ out:
 	orig_node_free_ref(orig_node);
 }
 
-static void bat_iv_ogm_receive(struct hard_iface *if_incoming,
-			       struct sk_buff *skb)
+static int bat_iv_ogm_receive(struct sk_buff *skb,
+			      struct hard_iface *if_incoming)
 {
 	struct batman_ogm_packet *batman_ogm_packet;
 	struct ethhdr *ethhdr;
 	int buff_pos = 0, packet_len;
 	unsigned char *tt_buff, *packet_buff;
+	bool ret;
+
+	ret = check_management_packet(skb, if_incoming, BATMAN_OGM_HLEN);
+	if (!ret)
+		return NET_RX_DROP;
 
 	packet_len = skb_headlen(skb);
 	ethhdr = (struct ethhdr *)skb_mac_header(skb);
@@ -1187,6 +1192,9 @@ static void bat_iv_ogm_receive(struct hard_iface *if_incoming,
 						(packet_buff + buff_pos);
 	} while (bat_iv_ogm_aggr_packet(buff_pos, packet_len,
 					batman_ogm_packet->tt_num_changes));
+
+	kfree_skb(skb);
+	return NET_RX_SUCCESS;
 }
 
 static struct bat_algo_ops batman_iv __read_mostly = {
@@ -1197,10 +1205,25 @@ static struct bat_algo_ops batman_iv __read_mostly = {
 	.bat_ogm_update_mac = bat_iv_ogm_update_mac,
 	.bat_ogm_schedule = bat_iv_ogm_schedule,
 	.bat_ogm_emit = bat_iv_ogm_emit,
-	.bat_ogm_receive = bat_iv_ogm_receive,
 };
 
 int __init bat_iv_init(void)
 {
-	return bat_algo_register(&batman_iv);
+	int ret;
+
+	/* batman originator packet */
+	ret = recv_handler_register(BAT_IV_OGM, bat_iv_ogm_receive);
+	if (ret < 0)
+		goto out;
+
+	ret = bat_algo_register(&batman_iv);
+	if (ret < 0)
+		goto handler_unregister;
+
+	goto out;
+
+handler_unregister:
+	recv_handler_unregister(BAT_IV_OGM);
+out:
+	return ret;
 }
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index d19b935..f80c447 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -266,8 +266,6 @@ static void recv_handler_init(void)
 	for (i = 0; i < ARRAY_SIZE(recv_packet_handler); i++)
 		recv_packet_handler[i] = recv_unhandled_packet;
 
-	/* batman originator packet */
-	recv_packet_handler[BAT_IV_OGM] = recv_bat_ogm_packet;
 	/* batman icmp packet */
 	recv_packet_handler[BAT_ICMP] = recv_icmp_packet;
 	/* unicast packet */
@@ -334,8 +332,7 @@ int bat_algo_register(struct bat_algo_ops *bat_algo_ops)
 	    !bat_algo_ops->bat_primary_iface_set ||
 	    !bat_algo_ops->bat_ogm_update_mac ||
 	    !bat_algo_ops->bat_ogm_schedule ||
-	    !bat_algo_ops->bat_ogm_emit ||
-	    !bat_algo_ops->bat_ogm_receive) {
+	    !bat_algo_ops->bat_ogm_emit) {
 		pr_info("Routing algo '%s' does not implement required ops\n",
 			bat_algo_ops->name);
 		goto out;
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index ff56086..7ed9d8f 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -248,37 +248,35 @@ int window_protected(struct bat_priv *bat_priv, int32_t seq_num_diff,
 	return 0;
 }
 
-int recv_bat_ogm_packet(struct sk_buff *skb, struct hard_iface *hard_iface)
+bool check_management_packet(struct sk_buff *skb,
+			     struct hard_iface *hard_iface,
+			     int header_len)
 {
-	struct bat_priv *bat_priv = netdev_priv(hard_iface->soft_iface);
 	struct ethhdr *ethhdr;
 
 	/* drop packet if it has not necessary minimum size */
-	if (unlikely(!pskb_may_pull(skb, BATMAN_OGM_HLEN)))
-		return NET_RX_DROP;
+	if (unlikely(!pskb_may_pull(skb, header_len)))
+		return false;
 
 	ethhdr = (struct ethhdr *)skb_mac_header(skb);
 
 	/* packet with broadcast indication but unicast recipient */
 	if (!is_broadcast_ether_addr(ethhdr->h_dest))
-		return NET_RX_DROP;
+		return false;
 
 	/* packet with broadcast sender address */
 	if (is_broadcast_ether_addr(ethhdr->h_source))
-		return NET_RX_DROP;
+		return false;
 
 	/* create a copy of the skb, if needed, to modify it. */
 	if (skb_cow(skb, 0) < 0)
-		return NET_RX_DROP;
+		return false;
 
 	/* keep skb linear */
 	if (skb_linearize(skb) < 0)
-		return NET_RX_DROP;
+		return false;
 
-	bat_priv->bat_algo_ops->bat_ogm_receive(hard_iface, skb);
-
-	kfree_skb(skb);
-	return NET_RX_SUCCESS;
+	return true;
 }
 
 static int recv_my_icmp_packet(struct bat_priv *bat_priv,
diff --git a/net/batman-adv/routing.h b/net/batman-adv/routing.h
index 3d729cb..d6bbbeb 100644
--- a/net/batman-adv/routing.h
+++ b/net/batman-adv/routing.h
@@ -23,6 +23,9 @@
 #define _NET_BATMAN_ADV_ROUTING_H_
 
 void slide_own_bcast_window(struct hard_iface *hard_iface);
+bool check_management_packet(struct sk_buff *skb,
+			     struct hard_iface *hard_iface,
+			     int header_len);
 void update_route(struct bat_priv *bat_priv, struct orig_node *orig_node,
 		  struct neigh_node *neigh_node);
 int recv_icmp_packet(struct sk_buff *skb, struct hard_iface *recv_if);
@@ -30,7 +33,6 @@ int recv_unicast_packet(struct sk_buff *skb, struct hard_iface *recv_if);
 int recv_ucast_frag_packet(struct sk_buff *skb, struct hard_iface *recv_if);
 int recv_bcast_packet(struct sk_buff *skb, struct hard_iface *recv_if);
 int recv_vis_packet(struct sk_buff *skb, struct hard_iface *recv_if);
-int recv_bat_ogm_packet(struct sk_buff *skb, struct hard_iface *recv_if);
 int recv_tt_query(struct sk_buff *skb, struct hard_iface *recv_if);
 int recv_roam_adv(struct sk_buff *skb, struct hard_iface *recv_if);
 struct neigh_node *find_router(struct bat_priv *bat_priv,
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 2f4848b..50e1895 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -390,9 +390,6 @@ struct bat_algo_ops {
 				 int tt_num_changes);
 	/* send scheduled OGM */
 	void (*bat_ogm_emit)(struct forw_packet *forw_packet);
-	/* receive incoming OGM */
-	void (*bat_ogm_receive)(struct hard_iface *if_incoming,
-				struct sk_buff *skb);
 };
 
 #endif /* _NET_BATMAN_ADV_TYPES_H_ */
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 05/14] batman-adv: rename last_valid to last_seen
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |    8 ++++----
 net/batman-adv/originator.c |   16 ++++++++--------
 net/batman-adv/types.h      |    8 ++++----
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index e0aaf8c..8652a75 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -651,7 +651,7 @@ static void bat_iv_ogm_orig_update(struct bat_priv *bat_priv,
 	rcu_read_unlock();
 
 	orig_node->flags = batman_ogm_packet->flags;
-	neigh_node->last_valid = jiffies;
+	neigh_node->last_seen = jiffies;
 
 	spin_lock_bh(&neigh_node->tq_lock);
 	ring_buffer_set(neigh_node->tq_recv,
@@ -772,11 +772,11 @@ static int bat_iv_ogm_calc_tq(struct orig_node *orig_node,
 	if (!neigh_node)
 		goto out;
 
-	/* if orig_node is direct neighbor update neigh_node last_valid */
+	/* if orig_node is direct neighbor update neigh_node last_seen */
 	if (orig_node == orig_neigh_node)
-		neigh_node->last_valid = jiffies;
+		neigh_node->last_seen = jiffies;
 
-	orig_node->last_valid = jiffies;
+	orig_node->last_seen = jiffies;
 
 	/* find packet count of corresponding one hop neighbor */
 	spin_lock_bh(&orig_node->ogm_cnt_lock);
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index ce49698..21c1f83 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
@@ -283,7 +283,7 @@ static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 	hlist_for_each_entry_safe(neigh_node, node, node_tmp,
 				  &orig_node->neigh_list, list) {
 
-		if ((has_timed_out(neigh_node->last_valid, PURGE_TIMEOUT)) ||
+		if ((has_timed_out(neigh_node->last_seen, PURGE_TIMEOUT)) ||
 		    (neigh_node->if_incoming->if_status == IF_INACTIVE) ||
 		    (neigh_node->if_incoming->if_status == IF_NOT_IN_USE) ||
 		    (neigh_node->if_incoming->if_status == IF_TO_BE_REMOVED)) {
@@ -300,9 +300,9 @@ static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 					neigh_node->if_incoming->net_dev->name);
 			else
 				bat_dbg(DBG_BATMAN, bat_priv,
-					"neighbor timeout: originator %pM, neighbor: %pM, last_valid: %lu\n",
+					"neighbor timeout: originator %pM, neighbor: %pM, last_seen: %lu\n",
 					orig_node->orig, neigh_node->addr,
-					(neigh_node->last_valid / HZ));
+					(neigh_node->last_seen / HZ));
 
 			neigh_purged = true;
 
@@ -325,10 +325,10 @@ static bool purge_orig_node(struct bat_priv *bat_priv,
 {
 	struct neigh_node *best_neigh_node;
 
-	if (has_timed_out(orig_node->last_valid, 2 * PURGE_TIMEOUT)) {
+	if (has_timed_out(orig_node->last_seen, 2 * PURGE_TIMEOUT)) {
 		bat_dbg(DBG_BATMAN, bat_priv,
-			"Originator timeout: originator %pM, last_valid %lu\n",
-			orig_node->orig, (orig_node->last_valid / HZ));
+			"Originator timeout: originator %pM, last_seen %lu\n",
+			orig_node->orig, (orig_node->last_seen / HZ));
 		return true;
 	} else {
 		if (purge_orig_neighbors(bat_priv, orig_node,
@@ -446,9 +446,9 @@ int orig_seq_print_text(struct seq_file *seq, void *offset)
 				goto next;
 
 			last_seen_secs = jiffies_to_msecs(jiffies -
-						orig_node->last_valid) / 1000;
+						orig_node->last_seen) / 1000;
 			last_seen_msecs = jiffies_to_msecs(jiffies -
-						orig_node->last_valid) % 1000;
+						orig_node->last_seen) % 1000;
 
 			seq_printf(seq, "%pM %4i.%03is   (%3i) %pM [%10s]:",
 				   orig_node->orig, last_seen_secs,
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 50e1895..9fa8b73 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -52,7 +52,7 @@ struct hard_iface {
 /**
  *	orig_node - structure for orig_list maintaining nodes of mesh
  *	@primary_addr: hosts primary interface address
- *	@last_valid: when last packet from this node was received
+ *	@last_seen: when last packet from this node was received
  *	@bcast_seqno_reset: time when the broadcast seqno window was reset
  *	@batman_seqno_reset: time when the batman seqno window was reset
  *	@gw_flags: flags related to gateway class
@@ -70,7 +70,7 @@ struct orig_node {
 	struct neigh_node __rcu *router; /* rcu protected pointer */
 	unsigned long *bcast_own;
 	uint8_t *bcast_own_sum;
-	unsigned long last_valid;
+	unsigned long last_seen;
 	unsigned long bcast_seqno_reset;
 	unsigned long batman_seqno_reset;
 	uint8_t gw_flags;
@@ -120,7 +120,7 @@ struct gw_node {
 
 /**
  *	neigh_node
- *	@last_valid: when last packet via this neighbor was received
+ *	@last_seen: when last packet via this neighbor was received
  */
 struct neigh_node {
 	struct hlist_node list;
@@ -131,7 +131,7 @@ struct neigh_node {
 	uint8_t tq_avg;
 	uint8_t last_ttl;
 	struct list_head bonding_list;
-	unsigned long last_valid;
+	unsigned long last_seen;
 	DECLARE_BITMAP(real_bits, TQ_LOCAL_WINDOW_SIZE);
 	atomic_t refcount;
 	struct rcu_head rcu;
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 06/14] batman-adv: replace HZ calculations with jiffies_to_msecs()
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_debugfs.c |    4 ++--
 net/batman-adv/originator.c  |   15 ++++++++++-----
 net/batman-adv/send.c        |    2 +-
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/net/batman-adv/bat_debugfs.c b/net/batman-adv/bat_debugfs.c
index 916380c..3b588f8 100644
--- a/net/batman-adv/bat_debugfs.c
+++ b/net/batman-adv/bat_debugfs.c
@@ -83,8 +83,8 @@ int debug_log(struct bat_priv *bat_priv, const char *fmt, ...)
 
 	va_start(args, fmt);
 	vscnprintf(tmp_log_buf, sizeof(tmp_log_buf), fmt, args);
-	fdebug_log(bat_priv->debug_log, "[%10lu] %s",
-		   (jiffies / HZ), tmp_log_buf);
+	fdebug_log(bat_priv->debug_log, "[%10u] %s",
+		   jiffies_to_msecs(jiffies), tmp_log_buf);
 	va_end(args);
 
 	return 0;
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index 21c1f83..962636b 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
@@ -35,7 +35,8 @@ static void purge_orig(struct work_struct *work);
 static void start_purge_timer(struct bat_priv *bat_priv)
 {
 	INIT_DELAYED_WORK(&bat_priv->orig_work, purge_orig);
-	queue_delayed_work(bat_event_workqueue, &bat_priv->orig_work, 1 * HZ);
+	queue_delayed_work(bat_event_workqueue,
+			   &bat_priv->orig_work, msecs_to_jiffies(1000));
 }
 
 /* returns 1 if they are the same originator */
@@ -274,6 +275,7 @@ static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 	struct hlist_node *node, *node_tmp;
 	struct neigh_node *neigh_node;
 	bool neigh_purged = false;
+	unsigned long last_seen;
 
 	*best_neigh_node = NULL;
 
@@ -288,6 +290,8 @@ static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 		    (neigh_node->if_incoming->if_status == IF_NOT_IN_USE) ||
 		    (neigh_node->if_incoming->if_status == IF_TO_BE_REMOVED)) {
 
+			last_seen = neigh_node->last_seen;
+
 			if ((neigh_node->if_incoming->if_status ==
 								IF_INACTIVE) ||
 			    (neigh_node->if_incoming->if_status ==
@@ -300,9 +304,9 @@ static bool purge_orig_neighbors(struct bat_priv *bat_priv,
 					neigh_node->if_incoming->net_dev->name);
 			else
 				bat_dbg(DBG_BATMAN, bat_priv,
-					"neighbor timeout: originator %pM, neighbor: %pM, last_seen: %lu\n",
+					"neighbor timeout: originator %pM, neighbor: %pM, last_seen: %u\n",
 					orig_node->orig, neigh_node->addr,
-					(neigh_node->last_seen / HZ));
+					jiffies_to_msecs(last_seen));
 
 			neigh_purged = true;
 
@@ -327,8 +331,9 @@ static bool purge_orig_node(struct bat_priv *bat_priv,
 
 	if (has_timed_out(orig_node->last_seen, 2 * PURGE_TIMEOUT)) {
 		bat_dbg(DBG_BATMAN, bat_priv,
-			"Originator timeout: originator %pM, last_seen %lu\n",
-			orig_node->orig, (orig_node->last_seen / HZ));
+			"Originator timeout: originator %pM, last_seen %u\n",
+			orig_node->orig,
+			jiffies_to_msecs(orig_node->last_seen));
 		return true;
 	} else {
 		if (purge_orig_neighbors(bat_priv, orig_node,
diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 7c66b61..8e74d97 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -292,7 +292,7 @@ static void send_outstanding_bcast_packet(struct work_struct *work)
 	/* if we still have some more bcasts to send */
 	if (forw_packet->num_packets < 3) {
 		_add_bcast_packet_to_list(bat_priv, forw_packet,
-					  ((5 * HZ) / 1000));
+					  msecs_to_jiffies(5));
 		return;
 	}
 
-- 
1.7.9.4

^ permalink raw reply related

* pull request: batman-adv 2012-05-09
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r

Dear David,

here is a new set of changes I would like to propose for next-next/linux-3.5.
Changes proposed here are the same of the last unsuccessful pull request, issued
on 2012-05-01, plus some other minor fixes.

The patchset is based on your net-next tree.
Please, let me know if there is any problem.


Sorry again for the touble I caused last time, but it was really a mistake, it
was not my intention to let you pull an entire wrong tree.



Thank you very much,
		Antonio


The following changes since commit 9bb862beb6e5839e92f709d33fda07678f062f20:

  Merge branch 'master' of git://1984.lsi.us.es/net-next (2012-05-08 14:40:21 -0400)

are available in the git repository at:


  git://git.open-mesh.org/linux-merge.git tags/batman-adv-for-davem

for you to fetch changes up to ab74e43c05964efe75472b5433ede8800ecbfc43:

  batman-adv: add contributor name (2012-05-09 09:54:55 +0200)

----------------------------------------------------------------
Included changes:

* fix a little bug in the DHCP packet snooping introduced so far
* minor fixes and cleanups
* minor routing protocol API cleanups
* add a new contributor name to translation-table.{c,h}
* update copyright years in file headers

----------------------------------------------------------------
Antonio Quartulli (3):
      batman-adv: fix wrong dhcp option list browsing
      batman-adv: update copyright years
      batman-adv: add contributor name

Linus Luessing (1):
      batman-adv: Adding hard_iface specific sysfs wrapper macros for UINT

Marek Lindner (10):
      batman-adv: introduce is_single_hop_neigh variable to increase readability
      batman-adv: introduce packet type handler array for incoming packets
      batman-adv: register batman ogm receive function during protocol init
      batman-adv: rename last_valid to last_seen
      batman-adv: replace HZ calculations with jiffies_to_msecs()
      batman-adv: split neigh_new function into generic and batman iv specific parts
      batman-adv: ignore protocol packets if the interface did not enable this protocol
      batman-adv: refactoring API: find generalized name for bat_ogm_update_mac callback
      batman-adv: rename sysfs macros to reflect the soft-interface dependency
      batman-adv: fix checkpatch string complaint

 net/batman-adv/bat_debugfs.c           |    4 +-
 net/batman-adv/bat_iv_ogm.c            |  176 +++++++++++++++++++++-----------
 net/batman-adv/bat_sysfs.c             |  100 +++++++++++++-----
 net/batman-adv/bridge_loop_avoidance.c |    2 +-
 net/batman-adv/bridge_loop_avoidance.h |    2 +-
 net/batman-adv/gateway_client.c        |    6 +-
 net/batman-adv/hard-interface.c        |  117 +--------------------
 net/batman-adv/main.c                  |  124 +++++++++++++++++++++-
 net/batman-adv/main.h                  |    6 ++
 net/batman-adv/originator.c            |   50 +++++----
 net/batman-adv/originator.h            |    6 +-
 net/batman-adv/packet.h                |    1 +
 net/batman-adv/routing.c               |   22 ++--
 net/batman-adv/routing.h               |    4 +-
 net/batman-adv/send.c                  |    2 +-
 net/batman-adv/translation-table.c     |    2 +-
 net/batman-adv/translation-table.h     |    2 +-
 net/batman-adv/types.h                 |   16 ++-
 18 files changed, 377 insertions(+), 265 deletions(-)

^ permalink raw reply

* [PATCH 07/14] batman-adv: split neigh_new function into generic and batman iv specific parts
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |   40 ++++++++++++++++++++++++++++++++++------
 net/batman-adv/originator.c |   27 ++++++++++-----------------
 net/batman-adv/originator.h |    6 ++----
 3 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 8652a75..4baabf9 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -30,6 +30,32 @@
 #include "send.h"
 #include "bat_algo.h"
 
+static struct neigh_node *bat_iv_ogm_neigh_new(struct hard_iface *hard_iface,
+					       const uint8_t *neigh_addr,
+					       struct orig_node *orig_node,
+					       struct orig_node *orig_neigh,
+					       uint32_t seqno)
+{
+	struct neigh_node *neigh_node;
+
+	neigh_node = neigh_node_new(hard_iface, neigh_addr, seqno);
+	if (!neigh_node)
+		goto out;
+
+	INIT_LIST_HEAD(&neigh_node->bonding_list);
+	spin_lock_init(&neigh_node->tq_lock);
+
+	neigh_node->orig_node = orig_neigh;
+	neigh_node->if_incoming = hard_iface;
+
+	spin_lock_bh(&orig_node->neigh_list_lock);
+	hlist_add_head_rcu(&neigh_node->list, &orig_node->neigh_list);
+	spin_unlock_bh(&orig_node->neigh_list_lock);
+
+out:
+	return neigh_node;
+}
+
 static int bat_iv_ogm_iface_enable(struct hard_iface *hard_iface)
 {
 	struct batman_ogm_packet *batman_ogm_packet;
@@ -638,8 +664,9 @@ static void bat_iv_ogm_orig_update(struct bat_priv *bat_priv,
 		if (!orig_tmp)
 			goto unlock;
 
-		neigh_node = create_neighbor(orig_node, orig_tmp,
-					     ethhdr->h_source, if_incoming);
+		neigh_node = bat_iv_ogm_neigh_new(if_incoming, ethhdr->h_source,
+						  orig_node, orig_tmp,
+						  batman_ogm_packet->seqno);
 
 		orig_node_free_ref(orig_tmp);
 		if (!neigh_node)
@@ -764,10 +791,11 @@ static int bat_iv_ogm_calc_tq(struct orig_node *orig_node,
 	rcu_read_unlock();
 
 	if (!neigh_node)
-		neigh_node = create_neighbor(orig_neigh_node,
-					     orig_neigh_node,
-					     orig_neigh_node->orig,
-					     if_incoming);
+		neigh_node = bat_iv_ogm_neigh_new(if_incoming,
+						  orig_neigh_node->orig,
+						  orig_neigh_node,
+						  orig_neigh_node,
+						  batman_ogm_packet->seqno);
 
 	if (!neigh_node)
 		goto out;
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index 962636b..1bcc9c5 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
@@ -85,35 +85,28 @@ struct neigh_node *orig_node_get_router(struct orig_node *orig_node)
 	return router;
 }
 
-struct neigh_node *create_neighbor(struct orig_node *orig_node,
-				   struct orig_node *orig_neigh_node,
-				   const uint8_t *neigh,
-				   struct hard_iface *if_incoming)
+struct neigh_node *neigh_node_new(struct hard_iface *hard_iface,
+				  const uint8_t *neigh_addr, uint32_t seqno)
 {
-	struct bat_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
+	struct bat_priv *bat_priv = netdev_priv(hard_iface->soft_iface);
 	struct neigh_node *neigh_node;
 
-	bat_dbg(DBG_BATMAN, bat_priv,
-		"Creating new last-hop neighbor of originator\n");
-
 	neigh_node = kzalloc(sizeof(*neigh_node), GFP_ATOMIC);
 	if (!neigh_node)
-		return NULL;
+		goto out;
 
 	INIT_HLIST_NODE(&neigh_node->list);
-	INIT_LIST_HEAD(&neigh_node->bonding_list);
-	spin_lock_init(&neigh_node->tq_lock);
 
-	memcpy(neigh_node->addr, neigh, ETH_ALEN);
-	neigh_node->orig_node = orig_neigh_node;
-	neigh_node->if_incoming = if_incoming;
+	memcpy(neigh_node->addr, neigh_addr, ETH_ALEN);
 
 	/* extra reference for return */
 	atomic_set(&neigh_node->refcount, 2);
 
-	spin_lock_bh(&orig_node->neigh_list_lock);
-	hlist_add_head_rcu(&neigh_node->list, &orig_node->neigh_list);
-	spin_unlock_bh(&orig_node->neigh_list_lock);
+	bat_dbg(DBG_BATMAN, bat_priv,
+		"Creating new neighbor %pM, initial seqno %d\n",
+		neigh_addr, seqno);
+
+out:
 	return neigh_node;
 }
 
diff --git a/net/batman-adv/originator.h b/net/batman-adv/originator.h
index 3fe2eda..64c5d94 100644
--- a/net/batman-adv/originator.h
+++ b/net/batman-adv/originator.h
@@ -29,10 +29,8 @@ void originator_free(struct bat_priv *bat_priv);
 void purge_orig_ref(struct bat_priv *bat_priv);
 void orig_node_free_ref(struct orig_node *orig_node);
 struct orig_node *get_orig_node(struct bat_priv *bat_priv, const uint8_t *addr);
-struct neigh_node *create_neighbor(struct orig_node *orig_node,
-				   struct orig_node *orig_neigh_node,
-				   const uint8_t *neigh,
-				   struct hard_iface *if_incoming);
+struct neigh_node *neigh_node_new(struct hard_iface *hard_iface,
+				  const uint8_t *neigh_addr, uint32_t seqno);
 void neigh_node_free_ref(struct neigh_node *neigh_node);
 struct neigh_node *orig_node_get_router(struct orig_node *orig_node);
 int orig_seq_print_text(struct seq_file *seq, void *offset);
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 01/14] batman-adv: fix wrong dhcp option list browsing
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
In-Reply-To: <1336561976-16088-1-git-send-email-ordex-GaUfNO9RBHfsrOwW+9ziJQ@public.gmane.org>

In is_type_dhcprequest(), while parsing a DHCP message, if the entry we found in
the option list is neither a padding nor the dhcp-type, we have to ignore it and
jump as many bytes as its length + 1. The "+ 1" byte is given by the subtype
field itself that has to be jumped too.

Reported-by: Marek Lindner <lindner_marek-LWAfsSFWpa4@public.gmane.org>
Signed-off-by: Antonio Quartulli <ordex-GaUfNO9RBHfsrOwW+9ziJQ@public.gmane.org>
---
 net/batman-adv/gateway_client.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index 6f9b9b7..47f7186 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -558,10 +558,10 @@ static bool is_type_dhcprequest(struct sk_buff *skb, int header_len)
 			p++;
 
 			/* ...and then we jump over the data */
-			if (pkt_len < *p)
+			if (pkt_len < 1 + (*p))
 				goto out;
-			pkt_len -= *p;
-			p += (*p);
+			pkt_len -= 1 + (*p);
+			p += 1 + (*p);
 		}
 	}
 out:
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 08/14] batman-adv: ignore protocol packets if the interface did not enable this protocol
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Reported-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 4baabf9..689fc98 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1186,6 +1186,7 @@ out:
 static int bat_iv_ogm_receive(struct sk_buff *skb,
 			      struct hard_iface *if_incoming)
 {
+	struct bat_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
 	struct batman_ogm_packet *batman_ogm_packet;
 	struct ethhdr *ethhdr;
 	int buff_pos = 0, packet_len;
@@ -1196,6 +1197,11 @@ static int bat_iv_ogm_receive(struct sk_buff *skb,
 	if (!ret)
 		return NET_RX_DROP;
 
+	/* did we receive a B.A.T.M.A.N. IV OGM packet on an interface
+	 * that does not have B.A.T.M.A.N. IV enabled ? */
+	if (bat_priv->bat_algo_ops->bat_ogm_emit != bat_iv_ogm_emit)
+		return NET_RX_DROP;
+
 	packet_len = skb_headlen(skb);
 	ethhdr = (struct ethhdr *)skb_mac_header(skb);
 	packet_buff = skb->data;
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 09/14] batman-adv: refactoring API: find generalized name for bat_ogm_update_mac callback
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c     |   24 ++++++++++++------------
 net/batman-adv/hard-interface.c |    4 ++--
 net/batman-adv/main.c           |    2 +-
 net/batman-adv/types.h          |    5 +++--
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 689fc98..a0fe1de 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -93,6 +93,17 @@ static void bat_iv_ogm_iface_disable(struct hard_iface *hard_iface)
 	hard_iface->packet_buff = NULL;
 }
 
+static void bat_iv_ogm_iface_update_mac(struct hard_iface *hard_iface)
+{
+	struct batman_ogm_packet *batman_ogm_packet;
+
+	batman_ogm_packet = (struct batman_ogm_packet *)hard_iface->packet_buff;
+	memcpy(batman_ogm_packet->orig,
+	       hard_iface->net_dev->dev_addr, ETH_ALEN);
+	memcpy(batman_ogm_packet->prev_sender,
+	       hard_iface->net_dev->dev_addr, ETH_ALEN);
+}
+
 static void bat_iv_ogm_primary_iface_set(struct hard_iface *hard_iface)
 {
 	struct batman_ogm_packet *batman_ogm_packet;
@@ -102,17 +113,6 @@ static void bat_iv_ogm_primary_iface_set(struct hard_iface *hard_iface)
 	batman_ogm_packet->header.ttl = TTL;
 }
 
-static void bat_iv_ogm_update_mac(struct hard_iface *hard_iface)
-{
-	struct batman_ogm_packet *batman_ogm_packet;
-
-	batman_ogm_packet = (struct batman_ogm_packet *)hard_iface->packet_buff;
-	memcpy(batman_ogm_packet->orig,
-	       hard_iface->net_dev->dev_addr, ETH_ALEN);
-	memcpy(batman_ogm_packet->prev_sender,
-	       hard_iface->net_dev->dev_addr, ETH_ALEN);
-}
-
 /* when do we schedule our own ogm to be sent */
 static unsigned long bat_iv_ogm_emit_send_time(const struct bat_priv *bat_priv)
 {
@@ -1235,8 +1235,8 @@ static struct bat_algo_ops batman_iv __read_mostly = {
 	.name = "BATMAN IV",
 	.bat_iface_enable = bat_iv_ogm_iface_enable,
 	.bat_iface_disable = bat_iv_ogm_iface_disable,
+	.bat_iface_update_mac = bat_iv_ogm_iface_update_mac,
 	.bat_primary_iface_set = bat_iv_ogm_primary_iface_set,
-	.bat_ogm_update_mac = bat_iv_ogm_update_mac,
 	.bat_ogm_schedule = bat_iv_ogm_schedule,
 	.bat_ogm_emit = bat_iv_ogm_emit,
 };
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index 95f869c..0b84bb1 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -228,7 +228,7 @@ static void hardif_activate_interface(struct hard_iface *hard_iface)
 
 	bat_priv = netdev_priv(hard_iface->soft_iface);
 
-	bat_priv->bat_algo_ops->bat_ogm_update_mac(hard_iface);
+	bat_priv->bat_algo_ops->bat_iface_update_mac(hard_iface);
 	hard_iface->if_status = IF_TO_BE_ACTIVATED;
 
 	/**
@@ -524,7 +524,7 @@ static int hard_if_event(struct notifier_block *this,
 		check_known_mac_addr(hard_iface->net_dev);
 
 		bat_priv = netdev_priv(hard_iface->soft_iface);
-		bat_priv->bat_algo_ops->bat_ogm_update_mac(hard_iface);
+		bat_priv->bat_algo_ops->bat_iface_update_mac(hard_iface);
 
 		primary_if = primary_if_get_selected(bat_priv);
 		if (!primary_if)
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index f80c447..083a299 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -329,8 +329,8 @@ int bat_algo_register(struct bat_algo_ops *bat_algo_ops)
 	/* all algorithms must implement all ops (for now) */
 	if (!bat_algo_ops->bat_iface_enable ||
 	    !bat_algo_ops->bat_iface_disable ||
+	    !bat_algo_ops->bat_iface_update_mac ||
 	    !bat_algo_ops->bat_primary_iface_set ||
-	    !bat_algo_ops->bat_ogm_update_mac ||
 	    !bat_algo_ops->bat_ogm_schedule ||
 	    !bat_algo_ops->bat_ogm_emit) {
 		pr_info("Routing algo '%s' does not implement required ops\n",
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index 9fa8b73..2873a90 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -381,10 +381,11 @@ struct bat_algo_ops {
 	int (*bat_iface_enable)(struct hard_iface *hard_iface);
 	/* de-init routing info when hard-interface is disabled */
 	void (*bat_iface_disable)(struct hard_iface *hard_iface);
+	/* (re-)init mac addresses of the protocol information
+	 * belonging to this hard-interface */
+	void (*bat_iface_update_mac)(struct hard_iface *hard_iface);
 	/* called when primary interface is selected / changed */
 	void (*bat_primary_iface_set)(struct hard_iface *hard_iface);
-	/* init mac addresses of the OGM belonging to this hard-interface */
-	void (*bat_ogm_update_mac)(struct hard_iface *hard_iface);
 	/* prepare a new outgoing OGM for the send queue */
 	void (*bat_ogm_schedule)(struct hard_iface *hard_iface,
 				 int tt_num_changes);
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 03/14] batman-adv: introduce packet type handler array for incoming packets
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r, Marek Lindner
In-Reply-To: <1336561976-16088-1-git-send-email-ordex-GaUfNO9RBHfsrOwW+9ziJQ@public.gmane.org>

From: Marek Lindner <lindner_marek-LWAfsSFWpa4@public.gmane.org>

The packet handler array replaces the growing switch statement, thus
dealing with incoming packets in a more efficient way. It also adds
to possibility to register packet handlers on the fly.

Signed-off-by: Marek Lindner <lindner_marek-LWAfsSFWpa4@public.gmane.org>
Acked-by: Simon Wunderlich <siwu-MaAgPAbsBIVS8oHt8HbXEIQuADTiUCJX@public.gmane.org>
Signed-off-by: Antonio Quartulli <ordex-GaUfNO9RBHfsrOwW+9ziJQ@public.gmane.org>
---
 net/batman-adv/hard-interface.c |  113 ------------------------------------
 net/batman-adv/main.c           |  121 +++++++++++++++++++++++++++++++++++++++
 net/batman-adv/main.h           |    6 ++
 3 files changed, 127 insertions(+), 113 deletions(-)

diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index 47c79d7..95f869c 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -32,12 +32,6 @@
 
 #include <linux/if_arp.h>
 
-
-static int batman_skb_recv(struct sk_buff *skb,
-			   struct net_device *dev,
-			   struct packet_type *ptype,
-			   struct net_device *orig_dev);
-
 void hardif_free_rcu(struct rcu_head *rcu)
 {
 	struct hard_iface *hard_iface;
@@ -551,113 +545,6 @@ out:
 	return NOTIFY_DONE;
 }
 
-/* incoming packets with the batman ethertype received on any active hard
- * interface */
-static int batman_skb_recv(struct sk_buff *skb, struct net_device *dev,
-			   struct packet_type *ptype,
-			   struct net_device *orig_dev)
-{
-	struct bat_priv *bat_priv;
-	struct batman_ogm_packet *batman_ogm_packet;
-	struct hard_iface *hard_iface;
-	int ret;
-
-	hard_iface = container_of(ptype, struct hard_iface, batman_adv_ptype);
-	skb = skb_share_check(skb, GFP_ATOMIC);
-
-	/* skb was released by skb_share_check() */
-	if (!skb)
-		goto err_out;
-
-	/* packet should hold at least type and version */
-	if (unlikely(!pskb_may_pull(skb, 2)))
-		goto err_free;
-
-	/* expect a valid ethernet header here. */
-	if (unlikely(skb->mac_len != ETH_HLEN || !skb_mac_header(skb)))
-		goto err_free;
-
-	if (!hard_iface->soft_iface)
-		goto err_free;
-
-	bat_priv = netdev_priv(hard_iface->soft_iface);
-
-	if (atomic_read(&bat_priv->mesh_state) != MESH_ACTIVE)
-		goto err_free;
-
-	/* discard frames on not active interfaces */
-	if (hard_iface->if_status != IF_ACTIVE)
-		goto err_free;
-
-	batman_ogm_packet = (struct batman_ogm_packet *)skb->data;
-
-	if (batman_ogm_packet->header.version != COMPAT_VERSION) {
-		bat_dbg(DBG_BATMAN, bat_priv,
-			"Drop packet: incompatible batman version (%i)\n",
-			batman_ogm_packet->header.version);
-		goto err_free;
-	}
-
-	/* all receive handlers return whether they received or reused
-	 * the supplied skb. if not, we have to free the skb. */
-
-	switch (batman_ogm_packet->header.packet_type) {
-		/* batman originator packet */
-	case BAT_IV_OGM:
-		ret = recv_bat_ogm_packet(skb, hard_iface);
-		break;
-
-		/* batman icmp packet */
-	case BAT_ICMP:
-		ret = recv_icmp_packet(skb, hard_iface);
-		break;
-
-		/* unicast packet */
-	case BAT_UNICAST:
-		ret = recv_unicast_packet(skb, hard_iface);
-		break;
-
-		/* fragmented unicast packet */
-	case BAT_UNICAST_FRAG:
-		ret = recv_ucast_frag_packet(skb, hard_iface);
-		break;
-
-		/* broadcast packet */
-	case BAT_BCAST:
-		ret = recv_bcast_packet(skb, hard_iface);
-		break;
-
-		/* vis packet */
-	case BAT_VIS:
-		ret = recv_vis_packet(skb, hard_iface);
-		break;
-		/* Translation table query (request or response) */
-	case BAT_TT_QUERY:
-		ret = recv_tt_query(skb, hard_iface);
-		break;
-		/* Roaming advertisement */
-	case BAT_ROAM_ADV:
-		ret = recv_roam_adv(skb, hard_iface);
-		break;
-	default:
-		ret = NET_RX_DROP;
-	}
-
-	if (ret == NET_RX_DROP)
-		kfree_skb(skb);
-
-	/* return NET_RX_SUCCESS in any case as we
-	 * most probably dropped the packet for
-	 * routing-logical reasons. */
-
-	return NET_RX_SUCCESS;
-
-err_free:
-	kfree_skb(skb);
-err_out:
-	return NET_RX_DROP;
-}
-
 /* This function returns true if the interface represented by ifindex is a
  * 802.11 wireless device */
 bool is_wifi_iface(int ifindex)
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 7913272..d19b935 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -39,6 +39,7 @@
 /* List manipulations on hardif_list have to be rtnl_lock()'ed,
  * list traversals just rcu-locked */
 struct list_head hardif_list;
+static int (*recv_packet_handler[256])(struct sk_buff *, struct hard_iface *);
 char bat_routing_algo[20] = "BATMAN IV";
 static struct hlist_head bat_algo_list;
 
@@ -46,11 +47,15 @@ unsigned char broadcast_addr[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
 
 struct workqueue_struct *bat_event_workqueue;
 
+static void recv_handler_init(void);
+
 static int __init batman_init(void)
 {
 	INIT_LIST_HEAD(&hardif_list);
 	INIT_HLIST_HEAD(&bat_algo_list);
 
+	recv_handler_init();
+
 	bat_iv_init();
 
 	/* the name should not be longer than 10 chars - see
@@ -179,6 +184,122 @@ int is_my_mac(const uint8_t *addr)
 	return 0;
 }
 
+static int recv_unhandled_packet(struct sk_buff *skb,
+				 struct hard_iface *recv_if)
+{
+	return NET_RX_DROP;
+}
+
+/* incoming packets with the batman ethertype received on any active hard
+ * interface
+ */
+int batman_skb_recv(struct sk_buff *skb, struct net_device *dev,
+		    struct packet_type *ptype, struct net_device *orig_dev)
+{
+	struct bat_priv *bat_priv;
+	struct batman_ogm_packet *batman_ogm_packet;
+	struct hard_iface *hard_iface;
+	uint8_t idx;
+	int ret;
+
+	hard_iface = container_of(ptype, struct hard_iface, batman_adv_ptype);
+	skb = skb_share_check(skb, GFP_ATOMIC);
+
+	/* skb was released by skb_share_check() */
+	if (!skb)
+		goto err_out;
+
+	/* packet should hold at least type and version */
+	if (unlikely(!pskb_may_pull(skb, 2)))
+		goto err_free;
+
+	/* expect a valid ethernet header here. */
+	if (unlikely(skb->mac_len != ETH_HLEN || !skb_mac_header(skb)))
+		goto err_free;
+
+	if (!hard_iface->soft_iface)
+		goto err_free;
+
+	bat_priv = netdev_priv(hard_iface->soft_iface);
+
+	if (atomic_read(&bat_priv->mesh_state) != MESH_ACTIVE)
+		goto err_free;
+
+	/* discard frames on not active interfaces */
+	if (hard_iface->if_status != IF_ACTIVE)
+		goto err_free;
+
+	batman_ogm_packet = (struct batman_ogm_packet *)skb->data;
+
+	if (batman_ogm_packet->header.version != COMPAT_VERSION) {
+		bat_dbg(DBG_BATMAN, bat_priv,
+			"Drop packet: incompatible batman version (%i)\n",
+			batman_ogm_packet->header.version);
+		goto err_free;
+	}
+
+	/* all receive handlers return whether they received or reused
+	 * the supplied skb. if not, we have to free the skb.
+	 */
+	idx = batman_ogm_packet->header.packet_type;
+	ret = (*recv_packet_handler[idx])(skb, hard_iface);
+
+	if (ret == NET_RX_DROP)
+		kfree_skb(skb);
+
+	/* return NET_RX_SUCCESS in any case as we
+	 * most probably dropped the packet for
+	 * routing-logical reasons.
+	 */
+	return NET_RX_SUCCESS;
+
+err_free:
+	kfree_skb(skb);
+err_out:
+	return NET_RX_DROP;
+}
+
+static void recv_handler_init(void)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(recv_packet_handler); i++)
+		recv_packet_handler[i] = recv_unhandled_packet;
+
+	/* batman originator packet */
+	recv_packet_handler[BAT_IV_OGM] = recv_bat_ogm_packet;
+	/* batman icmp packet */
+	recv_packet_handler[BAT_ICMP] = recv_icmp_packet;
+	/* unicast packet */
+	recv_packet_handler[BAT_UNICAST] = recv_unicast_packet;
+	/* fragmented unicast packet */
+	recv_packet_handler[BAT_UNICAST_FRAG] = recv_ucast_frag_packet;
+	/* broadcast packet */
+	recv_packet_handler[BAT_BCAST] = recv_bcast_packet;
+	/* vis packet */
+	recv_packet_handler[BAT_VIS] = recv_vis_packet;
+	/* Translation table query (request or response) */
+	recv_packet_handler[BAT_TT_QUERY] = recv_tt_query;
+	/* Roaming advertisement */
+	recv_packet_handler[BAT_ROAM_ADV] = recv_roam_adv;
+}
+
+int recv_handler_register(uint8_t packet_type,
+			  int (*recv_handler)(struct sk_buff *,
+					      struct hard_iface *))
+{
+	if (recv_packet_handler[packet_type] != &recv_unhandled_packet)
+		return -EBUSY;
+
+	recv_packet_handler[packet_type] = recv_handler;
+	return 0;
+}
+
+void recv_handler_unregister(uint8_t packet_type)
+{
+	recv_packet_handler[packet_type] = recv_unhandled_packet;
+}
+
 static struct bat_algo_ops *bat_algo_get(char *name)
 {
 	struct bat_algo_ops *bat_algo_ops = NULL, *bat_algo_ops_tmp;
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index d9832ac..fd83acd 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -155,6 +155,12 @@ void mesh_free(struct net_device *soft_iface);
 void inc_module_count(void);
 void dec_module_count(void);
 int is_my_mac(const uint8_t *addr);
+int batman_skb_recv(struct sk_buff *skb, struct net_device *dev,
+		    struct packet_type *ptype, struct net_device *orig_dev);
+int recv_handler_register(uint8_t packet_type,
+			  int (*recv_handler)(struct sk_buff *,
+					      struct hard_iface *));
+void recv_handler_unregister(uint8_t packet_type);
 int bat_algo_register(struct bat_algo_ops *bat_algo_ops);
 int bat_algo_select(struct bat_priv *bat_priv, char *name);
 int bat_algo_seq_print_text(struct seq_file *seq, void *offset);
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 10/14] batman-adv: rename sysfs macros to reflect the soft-interface dependency
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_sysfs.c |   57 ++++++++++++++++++++++----------------------
 1 file changed, 29 insertions(+), 28 deletions(-)

diff --git a/net/batman-adv/bat_sysfs.c b/net/batman-adv/bat_sysfs.c
index 2c81688..913299d 100644
--- a/net/batman-adv/bat_sysfs.c
+++ b/net/batman-adv/bat_sysfs.c
@@ -63,7 +63,7 @@ struct bat_attribute bat_attr_##_name = {	\
 	.store  = _store,			\
 };
 
-#define BAT_ATTR_STORE_BOOL(_name, _post_func)				\
+#define BAT_ATTR_SIF_STORE_BOOL(_name, _post_func)			\
 ssize_t store_##_name(struct kobject *kobj, struct attribute *attr,	\
 		      char *buff, size_t count)				\
 {									\
@@ -73,9 +73,9 @@ ssize_t store_##_name(struct kobject *kobj, struct attribute *attr,	\
 				 &bat_priv->_name, net_dev);		\
 }
 
-#define BAT_ATTR_SHOW_BOOL(_name)					\
-ssize_t show_##_name(struct kobject *kobj, struct attribute *attr,	\
-			    char *buff)					\
+#define BAT_ATTR_SIF_SHOW_BOOL(_name)					\
+ssize_t show_##_name(struct kobject *kobj,				\
+		     struct attribute *attr, char *buff)		\
 {									\
 	struct bat_priv *bat_priv = kobj_to_batpriv(kobj);		\
 	return sprintf(buff, "%s\n",					\
@@ -83,16 +83,17 @@ ssize_t show_##_name(struct kobject *kobj, struct attribute *attr,	\
 		       "disabled" : "enabled");				\
 }									\
 
-/* Use this, if you are going to turn a [name] in bat_priv on or off */
-#define BAT_ATTR_BOOL(_name, _mode, _post_func)				\
-	static BAT_ATTR_STORE_BOOL(_name, _post_func)			\
-	static BAT_ATTR_SHOW_BOOL(_name)				\
+/* Use this, if you are going to turn a [name] in the soft-interface
+ * (bat_priv) on or off */
+#define BAT_ATTR_SIF_BOOL(_name, _mode, _post_func)			\
+	static BAT_ATTR_SIF_STORE_BOOL(_name, _post_func)		\
+	static BAT_ATTR_SIF_SHOW_BOOL(_name)				\
 	static BAT_ATTR(_name, _mode, show_##_name, store_##_name)
 
 
-#define BAT_ATTR_STORE_UINT(_name, _min, _max, _post_func)		\
+#define BAT_ATTR_SIF_STORE_UINT(_name, _min, _max, _post_func)		\
 ssize_t store_##_name(struct kobject *kobj, struct attribute *attr,	\
-			     char *buff, size_t count)			\
+		      char *buff, size_t count)				\
 {									\
 	struct net_device *net_dev = kobj_to_netdev(kobj);		\
 	struct bat_priv *bat_priv = netdev_priv(net_dev);		\
@@ -100,19 +101,19 @@ ssize_t store_##_name(struct kobject *kobj, struct attribute *attr,	\
 				 attr, &bat_priv->_name, net_dev);	\
 }
 
-#define BAT_ATTR_SHOW_UINT(_name)					\
-ssize_t show_##_name(struct kobject *kobj, struct attribute *attr,	\
-			    char *buff)					\
+#define BAT_ATTR_SIF_SHOW_UINT(_name)					\
+ssize_t show_##_name(struct kobject *kobj,				\
+		     struct attribute *attr, char *buff)		\
 {									\
 	struct bat_priv *bat_priv = kobj_to_batpriv(kobj);		\
 	return sprintf(buff, "%i\n", atomic_read(&bat_priv->_name));	\
 }									\
 
-/* Use this, if you are going to set [name] in bat_priv to unsigned integer
- * values only */
-#define BAT_ATTR_UINT(_name, _mode, _min, _max, _post_func)		\
-	static BAT_ATTR_STORE_UINT(_name, _min, _max, _post_func)	\
-	static BAT_ATTR_SHOW_UINT(_name)				\
+/* Use this, if you are going to set [name] in the soft-interface
+ * (bat_priv) to an unsigned integer value */
+#define BAT_ATTR_SIF_UINT(_name, _mode, _min, _max, _post_func)		\
+	static BAT_ATTR_SIF_STORE_UINT(_name, _min, _max, _post_func)	\
+	static BAT_ATTR_SIF_SHOW_UINT(_name)				\
 	static BAT_ATTR(_name, _mode, show_##_name, store_##_name)
 
 
@@ -384,24 +385,24 @@ static ssize_t store_gw_bwidth(struct kobject *kobj, struct attribute *attr,
 	return gw_bandwidth_set(net_dev, buff, count);
 }
 
-BAT_ATTR_BOOL(aggregated_ogms, S_IRUGO | S_IWUSR, NULL);
-BAT_ATTR_BOOL(bonding, S_IRUGO | S_IWUSR, NULL);
+BAT_ATTR_SIF_BOOL(aggregated_ogms, S_IRUGO | S_IWUSR, NULL);
+BAT_ATTR_SIF_BOOL(bonding, S_IRUGO | S_IWUSR, NULL);
 #ifdef CONFIG_BATMAN_ADV_BLA
-BAT_ATTR_BOOL(bridge_loop_avoidance, S_IRUGO | S_IWUSR, NULL);
+BAT_ATTR_SIF_BOOL(bridge_loop_avoidance, S_IRUGO | S_IWUSR, NULL);
 #endif
-BAT_ATTR_BOOL(fragmentation, S_IRUGO | S_IWUSR, update_min_mtu);
-BAT_ATTR_BOOL(ap_isolation, S_IRUGO | S_IWUSR, NULL);
+BAT_ATTR_SIF_BOOL(fragmentation, S_IRUGO | S_IWUSR, update_min_mtu);
+BAT_ATTR_SIF_BOOL(ap_isolation, S_IRUGO | S_IWUSR, NULL);
 static BAT_ATTR(vis_mode, S_IRUGO | S_IWUSR, show_vis_mode, store_vis_mode);
 static BAT_ATTR(routing_algo, S_IRUGO, show_bat_algo, NULL);
 static BAT_ATTR(gw_mode, S_IRUGO | S_IWUSR, show_gw_mode, store_gw_mode);
-BAT_ATTR_UINT(orig_interval, S_IRUGO | S_IWUSR, 2 * JITTER, INT_MAX, NULL);
-BAT_ATTR_UINT(hop_penalty, S_IRUGO | S_IWUSR, 0, TQ_MAX_VALUE, NULL);
-BAT_ATTR_UINT(gw_sel_class, S_IRUGO | S_IWUSR, 1, TQ_MAX_VALUE,
-	      post_gw_deselect);
+BAT_ATTR_SIF_UINT(orig_interval, S_IRUGO | S_IWUSR, 2 * JITTER, INT_MAX, NULL);
+BAT_ATTR_SIF_UINT(hop_penalty, S_IRUGO | S_IWUSR, 0, TQ_MAX_VALUE, NULL);
+BAT_ATTR_SIF_UINT(gw_sel_class, S_IRUGO | S_IWUSR, 1, TQ_MAX_VALUE,
+		  post_gw_deselect);
 static BAT_ATTR(gw_bandwidth, S_IRUGO | S_IWUSR, show_gw_bwidth,
 		store_gw_bwidth);
 #ifdef CONFIG_BATMAN_ADV_DEBUG
-BAT_ATTR_UINT(log_level, S_IRUGO | S_IWUSR, 0, 15, NULL);
+BAT_ATTR_SIF_UINT(log_level, S_IRUGO | S_IWUSR, 0, 15, NULL);
 #endif
 
 static struct bat_attribute *mesh_attrs[] = {
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 11/14] batman-adv: Adding hard_iface specific sysfs wrapper macros for UINT
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Linus Luessing, Marek Lindner,
	Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Linus Luessing <linus.luessing@web.de>

This allows us to easily add a sysfs parameter for an unsigned int
later, which is not for a batman mesh interface (e.g. bat0), but for a
common interface instead. It allows reading and writing an atomic_t in
hard_iface (instead of bat_priv compared to the mesh variant).

Developed by Linus during a 6 months trainee study period in Ascom
(Switzerland) AG.

Signed-off-by: Linus Luessing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |   61 +++++++++++++++++++++++--------------------
 net/batman-adv/bat_sysfs.c  |   43 ++++++++++++++++++++++++++++++
 net/batman-adv/packet.h     |    1 +
 3 files changed, 76 insertions(+), 29 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index a0fe1de..cc160c0 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -507,11 +507,10 @@ static void bat_iv_ogm_forward(struct orig_node *orig_node,
 			       const struct ethhdr *ethhdr,
 			       struct batman_ogm_packet *batman_ogm_packet,
 			       bool is_single_hop_neigh,
+			       bool is_from_best_next_hop,
 			       struct hard_iface *if_incoming)
 {
 	struct bat_priv *bat_priv = netdev_priv(if_incoming->soft_iface);
-	struct neigh_node *router;
-	uint8_t in_tq, in_ttl, tq_avg = 0;
 	uint8_t tt_num_changes;
 
 	if (batman_ogm_packet->header.ttl <= 1) {
@@ -519,41 +518,31 @@ static void bat_iv_ogm_forward(struct orig_node *orig_node,
 		return;
 	}
 
-	router = orig_node_get_router(orig_node);
+	if (!is_from_best_next_hop) {
+		/**
+		* Mark the forwarded packet when it is not coming from our best
+		* next hop. We still need to forward the packet for our neighbor
+		* link quality detection to work in case the packet originated
+		* from a single hop neighbor. Otherwise we can simply drop the
+		* ogm.
+		*/
+		if (is_single_hop_neigh)
+			batman_ogm_packet->flags |= NOT_BEST_NEXT_HOP;
+		else
+			return;
+	}
 
-	in_tq = batman_ogm_packet->tq;
-	in_ttl = batman_ogm_packet->header.ttl;
 	tt_num_changes = batman_ogm_packet->tt_num_changes;
 
 	batman_ogm_packet->header.ttl--;
 	memcpy(batman_ogm_packet->prev_sender, ethhdr->h_source, ETH_ALEN);
 
-	/* rebroadcast tq of our best ranking neighbor to ensure the rebroadcast
-	 * of our best tq value */
-	if (router && router->tq_avg != 0) {
-
-		/* rebroadcast ogm of best ranking neighbor as is */
-		if (!compare_eth(router->addr, ethhdr->h_source)) {
-			batman_ogm_packet->tq = router->tq_avg;
-
-			if (router->last_ttl)
-				batman_ogm_packet->header.ttl =
-					router->last_ttl - 1;
-		}
-
-		tq_avg = router->tq_avg;
-	}
-
-	if (router)
-		neigh_node_free_ref(router);
-
 	/* apply hop penalty */
 	batman_ogm_packet->tq = hop_penalty(batman_ogm_packet->tq, bat_priv);
 
 	bat_dbg(DBG_BATMAN, bat_priv,
-		"Forwarding packet: tq_orig: %i, tq_avg: %i, tq_forw: %i, ttl_orig: %i, ttl_forw: %i\n",
-		in_tq, tq_avg, batman_ogm_packet->tq, in_ttl - 1,
-		batman_ogm_packet->header.ttl);
+		"Forwarding packet: tq: %i, ttl: %i\n",
+		batman_ogm_packet->tq, batman_ogm_packet->header.ttl);
 
 	batman_ogm_packet->seqno = htonl(batman_ogm_packet->seqno);
 	batman_ogm_packet->tt_crc = htons(batman_ogm_packet->tt_crc);
@@ -949,6 +938,7 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 	int is_my_addr = 0, is_my_orig = 0, is_my_oldorig = 0;
 	int is_broadcast = 0, is_bidirectional;
 	bool is_single_hop_neigh = false;
+	bool is_from_best_next_hop = false;
 	int is_duplicate;
 	uint32_t if_incoming_seqno;
 
@@ -1070,6 +1060,13 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 		return;
 	}
 
+	if (batman_ogm_packet->flags & NOT_BEST_NEXT_HOP) {
+		bat_dbg(DBG_BATMAN, bat_priv,
+			"Drop packet: ignoring all packets not forwarded from "
+			"the best next hop (sender: %pM)\n", ethhdr->h_source);
+		return;
+	}
+
 	orig_node = get_orig_node(bat_priv, batman_ogm_packet->orig);
 	if (!orig_node)
 		return;
@@ -1094,6 +1091,10 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 	if (router)
 		router_router = orig_node_get_router(router->orig_node);
 
+	if ((router && router->tq_avg != 0) &&
+	    (compare_eth(router->addr, ethhdr->h_source)))
+		is_from_best_next_hop = true;
+
 	/* avoid temporary routing loops */
 	if (router && router_router &&
 	    (compare_eth(router->addr, batman_ogm_packet->prev_sender)) &&
@@ -1144,7 +1145,8 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 
 		/* mark direct link on incoming interface */
 		bat_iv_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-				   is_single_hop_neigh, if_incoming);
+				   is_single_hop_neigh, is_from_best_next_hop,
+				   if_incoming);
 
 		bat_dbg(DBG_BATMAN, bat_priv,
 			"Forwarding packet: rebroadcast neighbor packet with direct link flag\n");
@@ -1167,7 +1169,8 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 	bat_dbg(DBG_BATMAN, bat_priv,
 		"Forwarding packet: rebroadcast originator packet\n");
 	bat_iv_ogm_forward(orig_node, ethhdr, batman_ogm_packet,
-			   is_single_hop_neigh, if_incoming);
+			   is_single_hop_neigh, is_from_best_next_hop,
+			   if_incoming);
 
 out_neigh:
 	if ((orig_neigh_node) && (!is_single_hop_neigh))
diff --git a/net/batman-adv/bat_sysfs.c b/net/batman-adv/bat_sysfs.c
index 913299d..5bc7b66 100644
--- a/net/batman-adv/bat_sysfs.c
+++ b/net/batman-adv/bat_sysfs.c
@@ -117,6 +117,49 @@ ssize_t show_##_name(struct kobject *kobj,				\
 	static BAT_ATTR(_name, _mode, show_##_name, store_##_name)
 
 
+#define BAT_ATTR_HIF_STORE_UINT(_name, _min, _max, _post_func)		\
+ssize_t store_##_name(struct kobject *kobj, struct attribute *attr,	\
+		      char *buff, size_t count)				\
+{									\
+	struct net_device *net_dev = kobj_to_netdev(kobj);		\
+	struct hard_iface *hard_iface = hardif_get_by_netdev(net_dev);	\
+	ssize_t length;							\
+									\
+	if (!hard_iface)						\
+		return 0;						\
+									\
+	length = __store_uint_attr(buff, count, _min, _max, _post_func,	\
+				   attr, &hard_iface->_name, net_dev);	\
+									\
+	hardif_free_ref(hard_iface);					\
+	return length;							\
+}
+
+#define BAT_ATTR_HIF_SHOW_UINT(_name)					\
+ssize_t show_##_name(struct kobject *kobj,				\
+		     struct attribute *attr, char *buff)		\
+{									\
+	struct net_device *net_dev = kobj_to_netdev(kobj);		\
+	struct hard_iface *hard_iface = hardif_get_by_netdev(net_dev);	\
+	ssize_t length;							\
+									\
+	if (!hard_iface)						\
+		return 0;						\
+									\
+	length = sprintf(buff, "%i\n", atomic_read(&hard_iface->_name));\
+									\
+	hardif_free_ref(hard_iface);					\
+	return length;							\
+}
+
+/* Use this, if you are going to set [name] in hard_iface to an
+ * unsigned integer value*/
+#define BAT_ATTR_HIF_UINT(_name, _mode, _min, _max, _post_func)		\
+	static BAT_ATTR_HIF_STORE_UINT(_name, _min, _max, _post_func)	\
+	static BAT_ATTR_HIF_SHOW_UINT(_name)				\
+	static BAT_ATTR(_name, _mode, show_##_name, store_##_name)
+
+
 static int store_bool_attr(char *buff, size_t count,
 			   struct net_device *net_dev,
 			   const char *attr_name, atomic_t *attr)
diff --git a/net/batman-adv/packet.h b/net/batman-adv/packet.h
index f54969c..0ee1af7 100644
--- a/net/batman-adv/packet.h
+++ b/net/batman-adv/packet.h
@@ -39,6 +39,7 @@ enum bat_packettype {
 #define COMPAT_VERSION 14
 
 enum batman_iv_flags {
+	NOT_BEST_NEXT_HOP   = 1 << 3,
 	PRIMARIES_FIRST_HOP = 1 << 4,
 	VIS_SERVER	    = 1 << 5,
 	DIRECTLINK	    = 1 << 6
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 12/14] batman-adv: fix checkpatch string complaint
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Marek Lindner, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

From: Marek Lindner <lindner_marek@yahoo.de>

Regression introduced by: f76d019194e0a88c57371df169ecc979690a04c2

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bat_iv_ogm.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index cc160c0..dd5b667 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1062,8 +1062,8 @@ static void bat_iv_ogm_process(const struct ethhdr *ethhdr,
 
 	if (batman_ogm_packet->flags & NOT_BEST_NEXT_HOP) {
 		bat_dbg(DBG_BATMAN, bat_priv,
-			"Drop packet: ignoring all packets not forwarded from "
-			"the best next hop (sender: %pM)\n", ethhdr->h_source);
+			"Drop packet: ignoring all packets not forwarded from the best next hop (sender: %pM)\n",
+			ethhdr->h_source);
 		return;
 	}
 
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 13/14] batman-adv: update copyright years
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

update copyright years in order to include 2012

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/bridge_loop_avoidance.c |    2 +-
 net/batman-adv/bridge_loop_avoidance.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index ad394c6..8bf9751 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2011 B.A.T.M.A.N. contributors:
+ * Copyright (C) 2011-2012 B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich
  *
diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h
index 4a8e4fc..e39f93a 100644
--- a/net/batman-adv/bridge_loop_avoidance.h
+++ b/net/batman-adv/bridge_loop_avoidance.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (C) 2011 B.A.T.M.A.N. contributors:
+ * Copyright (C) 2011-2012 B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich
  *
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 14/14] batman-adv: add contributor name
From: Antonio Quartulli @ 2012-05-09 11:12 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Antonio Quartulli
In-Reply-To: <1336561976-16088-1-git-send-email-ordex@autistici.org>

translation_table.{c,h} have been heavily modified by another contributor and
for legal purposes it is better to include his name into the contributor list

Signed-off-by: Antonio Quartulli <ordex@autistici.org>
---
 net/batman-adv/translation-table.c |    2 +-
 net/batman-adv/translation-table.h |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index a38d315..2cb46f0 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -1,7 +1,7 @@
 /*
  * Copyright (C) 2007-2012 B.A.T.M.A.N. contributors:
  *
- * Marek Lindner, Simon Wunderlich
+ * Marek Lindner, Simon Wunderlich, Antonio Quartulli
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of version 2 of the GNU General Public
diff --git a/net/batman-adv/translation-table.h b/net/batman-adv/translation-table.h
index bfebe26..593d1b3 100644
--- a/net/batman-adv/translation-table.h
+++ b/net/batman-adv/translation-table.h
@@ -1,7 +1,7 @@
 /*
  * Copyright (C) 2007-2012 B.A.T.M.A.N. contributors:
  *
- * Marek Lindner, Simon Wunderlich
+ * Marek Lindner, Simon Wunderlich, Antonio Quartulli
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of version 2 of the GNU General Public
-- 
1.7.9.4

^ permalink raw reply related

* [PATCH 1/5] netfilter: ip6_tables: add flags parameter to ipv6_find_hdr()
From: pablo @ 2012-05-09 11:33 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1336563188-6720-1-git-send-email-pablo@netfilter.org>

From: Hans Schillstrom <hans.schillstrom@ericsson.com>

This patch adds the flags parameter to ipv6_find_hdr. This flags
allows us to:

* know if this is a fragment.
* stop at the AH header, so the information contained in that header
  can be used for some specific packet handling.

This patch also adds the offset parameter for inspection of one
inner IPv6 header that is contained in error messages.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter_ipv6/ip6_tables.h |    7 +++++-
 net/ipv6/netfilter/ip6_tables.c           |   36 +++++++++++++++++++++++++----
 net/ipv6/netfilter/ip6t_ah.c              |    4 ++--
 net/ipv6/netfilter/ip6t_frag.c            |    4 ++--
 net/ipv6/netfilter/ip6t_hbh.c             |    4 ++--
 net/ipv6/netfilter/ip6t_rt.c              |    4 ++--
 net/netfilter/xt_TPROXY.c                 |    4 ++--
 net/netfilter/xt_socket.c                 |    4 ++--
 8 files changed, 49 insertions(+), 18 deletions(-)

diff --git a/include/linux/netfilter_ipv6/ip6_tables.h b/include/linux/netfilter_ipv6/ip6_tables.h
index 1bc898b..08c2cbb 100644
--- a/include/linux/netfilter_ipv6/ip6_tables.h
+++ b/include/linux/netfilter_ipv6/ip6_tables.h
@@ -298,9 +298,14 @@ ip6t_ext_hdr(u8 nexthdr)
 	       (nexthdr == IPPROTO_DSTOPTS);
 }
 
+enum {
+	IP6T_FH_F_FRAG	= (1 << 0),
+	IP6T_FH_F_AUTH	= (1 << 1),
+};
+
 /* find specified header and get offset to it */
 extern int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
-			 int target, unsigned short *fragoff);
+			 int target, unsigned short *fragoff, int *fragflg);
 
 #ifdef CONFIG_COMPAT
 #include <net/compat.h>
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index d4e350f..308bdd6 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -133,7 +133,7 @@ ip6_packet_match(const struct sk_buff *skb,
 		int protohdr;
 		unsigned short _frag_off;
 
-		protohdr = ipv6_find_hdr(skb, protoff, -1, &_frag_off);
+		protohdr = ipv6_find_hdr(skb, protoff, -1, &_frag_off, NULL);
 		if (protohdr < 0) {
 			if (_frag_off == 0)
 				*hotdrop = true;
@@ -362,6 +362,7 @@ ip6t_do_table(struct sk_buff *skb,
 		const struct xt_entry_match *ematch;
 
 		IP_NF_ASSERT(e);
+		acpar.thoff = 0;
 		if (!ip6_packet_match(skb, indev, outdev, &e->ipv6,
 		    &acpar.thoff, &acpar.fragoff, &acpar.hotdrop)) {
  no_match:
@@ -2278,6 +2279,10 @@ static void __exit ip6_tables_fini(void)
  * if target < 0. "last header" is transport protocol header, ESP, or
  * "No next header".
  *
+ * Note that *offset is used as input/output parameter. an if it is not zero,
+ * then it must be a valid offset to an inner IPv6 header. This can be used
+ * to explore inner IPv6 header, eg. ICMPv6 error messages.
+ *
  * If target header is found, its offset is set in *offset and return protocol
  * number. Otherwise, return -1.
  *
@@ -2289,17 +2294,33 @@ static void __exit ip6_tables_fini(void)
  * *offset is meaningless and fragment offset is stored in *fragoff if fragoff
  * isn't NULL.
  *
+ * if flags is not NULL and it's a fragment, then the frag flag IP6T_FH_F_FRAG
+ * will be set. If it's an AH header, the IP6T_FH_F_AUTH flag is set and
+ * target < 0, then this function will stop at the AH header.
  */
 int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
-		  int target, unsigned short *fragoff)
+		  int target, unsigned short *fragoff, int *flags)
 {
 	unsigned int start = skb_network_offset(skb) + sizeof(struct ipv6hdr);
 	u8 nexthdr = ipv6_hdr(skb)->nexthdr;
-	unsigned int len = skb->len - start;
+	unsigned int len;
 
 	if (fragoff)
 		*fragoff = 0;
 
+	if (*offset) {
+		struct ipv6hdr _ip6, *ip6;
+
+		ip6 = skb_header_pointer(skb, *offset, sizeof(_ip6), &_ip6);
+		if (!ip6 || (ip6->version != 6)) {
+			printk(KERN_ERR "IPv6 header not found\n");
+			return -EBADMSG;
+		}
+		start = *offset + sizeof(struct ipv6hdr);
+		nexthdr = ip6->nexthdr;
+	}
+	len = skb->len - start;
+
 	while (nexthdr != target) {
 		struct ipv6_opt_hdr _hdr, *hp;
 		unsigned int hdrlen;
@@ -2316,6 +2337,9 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
 		if (nexthdr == NEXTHDR_FRAGMENT) {
 			unsigned short _frag_off;
 			__be16 *fp;
+
+			if (flags)	/* Indicate that this is a fragment */
+				*flags |= IP6T_FH_F_FRAG;
 			fp = skb_header_pointer(skb,
 						start+offsetof(struct frag_hdr,
 							       frag_off),
@@ -2336,9 +2360,11 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset,
 				return -ENOENT;
 			}
 			hdrlen = 8;
-		} else if (nexthdr == NEXTHDR_AUTH)
+		} else if (nexthdr == NEXTHDR_AUTH) {
+			if (flags && (*flags & IP6T_FH_F_AUTH) && (target < 0))
+				break;
 			hdrlen = (hp->hdrlen + 2) << 2;
-		else
+		} else
 			hdrlen = ipv6_optlen(hp);
 
 		nexthdr = hp->nexthdr;
diff --git a/net/ipv6/netfilter/ip6t_ah.c b/net/ipv6/netfilter/ip6t_ah.c
index 89cccc5..04099ab 100644
--- a/net/ipv6/netfilter/ip6t_ah.c
+++ b/net/ipv6/netfilter/ip6t_ah.c
@@ -41,11 +41,11 @@ static bool ah_mt6(const struct sk_buff *skb, struct xt_action_param *par)
 	struct ip_auth_hdr _ah;
 	const struct ip_auth_hdr *ah;
 	const struct ip6t_ah *ahinfo = par->matchinfo;
-	unsigned int ptr;
+	unsigned int ptr = 0;
 	unsigned int hdrlen = 0;
 	int err;
 
-	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_AUTH, NULL);
+	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_AUTH, NULL, NULL);
 	if (err < 0) {
 		if (err != -ENOENT)
 			par->hotdrop = true;
diff --git a/net/ipv6/netfilter/ip6t_frag.c b/net/ipv6/netfilter/ip6t_frag.c
index eda898f..3b5735e 100644
--- a/net/ipv6/netfilter/ip6t_frag.c
+++ b/net/ipv6/netfilter/ip6t_frag.c
@@ -40,10 +40,10 @@ frag_mt6(const struct sk_buff *skb, struct xt_action_param *par)
 	struct frag_hdr _frag;
 	const struct frag_hdr *fh;
 	const struct ip6t_frag *fraginfo = par->matchinfo;
-	unsigned int ptr;
+	unsigned int ptr = 0;
 	int err;
 
-	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_FRAGMENT, NULL);
+	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_FRAGMENT, NULL, NULL);
 	if (err < 0) {
 		if (err != -ENOENT)
 			par->hotdrop = true;
diff --git a/net/ipv6/netfilter/ip6t_hbh.c b/net/ipv6/netfilter/ip6t_hbh.c
index 59df051..01df142 100644
--- a/net/ipv6/netfilter/ip6t_hbh.c
+++ b/net/ipv6/netfilter/ip6t_hbh.c
@@ -50,7 +50,7 @@ hbh_mt6(const struct sk_buff *skb, struct xt_action_param *par)
 	const struct ipv6_opt_hdr *oh;
 	const struct ip6t_opts *optinfo = par->matchinfo;
 	unsigned int temp;
-	unsigned int ptr;
+	unsigned int ptr = 0;
 	unsigned int hdrlen = 0;
 	bool ret = false;
 	u8 _opttype;
@@ -62,7 +62,7 @@ hbh_mt6(const struct sk_buff *skb, struct xt_action_param *par)
 
 	err = ipv6_find_hdr(skb, &ptr,
 			    (par->match == &hbh_mt6_reg[0]) ?
-			    NEXTHDR_HOP : NEXTHDR_DEST, NULL);
+			    NEXTHDR_HOP : NEXTHDR_DEST, NULL, NULL);
 	if (err < 0) {
 		if (err != -ENOENT)
 			par->hotdrop = true;
diff --git a/net/ipv6/netfilter/ip6t_rt.c b/net/ipv6/netfilter/ip6t_rt.c
index d8488c5..2c99b94 100644
--- a/net/ipv6/netfilter/ip6t_rt.c
+++ b/net/ipv6/netfilter/ip6t_rt.c
@@ -42,14 +42,14 @@ static bool rt_mt6(const struct sk_buff *skb, struct xt_action_param *par)
 	const struct ipv6_rt_hdr *rh;
 	const struct ip6t_rt *rtinfo = par->matchinfo;
 	unsigned int temp;
-	unsigned int ptr;
+	unsigned int ptr = 0;
 	unsigned int hdrlen = 0;
 	bool ret = false;
 	struct in6_addr _addr;
 	const struct in6_addr *ap;
 	int err;
 
-	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_ROUTING, NULL);
+	err = ipv6_find_hdr(skb, &ptr, NEXTHDR_ROUTING, NULL, NULL);
 	if (err < 0) {
 		if (err != -ENOENT)
 			par->hotdrop = true;
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index 35a959a..146033a 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -282,10 +282,10 @@ tproxy_tg6_v1(struct sk_buff *skb, const struct xt_action_param *par)
 	struct sock *sk;
 	const struct in6_addr *laddr;
 	__be16 lport;
-	int thoff;
+	int thoff = 0;
 	int tproto;
 
-	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL);
+	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
 	if (tproto < 0) {
 		pr_debug("unable to find transport header in IPv6 packet, dropping\n");
 		return NF_DROP;
diff --git a/net/netfilter/xt_socket.c b/net/netfilter/xt_socket.c
index 72bb07f..9ea482d 100644
--- a/net/netfilter/xt_socket.c
+++ b/net/netfilter/xt_socket.c
@@ -263,10 +263,10 @@ socket_mt6_v1(const struct sk_buff *skb, struct xt_action_param *par)
 	struct sock *sk;
 	struct in6_addr *daddr, *saddr;
 	__be16 dport, sport;
-	int thoff, tproto;
+	int thoff = 0, tproto;
 	const struct xt_socket_mtinfo1 *info = (struct xt_socket_mtinfo1 *) par->matchinfo;
 
-	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL);
+	tproto = ipv6_find_hdr(skb, &thoff, -1, NULL, NULL);
 	if (tproto < 0) {
 		pr_debug("unable to find transport header in IPv6 packet, dropping\n");
 		return NF_DROP;
-- 
1.7.9.5


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox