Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH net-next] include/linux/skbuff.h: move CONFIG_XFRM check inside the skb_sec_path()
From: Denis Kirjanov @ 2013-09-29 14:07 UTC (permalink / raw)
  To: netdev, davem; +Cc: Denis Kirjanov

And thus we have only one function definition

Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
---
 include/linux/skbuff.h | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..7399045 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2736,17 +2736,14 @@ extern u16 __skb_tx_hash(const struct net_device *dev,
 			 const struct sk_buff *skb,
 			 unsigned int num_tx_queues);
 
-#ifdef CONFIG_XFRM
 static inline struct sec_path *skb_sec_path(struct sk_buff *skb)
 {
+#ifdef CONFIG_XFRM
 	return skb->sp;
-}
 #else
-static inline struct sec_path *skb_sec_path(struct sk_buff *skb)
-{
 	return NULL;
-}
 #endif
+}
 
 /* Keeps track of mac header offset relative to skb->head.
  * It is useful for TSO of Tunneling protocol. e.g. GRE.
-- 
1.8.0.2

^ permalink raw reply related

* Re: [PATCH v2.40 2/7] odp: Allow VLAN actions after MPLS actions
From: Ben Pfaff @ 2013-09-29 17:05 UTC (permalink / raw)
  To: Joe Stringer
  Cc: Simon Horman, dev@openvswitch.org, netdev, Jesse Gross,
	Pravin B Shelar, Ravi K, Isaku Yamahata
In-Reply-To: <CAOftzPg_FJoyS0oS8Ua7cpMaCh=BboWwyn8kVVoXGT9X9HQ0hg@mail.gmail.com>

On Sun, Sep 29, 2013 at 02:51:19PM +1300, Joe Stringer wrote:
> On Sat, Sep 28, 2013 at 7:21 AM, Ben Pfaff <blp@nicira.com> wrote:
> 
> > On Fri, Sep 27, 2013 at 09:18:31AM +0900, Simon Horman wrote:
> > > From: Joe Stringer <joe@wand.net.nz>
> > >
> > > OpenFlow 1.2 and 1.3 differ on their handling of MPLS actions in the
> > > presence of VLAN tags. To allow correct behaviour to be committed in
> > > each situation, this patch adds a second round of VLAN tag action
> > > handling to commit_odp_actions(), which occurs after MPLS actions. This
> > > is implemented with a new field in 'struct xlate_in' called 'vlan_tci'.
> > >
> > > When an push_mpls action is composed, the flow's current VLAN state is
> > > stored into xin->vlan_tci, and flow->vlan_tci is set to 0 (pop_vlan). If
> > > a VLAN tag is present, it is stripped; if not, then there is no change.
> > > Any later modifications to the VLAN state is written to xin->vlan_tci.
> > > When committing the actions, flow->vlan_tci is used before MPLS actions,
> > > and xin->vlan_tci is used afterwards. This retains the current datapath
> > > behaviour, but allows VLAN actions to be applied in a more flexible
> > > manner.
> > >
> > > Signed-off-by: Joe Stringer <joe@wand.net.nz>
> > > Signed-off-by: Simon Horman <horms@verge.net.au>
> >
> > The commit message talks about handling OpenFlow 1.2 and 1.3 both
> > properly, but I don't see how the code distinguishes between the cases.
> > Can you explain?  Maybe this is something added in a later patch, in
> > which case it would be nice to mention that in the commit message.
> >
> 
> It is added in patch #5. IIRC this patch should cause no difference in
> behaviour, but set up infrastructure to be used later.
> 
>  There seems to be a typo in the comment in vlan_tci_restore() here:
> > > +    /* If MPLS actions were executed after MPLS, copy the final
> > vlan_tci out
> > > +     * and restore the intermediate VLAN state. */
> >
> 
> Right, that should probably be "...executed after VLAN actions...".
> 
> I was a little confused by the new local variable 'vlan_tci' in
> > do_xlate_actions().  Partly this was because the fact that I didn't find
> > it obvious that sometimes it points to different VLAN tags.  I think
> > this would be easier to see if it were initially assigned just under the
> > big comment rather than in an initializer, so that one would know to
> > look at the comment.
> >
> 
> Perhaps the big comment could be rearranged and put above the initialiser,
> something like the following:-
> 
> /* VLAN actions are stored to '*vlan_tci'. This variable initially points
> to 'xin->flow->vlan_tci', so that
>  * VLAN actions are applied before any MPLS actions. When an MPLS action is
> translated,
>  * 'vlan_tci' is updated to point to 'xin->vlan_tci'. This causes later
> VLAN actions to be applied after MPLS actions.
>  * Each time through the loop, 'xin->vlan_tci' is updated to reflect the
> final VLAN state of the flow. */
> 
> Then, the place where 'xin->vlan_tci' is updated to '*vlan_tci' could have
> a simple comment to refer back:-
> 
> /* Update the final vlan state to match the current state. */

All that makes sense, thanks.

^ permalink raw reply

* Re: [PATCH] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Oussama Ghorbel @ 2013-09-29 16:33 UTC (permalink / raw)
  To: Oussama Ghorbel, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel,
	Oussama Ghorbel
In-Reply-To: <20130929154546.GA10771@order.stressinduktion.org>

On Sun, Sep 29, 2013 at 4:45 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Sun, Sep 29, 2013 at 10:40:11AM +0100, Oussama Ghorbel wrote:
>> On Fri, Sep 27, 2013 at 6:03 PM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>> > Ok, let's go with one function per protocol type. Seems easier.
>> >
>> > It seems to get more hairy, because it depends on the tunnel driver if the
>> > prepended ip header is accounted in hard_header_len. :/
>> >
>> > I don't know if it works out cleanly. Otherwise I would be ok if the checks
>> > just get repeated in ip6_tunnel and leave the rest as-is.
>> >
>> Yes, It will be the clean way to do it.
>
> Fine. :)
>
>> >
>> > Linux currently cannot create "jumbograms" (only the receiving side
>> > is supported).
>> >
>> I understand, but what are the benefit from this limit or the harm
>> from not specifying it?
>> Please check this comment from eth.c
>>
>> /**
>>  * eth_change_mtu - set new MTU size
>>  * @dev: network device
>>  * @new_mtu: new Maximum Transfer Unit
>>  *
>>  * Allow changing MTU size. Needs to be overridden for devices
>>  * supporting jumbo frames.
>>  */
>> int eth_change_mtu(struct net_device *dev, int new_mtu)
>
> Hmm, I cannot judge without the full patch. Will it be applicable
> to all net_devices or just ethernet ones? The name could be a bit
> misleading. Remindes me a lot of dev_set_mtu based on the signature, btw.

Normally to all net_devices, otherwise it could get complicated to
check for every dev separately ...
But, never mind, the comment below solve the issue

>
>> So wouldn't be a good idea to let our function open for jumbo frames...?
>
> Hm, we can document the fact that the function would needed to be updated in
> that case. But we should not allow to set a mtu which would require jumbograms
> currently.

OK, sounds a good. I will check the mtu against the limit
IPV6_MAXPLEN, and document the jumbo restriction ...

>
> Greetings,
>
>   Hannes
>

Regards,
Oussama

^ permalink raw reply

* Re: [PATCH] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Hannes Frederic Sowa @ 2013-09-29 15:45 UTC (permalink / raw)
  To: Oussama Ghorbel
  Cc: David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel,
	ou.ghorbel
In-Reply-To: <CA+ev270d2Ztq9i34Lv9U15y1TbH7W-fioXFn-Q_sJd5F4biHuw@mail.gmail.com>

On Sun, Sep 29, 2013 at 10:40:11AM +0100, Oussama Ghorbel wrote:
> On Fri, Sep 27, 2013 at 6:03 PM, Hannes Frederic Sowa
> <hannes@stressinduktion.org> wrote:
> > Ok, let's go with one function per protocol type. Seems easier.
> >
> > It seems to get more hairy, because it depends on the tunnel driver if the
> > prepended ip header is accounted in hard_header_len. :/
> >
> > I don't know if it works out cleanly. Otherwise I would be ok if the checks
> > just get repeated in ip6_tunnel and leave the rest as-is.
> >
> Yes, It will be the clean way to do it.

Fine. :)

> >
> > Linux currently cannot create "jumbograms" (only the receiving side
> > is supported).
> >
> I understand, but what are the benefit from this limit or the harm
> from not specifying it?
> Please check this comment from eth.c
> 
> /**
>  * eth_change_mtu - set new MTU size
>  * @dev: network device
>  * @new_mtu: new Maximum Transfer Unit
>  *
>  * Allow changing MTU size. Needs to be overridden for devices
>  * supporting jumbo frames.
>  */
> int eth_change_mtu(struct net_device *dev, int new_mtu)

Hmm, I cannot judge without the full patch. Will it be applicable
to all net_devices or just ethernet ones? The name could be a bit
misleading. Remindes me a lot of dev_set_mtu based on the signature, btw.

> So wouldn't be a good idea to let our function open for jumbo frames...?

Hm, we can document the fact that the function would needed to be updated in
that case. But we should not allow to set a mtu which would require jumbograms
currently.

Greetings,

  Hannes

^ permalink raw reply

* Re: [PATCH] tcp: TSQ can use a dynamic limit
From: Cong Wang @ 2013-09-29 15:41 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, Wei Liu, Linux Kernel Network Developers,
	Yuchung Cheng, Neal Cardwell
In-Reply-To: <1380277734.30872.25.camel@edumazet-glaptop.roam.corp.google.com>

On Fri, Sep 27, 2013 at 3:28 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Additional change : skb->destructor should be set to tcp_wfree().
>
> A future patch (for linux 3.13+) might remove tcp_limit_output_bytes

Yeah, now it is only used in tcp_xmit_size_goal()...

^ permalink raw reply

* [PATCH] ath6kl: fix compilation warning in ath6kl_htc_pipe_conn_service
From: Vladimir Murzin @ 2013-09-29 13:54 UTC (permalink / raw)
  To: netdev; +Cc: kvalo, linville, Vladimir Murzin

Fix the warning

drivers/net/wireless/ath/ath6kl/htc_pipe.c: In function
'ath6kl_htc_pipe_conn_service':
drivers/net/wireless/ath/ath6kl/htc_pipe.c:1293:26: warning: integer overflow
in expression [-Woverflow]

by giving a hint to compiler about unsigned nature of
HTC_CONN_FLGS_SET_RECV_ALLOC_MASK

Signed-off-by: Vladimir Murzin <murzin.v@gmail.com>
---
 drivers/net/wireless/ath/ath6kl/htc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath6kl/htc.h b/drivers/net/wireless/ath/ath6kl/htc.h
index a2c8ff8..14cab14 100644
--- a/drivers/net/wireless/ath/ath6kl/htc.h
+++ b/drivers/net/wireless/ath/ath6kl/htc.h
@@ -60,7 +60,7 @@
 /* disable credit flow control on a specific service */
 #define HTC_CONN_FLGS_DISABLE_CRED_FLOW_CTRL          (1 << 3)
 #define HTC_CONN_FLGS_SET_RECV_ALLOC_SHIFT    8
-#define HTC_CONN_FLGS_SET_RECV_ALLOC_MASK     0xFF00
+#define HTC_CONN_FLGS_SET_RECV_ALLOC_MASK     0xFF00U
 
 /* connect response status codes */
 #define HTC_SERVICE_SUCCESS      0
-- 
1.8.1.5

^ permalink raw reply related

* [PATCH V2 net-next] fib_trie: avoid a redundant bit judgement in inflate
From: baker.kernel @ 2013-09-29 13:31 UTC (permalink / raw)
  To: davem; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel,
	baker.zhang

From: "baker.zhang" <baker.kernel@gmail.com>

Because 'node' is the i'st child of 'oldnode',
thus, here 'i' equals
tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits)

we just get 1 more bit,
and need not care the detail value of this bits.

Signed-off-by: baker.zhang <baker.kernel@gmail.com>
---
 net/ipv4/fib_trie.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 3df6d3e..45c74ba 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -762,12 +762,9 @@ static struct tnode *inflate(struct trie *t, struct tnode *tn)
 
 		if (IS_LEAF(node) || ((struct tnode *) node)->pos >
 		   tn->pos + tn->bits - 1) {
-			if (tkey_extract_bits(node->key,
-					      oldtnode->pos + oldtnode->bits,
-					      1) == 0)
-				put_child(tn, 2*i, node);
-			else
-				put_child(tn, 2*i+1, node);
+			put_child(tn,
+				tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1),
+				node);
 			continue;
 		}
 
-- 
1.8.1.2

^ permalink raw reply related

* [PATCH net-next] fib_trie: avoid a redundant bit judgement in inflate
From: baker.kernel @ 2013-09-29 13:24 UTC (permalink / raw)
  To: davem; +Cc: kuznet, jmorris, yoshfuji, kaber, netdev, linux-kernel,
	baker.zhang

From: "baker.zhang" <baker.kernel@gmail.com>

Because 'node' is the i'st child of 'oldnode',
thus, here 'i' equals
tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits)

we just get 1 more bit,
and need not care the detail value of this bits.

Signed-off-by: baker.zhang <baker.kernel@gmail.com>
---
 net/ipv4/fib_trie.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 3df6d3e..92a4c7f8 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -762,12 +762,9 @@ static struct tnode *inflate(struct trie *t, struct tnode *tn)
 
 		if (IS_LEAF(node) || ((struct tnode *) node)->pos >
 		   tn->pos + tn->bits - 1) {
-			if (tkey_extract_bits(node->key,
-					      oldtnode->pos + oldtnode->bits,
-					      1) == 0)
-				put_child(tn, 2*i, node);
-			else
-				put_child(tn, 2*i+1, node);
+			put_child(t, tn,
+				tkey_extract_bits(node->key, oldtnode->pos, oldtnode->bits + 1),
+				node);
 			continue;
 		}
 
-- 
1.8.1.2

^ permalink raw reply related

* [PATCH v3] bgmac: add support for Byte Queue Limits
From: Hauke Mehrtens @ 2013-09-29 11:54 UTC (permalink / raw)
  To: davem; +Cc: zajec5, eric.dumazet, netdev, Hauke Mehrtens

This makes it possible to use some more advanced queuing
techniques with this driver.

When multi queue support will be added some changes to Byte Queue
handling is needed.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---

v3: now netdev_sent_queue() is called after the descriptor is written 
to the memory, but before the hardware is told that there is a new 
descriptor, the package can not be dropped by the driver any more and 
netdev_completed_queue() have not been called for this skb, because the 
driver haven't told the hardware about this package yet.

 drivers/net/ethernet/broadcom/bgmac.c |   10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bgmac.c b/drivers/net/ethernet/broadcom/bgmac.c
index 249468f..7eca5a1 100644
--- a/drivers/net/ethernet/broadcom/bgmac.c
+++ b/drivers/net/ethernet/broadcom/bgmac.c
@@ -149,6 +149,8 @@ static netdev_tx_t bgmac_dma_tx_add(struct bgmac *bgmac,
 	dma_desc->ctl0 = cpu_to_le32(ctl0);
 	dma_desc->ctl1 = cpu_to_le32(ctl1);
 
+	netdev_sent_queue(net_dev, skb->len);
+
 	wmb();
 
 	/* Increase ring->end to point empty slot. We tell hardware the first
@@ -178,6 +180,7 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 	struct device *dma_dev = bgmac->core->dma_dev;
 	int empty_slot;
 	bool freed = false;
+	unsigned bytes_compl = 0, pkts_compl = 0;
 
 	/* The last slot that hardware didn't consume yet */
 	empty_slot = bgmac_read(bgmac, ring->mmio_base + BGMAC_DMA_TX_STATUS);
@@ -195,6 +198,9 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 					 slot->skb->len, DMA_TO_DEVICE);
 			slot->dma_addr = 0;
 
+			bytes_compl += slot->skb->len;
+			pkts_compl++;
+
 			/* Free memory! :) */
 			dev_kfree_skb(slot->skb);
 			slot->skb = NULL;
@@ -208,6 +214,8 @@ static void bgmac_dma_tx_free(struct bgmac *bgmac, struct bgmac_dma_ring *ring)
 		freed = true;
 	}
 
+	netdev_completed_queue(bgmac->net_dev, pkts_compl, bytes_compl);
+
 	if (freed && netif_queue_stopped(bgmac->net_dev))
 		netif_wake_queue(bgmac->net_dev);
 }
@@ -988,6 +996,8 @@ static void bgmac_chip_reset(struct bgmac *bgmac)
 	bgmac_miiconfig(bgmac);
 	bgmac_phy_init(bgmac);
 
+	netdev_reset_queue(bgmac->net_dev);
+
 	bgmac->int_status = 0;
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH net-next 4/4] bonding: add rtnl lock and remove read lock for bond sysfs
From: Ding Tianhong @ 2013-09-29 11:45 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Veaceslav Falico, Netdev

The bond_for_each_slave() will not be protected by read_lock(),
only protected by rtnl_lock(), so need to replace read_lock()
with rtnl_lock().

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_sysfs.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index e06c644..2ba1114 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -179,7 +179,9 @@ static ssize_t bonding_show_slaves(struct device *d,
 	struct slave *slave;
 	int res = 0;
 
-	read_lock(&bond->lock);
+	if (!rtnl_trylock())
+		return restart_syscall();
+
 	bond_for_each_slave(bond, slave, iter) {
 		if (res > (PAGE_SIZE - IFNAMSIZ)) {
 			/* not enough space for another interface name */
@@ -190,7 +192,9 @@ static ssize_t bonding_show_slaves(struct device *d,
 		}
 		res += sprintf(buf + res, "%s ", slave->dev->name);
 	}
-	read_unlock(&bond->lock);
+
+	rtnl_unlock();
+
 	if (res)
 		buf[res-1] = '\n'; /* eat the leftover space */
 
@@ -628,6 +632,9 @@ static ssize_t bonding_store_arp_targets(struct device *d,
 	unsigned long *targets_rx;
 	int ind, i, j, ret = -EINVAL;
 
+	if (!rtnl_trylock())
+		return restart_syscall();
+
 	targets = bond->params.arp_targets;
 	newtarget = in_aton(buf + 1);
 	/* look for adds */
@@ -701,6 +708,7 @@ static ssize_t bonding_store_arp_targets(struct device *d,
 
 	ret = count;
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(arp_ip_target, S_IRUGO | S_IWUSR , bonding_show_arp_targets, bonding_store_arp_targets);
@@ -1469,7 +1477,6 @@ static ssize_t bonding_show_queue_id(struct device *d,
 	if (!rtnl_trylock())
 		return restart_syscall();
 
-	read_lock(&bond->lock);
 	bond_for_each_slave(bond, slave, iter) {
 		if (res > (PAGE_SIZE - IFNAMSIZ - 6)) {
 			/* not enough space for another interface_name:queue_id pair */
@@ -1481,9 +1488,9 @@ static ssize_t bonding_show_queue_id(struct device *d,
 		res += sprintf(buf + res, "%s:%d ",
 			       slave->dev->name, slave->queue_id);
 	}
-	read_unlock(&bond->lock);
 	if (res)
 		buf[res-1] = '\n'; /* eat the leftover space */
+
 	rtnl_unlock();
 
 	return res;
@@ -1532,8 +1539,6 @@ static ssize_t bonding_store_queue_id(struct device *d,
 	if (!sdev)
 		goto err_no_cmd;
 
-	read_lock(&bond->lock);
-
 	/* Search for thes slave and check for duplicate qids */
 	update_slave = NULL;
 	bond_for_each_slave(bond, slave, iter) {
@@ -1544,23 +1549,20 @@ static ssize_t bonding_store_queue_id(struct device *d,
 			 */
 			update_slave = slave;
 		else if (qid && qid == slave->queue_id) {
-			goto err_no_cmd_unlock;
+			goto err_no_cmd;
 		}
 	}
 
 	if (!update_slave)
-		goto err_no_cmd_unlock;
+		goto err_no_cmd;
 
 	/* Actually set the qids for the slave */
 	update_slave->queue_id = qid;
 
-	read_unlock(&bond->lock);
 out:
 	rtnl_unlock();
 	return ret;
 
-err_no_cmd_unlock:
-	read_unlock(&bond->lock);
 err_no_cmd:
 	pr_info("invalid input for queue_id set for %s.\n",
 		bond->dev->name);
@@ -1593,6 +1595,9 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 	struct list_head *iter;
 	struct slave *slave;
 
+	if (!rtnl_trylock())
+		return restart_syscall();
+
 	if (sscanf(buf, "%d", &new_value) != 1) {
 		pr_err("%s: no all_slaves_active value specified.\n",
 		       bond->dev->name);
@@ -1612,7 +1617,6 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 		goto out;
 	}
 
-	read_lock(&bond->lock);
 	bond_for_each_slave(bond, slave, iter) {
 		if (!bond_is_active_slave(slave)) {
 			if (new_value)
@@ -1621,8 +1625,8 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 				slave->inactive = 1;
 		}
 	}
-	read_unlock(&bond->lock);
 out:
+	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
-- 
1.8.2.1

^ permalink raw reply related

* [PATCH net-next 3/4] bonding: replace read_lock to rcu_read_lock for bond_3ad_get_active_agg_info()
From: Ding Tianhong @ 2013-09-29 11:45 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Veaceslav Falico, Netdev

The bond slave list will protected by rcu or rtnl, so the read lock
could not protect list any more, replace it with rcu_read_lock().

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
---
 drivers/net/bonding/bond_3ad.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 971a7dc..686d0fa 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2369,9 +2369,9 @@ int bond_3ad_get_active_agg_info(struct bonding *bond, struct ad_info *ad_info)
 {
 	int ret;
 
-	read_lock(&bond->lock);
+	rcu_read_lock();
 	ret = __bond_3ad_get_active_agg_info(bond, ad_info);
-	read_unlock(&bond->lock);
+	rcu_read_unlock();
 
 	return ret;
 }
-- 
1.8.2.1

^ permalink raw reply related

* [PATCH net-next 2/4] bonding: use RCU protection for alb xmit path
From: Ding Tianhong @ 2013-09-29 11:45 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Veaceslav Falico, Netdev

The commit 278b20837511776dc9d5f6ee1c7fabd5479838bb
(bonding: initial RCU conversion) has convert the roundrobin,
active-backup, broadcast and xor xmit path to rcu protection,
the performance will be better for these mode, so this time,
convert xmit path for alb mode.

Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_alb.c | 23 +++++++++--------------
 drivers/net/bonding/bonding.h  |  2 +-
 2 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index e960418..a1444d5 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -230,7 +230,7 @@ static struct slave *tlb_get_least_loaded_slave(struct bonding *bond)
 	max_gap = LLONG_MIN;
 
 	/* Find the slave with the largest gap */
-	bond_for_each_slave(bond, slave, iter) {
+	bond_for_each_slave_rcu(bond, slave, iter) {
 		if (SLAVE_IS_OK(slave)) {
 			long long gap = compute_gap(slave);
 
@@ -387,7 +387,7 @@ static struct slave *rlb_next_rx_slave(struct bonding *bond)
 	struct list_head *iter;
 	bool found = false;
 
-	bond_for_each_slave(bond, slave, iter) {
+	bond_for_each_slave_rcu(bond, slave, iter) {
 		if (!SLAVE_IS_OK(slave))
 			continue;
 		if (!found) {
@@ -628,12 +628,14 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
 {
 	struct alb_bond_info *bond_info = &(BOND_ALB_INFO(bond));
 	struct arp_pkt *arp = arp_pkt(skb);
-	struct slave *assigned_slave;
+	struct slave *assigned_slave, *curr_active_slave;
 	struct rlb_client_info *client_info;
 	u32 hash_index = 0;
 
 	_lock_rx_hashtbl(bond);
 
+	curr_active_slave = rcu_dereference(bond->curr_active_slave);
+
 	hash_index = _simple_hash((u8 *)&arp->ip_dst, sizeof(arp->ip_dst));
 	client_info = &(bond_info->rx_hashtbl[hash_index]);
 
@@ -658,8 +660,8 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb, struct bonding *bon
 			 * that the new client can be assigned to this entry.
 			 */
 			if (bond->curr_active_slave &&
-			    client_info->slave != bond->curr_active_slave) {
-				client_info->slave = bond->curr_active_slave;
+			    client_info->slave != curr_active_slave) {
+				client_info->slave = curr_active_slave;
 				rlb_update_client(client_info);
 			}
 		}
@@ -1343,11 +1345,6 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 	skb_reset_mac_header(skb);
 	eth_data = eth_hdr(skb);
 
-	/* make sure that the curr_active_slave do not change during tx
-	 */
-	read_lock(&bond->lock);
-	read_lock(&bond->curr_slave_lock);
-
 	switch (ntohs(skb->protocol)) {
 	case ETH_P_IP: {
 		const struct iphdr *iph = ip_hdr(skb);
@@ -1429,12 +1426,12 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 
 	if (!tx_slave) {
 		/* unbalanced or unassigned, send through primary */
-		tx_slave = bond->curr_active_slave;
+		tx_slave = rcu_dereference(bond->curr_active_slave);
 		bond_info->unbalanced_load += skb->len;
 	}
 
 	if (tx_slave && SLAVE_IS_OK(tx_slave)) {
-		if (tx_slave != bond->curr_active_slave) {
+		if (tx_slave != rcu_dereference(bond->curr_active_slave)) {
 			memcpy(eth_data->h_source,
 			       tx_slave->dev->dev_addr,
 			       ETH_ALEN);
@@ -1449,8 +1446,6 @@ int bond_alb_xmit(struct sk_buff *skb, struct net_device *bond_dev)
 		}
 	}
 
-	read_unlock(&bond->curr_slave_lock);
-	read_unlock(&bond->lock);
 	if (res) {
 		/* no suitable interface, frame not sent */
 		kfree_skb(skb);
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 713b6af..1c7d559 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -459,7 +459,7 @@ static inline struct slave *bond_slave_has_mac(struct bonding *bond,
 	struct list_head *iter;
 	struct slave *tmp;
 
-	bond_for_each_slave(bond, tmp, iter)
+	bond_for_each_slave_rcu(bond, tmp, iter)
 		if (ether_addr_equal_64bits(mac, tmp->dev->dev_addr))
 			return tmp;
 
-- 
1.8.2.1

^ permalink raw reply related

* [PATCH net-next 1/4] bonding: use RCU protection for 3ad xmit path
From: Ding Tianhong @ 2013-09-29 11:44 UTC (permalink / raw)
  To: Jay Vosburgh, Andy Gospodarek, David S. Miller,
	Nikolay Aleksandrov, Veaceslav Falico, Netdev

The commit 278b20837511776dc9d5f6ee1c7fabd5479838bb
(bonding: initial RCU conversion) has convert the roundrobin,
active-backup, broadcast and xor xmit path to rcu protection,
the performance will be better for these mode, so this time,
convert xmit path for 3ad mode.

Suggested-by: Nikolay Aleksandrov <nikolay@redhat.com>
Suggested-by: Veaceslav Falico <vfalico@redhat.com>
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Cc: Nikolay Aleksandrov <nikolay@redhat.com>
Cc: Veaceslav Falico <vfalico@redhat.com>
---
 drivers/net/bonding/bond_3ad.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index c62606a..971a7dc 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -2344,7 +2344,7 @@ int __bond_3ad_get_active_agg_info(struct bonding *bond,
 	struct slave *slave;
 	struct port *port;
 
-	bond_for_each_slave(bond, slave, iter) {
+	bond_for_each_slave_rcu(bond, slave, iter) {
 		port = &(SLAVE_AD_INFO(slave).port);
 		if (port->aggregator && port->aggregator->is_active) {
 			aggregator = port->aggregator;
@@ -2388,7 +2388,6 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 	int res = 1;
 	int agg_id;
 
-	read_lock(&bond->lock);
 	if (__bond_3ad_get_active_agg_info(bond, &ad_info)) {
 		pr_debug("%s: Error: __bond_3ad_get_active_agg_info failed\n",
 			 dev->name);
@@ -2406,7 +2405,7 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 	slave_agg_no = bond->xmit_hash_policy(skb, slaves_in_agg);
 	first_ok_slave = NULL;
 
-	bond_for_each_slave(bond, slave, iter) {
+	bond_for_each_slave_rcu(bond, slave, iter) {
 		agg = SLAVE_AD_INFO(slave).port.aggregator;
 		if (!agg || agg->aggregator_identifier != agg_id)
 			continue;
@@ -2436,7 +2435,6 @@ int bond_3ad_xmit_xor(struct sk_buff *skb, struct net_device *dev)
 		res = bond_dev_queue_xmit(bond, skb, first_ok_slave->dev);
 
 out:
-	read_unlock(&bond->lock);
 	if (res) {
 		/* no suitable interface, frame not sent */
 		kfree_skb(skb);
-- 
1.8.2.1

^ permalink raw reply related

* [PATCH]: iproute action: typo nat fix
From: Jamal Hadi Salim @ 2013-09-29 11:39 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org, herbert

[-- Attachment #1: Type: text/plain, Size: 26 bytes --]


attached.

cheers,
jamal

[-- Attachment #2: nat-1 --]
[-- Type: text/plain, Size: 779 bytes --]

commit 012b8c4b3af629039d2bb47ade6e0d3e6980a72b
Author: Jamal Hadi Salim <hadi@mojatatu.com>
Date:   Sun Sep 29 07:35:21 2013 -0400

    If you taketh you giveth.
    I Went the LinuxWay and copied this for m_simple.c and noticed
    this one typo (I wonder where it came from?;->).
    
    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

diff --git a/tc/m_nat.c b/tc/m_nat.c
index 01ec032..d502a81 100644
--- a/tc/m_nat.c
+++ b/tc/m_nat.c
@@ -146,7 +146,7 @@ parse_nat(struct action_util *a, int *argc_p, char ***argv_p, int tca_id, struct
 		if (matches(*argv, "index") == 0) {
 			NEXT_ARG();
 			if (get_u32(&sel.index, *argv, 10)) {
-				fprintf(stderr, "Pedit: Illegal \"index\"\n");
+				fprintf(stderr, "Nat: Illegal \"index\"\n");
 				return -1;
 			}
 			argc--;

^ permalink raw reply related

* [PATCH]: iproute action: Introduce simple action
From: Jamal Hadi Salim @ 2013-09-29 11:33 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev@vger.kernel.org

[-- Attachment #1: Type: text/plain, Size: 286 bytes --]


I had to explain this to someone after they said it wasnt in
iproute2.

Note: For some reason not all kernel headers are copied. You need
at least have linux/tc_act/tc_defact.h for this to compile.
I could send you a patch - but i think this is something you
should do.

cheers,
jamal

[-- Attachment #2: simp-p1 --]
[-- Type: text/plain, Size: 6290 bytes --]

commit ce8f6c550ff714203ca565699e9f332a93704e41
Author: Jamal Hadi Salim <hadi@mojatatu.com>
Date:   Sun Sep 29 07:19:23 2013 -0400

    Simple action is already in the kernel for years now as an
    example. This complements it with user space control.
    
    Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

diff --git a/tc/Makefile b/tc/Makefile
index 1eeabd8..f54a955 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -38,6 +38,7 @@ TCMODULES += m_nat.o
 TCMODULES += m_pedit.o
 TCMODULES += m_skbedit.o
 TCMODULES += m_csum.o
+TCMODULES += m_simple.o
 TCMODULES += p_ip.o
 TCMODULES += p_icmp.o
 TCMODULES += p_tcp.o
diff --git a/tc/m_simple.c b/tc/m_simple.c
new file mode 100644
index 0000000..0224440
--- /dev/null
+++ b/tc/m_simple.c
@@ -0,0 +1,202 @@
+/*
+ * m_simple.c	simple action
+ *
+ *		This program is free software; you can distribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	J Hadi Salim <jhs@mojatatu.com>
+ *
+ * Pedagogical example. Adds a string that will be printed everytime
+ * the simple instance is hit. 
+ * Use this as a skeleton action and keep modifying it to meet your needs.
+ * Look at linux/tc_act/tc_defact.h for the different components ids and
+ * definitions used in  this actions
+ *
+ * example use, yell "Incoming ICMP!" every time you see an incoming ICMP on
+ * eth0. Steps are:
+ * 1) Add an ingress qdisc point to eth0
+ * 2) Start a chain on ingress of eth0 that first matches ICMP then invokes
+ *    the simple action to shout.
+ * 3) display stats and show that no packet has been seen by the action
+ * 4) Send one ping packet to google (expect to receive a response back)
+ * 5) grep the logs to see the logged message
+ * 6) display stats again and observe increment by 1
+ *
+  hadi@noma1:$ tc qdisc add dev eth0 ingress
+  hadi@noma1:$tc filter add dev eth0 parent ffff: protocol ip prio 5 \
+   	 u32 match ip protocol 1 0xff flowid 1:1 action simple "Incoming ICMP"
+  
+  hadi@noma1:$ sudo tc -s filter ls  dev eth0 parent ffff:
+   filter protocol ip pref 5 u32 
+   filter protocol ip pref 5 u32 fh 800: ht divisor 1 
+   filter protocol ip pref 5 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 
+     match 00010000/00ff0000 at 8
+     	action order 1: Simple <Incoming ICMP>
+     	 index 4 ref 1 bind 1 installed 29 sec used 29 sec
+     	 Action statistics:
+     		Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0) 
+     	 	backlog 0b 0p requeues 0 
+  
+  
+  hadi@noma1$ ping -c 1 www.google.ca
+  PING www.google.ca (74.125.225.120) 56(84) bytes of data.
+  64 bytes from ord08s08-in-f24.1e100.net (74.125.225.120): icmp_req=1 ttl=53 time=31.3 ms
+
+  --- www.google.ca ping statistics ---
+  1 packets transmitted, 1 received, 0% packet loss, time 0ms
+  rtt min/avg/max/mdev = 31.316/31.316/31.316/0.000 ms
+
+  hadi@noma1$ dmesg | grep simple
+  [135354.473951] simple: Incoming ICMP_1
+
+  hadi@noma1$ sudo tc/tc -s filter ls  dev eth0 parent ffff:
+  filter protocol ip pref 5 u32 
+  filter protocol ip pref 5 u32 fh 800: ht divisor 1 
+  filter protocol ip pref 5 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:1 
+    match 00010000/00ff0000 at 8
+	action order 1: Simple <Incoming ICMP>
+	 index 4 ref 1 bind 1 installed 206 sec used 67 sec
+	Action statistics:
+	Sent 84 bytes 1 pkt (dropped 0, overlimits 0 requeues 0) 
+	backlog 0b 0p requeues 0 
+*/
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+#include <syslog.h>
+#include <fcntl.h>
+#include <sys/socket.h>
+#include <netinet/in.h>
+#include <arpa/inet.h>
+#include <string.h>
+#include "utils.h"
+#include "tc_util.h"
+#include <linux/tc_act/tc_defact.h>
+
+#ifndef SIMP_MAX_DATA
+#define SIMP_MAX_DATA   32
+#endif
+static void explain(void)
+{
+	fprintf(stderr, "Usage: ... simple STRING\n"
+		"STRING being an arbitrary string\n" 
+		"example: \"simple blah\"\n");
+}
+
+static void usage(void)
+{
+	explain();
+	exit(-1);
+}
+
+static int
+parse_simple(struct action_util *a, int *argc_p, char ***argv_p, int tca_id,
+	     struct nlmsghdr *n)
+{
+	struct tc_defact sel = {};
+	int argc = *argc_p;
+	char **argv = *argv_p;
+	int ok = 0;
+	struct rtattr *tail;
+	char *simpdata = NULL;
+
+
+	while (argc > 0) {
+		if (matches(*argv, "simple") == 0) {
+			NEXT_ARG();
+			simpdata = *argv;
+			ok = 1;
+			argc--;
+			argv++;
+			break;
+		} else if (matches(*argv, "help") == 0) {
+			usage();
+		} else {
+			break;
+		}
+
+	}
+
+	if (!ok) {
+		explain();
+		return -1;
+	}
+
+	if (argc) {
+		if (matches(*argv, "index") == 0) {
+			NEXT_ARG();
+			if (get_u32(&sel.index, *argv, 10)) {
+				fprintf(stderr, "simple: Illegal \"index\"\n");
+				return -1;
+			}
+			argc--;
+			argv++;
+		}
+	}
+
+	if (strlen(simpdata) > (SIMP_MAX_DATA - 1)) {
+		fprintf(stderr, "simple: Illegal string len %ld <%s> \n",
+			strlen(simpdata), simpdata);
+		return -1;
+	}
+
+	sel.action = TC_ACT_PIPE;
+
+	tail = NLMSG_TAIL(n);
+	addattr_l(n, MAX_MSG, tca_id, NULL, 0);
+	addattr_l(n, MAX_MSG, TCA_DEF_PARMS, &sel, sizeof(sel));
+	addattr_l(n, MAX_MSG, TCA_DEF_DATA, simpdata, SIMP_MAX_DATA);
+	tail->rta_len = (char *)NLMSG_TAIL(n) - (char *)tail;
+
+	*argc_p = argc;
+	*argv_p = argv;
+	return 0;
+}
+
+static int print_simple(struct action_util *au, FILE * f, struct rtattr *arg)
+{
+	struct tc_defact *sel;
+	struct rtattr *tb[TCA_DEF_MAX + 1];
+	char *simpdata;
+
+	if (arg == NULL)
+		return -1;
+
+	parse_rtattr_nested(tb, TCA_DEF_MAX, arg);
+
+	if (tb[TCA_DEF_PARMS] == NULL) {
+		fprintf(f, "[NULL simple parameters]");
+		return -1;
+	}
+	sel = RTA_DATA(tb[TCA_DEF_PARMS]);
+
+	if (tb[TCA_DEF_DATA] == NULL) {
+		fprintf(f, "[missing simple string]");
+		return -1;
+	}
+
+	simpdata = RTA_DATA(tb[TCA_DEF_DATA]);
+
+	fprintf(f, "Simple <%s>\n", simpdata);
+	fprintf(f, "\t index %d ref %d bind %d", sel->index,
+		sel->refcnt, sel->bindcnt);
+
+	if (show_stats) {
+		if (tb[TCA_DEF_TM]) {
+			struct tcf_t *tm = RTA_DATA(tb[TCA_DEF_TM]);
+			print_tm(f, tm);
+			fprintf(f, "\n");
+		}
+	}
+
+	return 0;
+}
+
+struct action_util simple_action_util = {
+	.id = "simple",
+	.parse_aopt = parse_simple,
+	.print_aopt = print_simple,
+};

^ permalink raw reply related

* Re: [PATCH net-next] qdisc: basic classifier - remove unnecessary initialization
From: Jamal Hadi Salim @ 2013-09-29 11:28 UTC (permalink / raw)
  To: Stephen Hemminger, David Miller; +Cc: netdev
In-Reply-To: <20130926174216.447555bd@nehalam.linuxnetplumber.net>

On 13-09-26 08:42 PM, Stephen Hemminger wrote:
> err is set once, then first code resets it.
>    err = tcf_exts_validate(...)
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>
>
> --- a/net/sched/cls_basic.c	2013-08-10 10:36:11.657498301 -0700
> +++ b/net/sched/cls_basic.c	2013-09-05 18:05:14.718200833 -0700
> @@ -137,7 +137,7 @@ static int basic_set_parms(struct net *n
>   			   struct nlattr **tb,
>   			   struct nlattr *est)
>   {
> -	int err = -EINVAL;
> +	int err;
>   	struct tcf_exts e;
>   	struct tcf_ematch_tree t;
>


Acked-by: Jamal Hadi Salim <hadi@mojatatu.com>


cheers,
jamal

^ permalink raw reply

* Re: [PATCH] IPv6: Allow the MTU of ipip6 tunnel to be set below 1280
From: Oussama Ghorbel @ 2013-09-29  9:40 UTC (permalink / raw)
  To: Oussama Ghorbel, David S. Miller, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy, netdev, linux-kernel
  Cc: ou.ghorbel
In-Reply-To: <20130927170345.GC10343@order.stressinduktion.org>

On Fri, Sep 27, 2013 at 6:03 PM, Hannes Frederic Sowa
<hannes@stressinduktion.org> wrote:
> On Fri, Sep 27, 2013 at 05:36:45PM +0100, Oussama Ghorbel wrote:
>> Please see my comments below
>>
>> Regards,
>> Oussama
>>
>> On Fri, Sep 27, 2013 at 11:58 AM, Hannes Frederic Sowa
>> <hannes@stressinduktion.org> wrote:
>> > On Fri, Sep 27, 2013 at 11:45:48AM +0100, Oussama Ghorbel wrote:
>> >> The ip6_tunnel.c module would be then dependent on ip_tunnel.c and may
>> >> be it would not be good thing?
>> >
>> > It could just be a static inline in some shared header. So there would
>> > be no compile-time dependency.
>> >
>>
>> The higher limit of mtu in ip_tunnel_change_mtu() is a calculated value.
>> This high limit is calculated using the netdev priv struct ip_tunnel
>> (field hlen).
>> This netdev priv struct is different in ipv6, it's a ip6_tnl struct.
>> Therefore implementing a beautiful function like
>> ip_tunnel_valid_mtu(struct net_device *dev, int mtu) won't be
>> feasible, unless we implement it in macro or something like like
>> ip_tunnel_valid_mtu(struct net_device *dev, int hlen, int mtu) which
>> seems not very nice ...
>> What do yo think?
>
> Ok, let's go with one function per protocol type. Seems easier.
>
> It seems to get more hairy, because it depends on the tunnel driver if the
> prepended ip header is accounted in hard_header_len. :/
>
> I don't know if it works out cleanly. Otherwise I would be ok if the checks
> just get repeated in ip6_tunnel and leave the rest as-is.
>
Yes, It will be the clean way to do it.
>>
>> >> As I have check in v3.10 there is no call from ip6_tunnel to ip_tunnel...
>> >>
>> >> For information, there is no check for the maximum MTU for ipv4 in the
>> >> patch as this is not done for ipv6.
>> >
>> > I understand, but it would be better to limit the MTU here. There is a
>> > (non-jumo) IPV6_MAXPLEN constant.
>> >
>> > Looking through the source it seems grev6 does actually check this,
>> > so it would not hurt adding them here, too.
>>
>> what if jumbograms is used, in that case, we can't use IPV6_MAXPLEN.
>> the limit would be the the full unsigned int.
>> However checking the higher limit for ipv4 would be useful.
>
> Linux currently cannot create "jumbograms" (only the receiving side
> is supported).
>
I understand, but what are the benefit from this limit or the harm
from not specifying it?
Please check this comment from eth.c

/**
 * eth_change_mtu - set new MTU size
 * @dev: network device
 * @new_mtu: new Maximum Transfer Unit
 *
 * Allow changing MTU size. Needs to be overridden for devices
 * supporting jumbo frames.
 */
int eth_change_mtu(struct net_device *dev, int new_mtu)

So wouldn't be a good idea to let our function open for jumbo frames...?

>> Please also note in case the tunnel mode is any (ipv4 or ipv6 means
>> ip6_tnl.parms.proto = 0), then we will be required to take the most
>> restrict limit from both ipv4 and ipv6 which means the lower limit
>> will be 1280 and the higher limit will be about 65535
>> Do you agree on this?
>
> Agreed, this would be needed for e.g. the sit driver, which now can
> handle both protocols.
>
> Thanks,
>
>   Hannes
>

Thanks,
Oussama

^ permalink raw reply

* Re: [PATCH V3 4/6] vhost_net: determine whether or not to use zerocopy at one time
From: Jason Wang @ 2013-09-29  9:36 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <5243B859.3070302@redhat.com>

On 09/26/2013 12:30 PM, Jason Wang wrote:
> On 09/23/2013 03:16 PM, Michael S. Tsirkin wrote:
>> > On Thu, Sep 05, 2013 at 10:54:44AM +0800, Jason Wang wrote:
>>>> >> > On 09/04/2013 07:59 PM, Michael S. Tsirkin wrote:
>>>>>> >>> > > On Mon, Sep 02, 2013 at 04:40:59PM +0800, Jason Wang wrote:
>>>>>>>> >>>> > >> Currently, even if the packet length is smaller than VHOST_GOODCOPY_LEN, if
>>>>>>>> >>>> > >> upend_idx != done_idx we still set zcopy_used to true and rollback this choice
>>>>>>>> >>>> > >> later. This could be avoided by determining zerocopy once by checking all
>>>>>>>> >>>> > >> conditions at one time before.
>>>>>>>> >>>> > >>
>>>>>>>> >>>> > >> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>>>>> >>>> > >> ---
>>>>>>>> >>>> > >>  drivers/vhost/net.c |   47 ++++++++++++++++++++---------------------------
>>>>>>>> >>>> > >>  1 files changed, 20 insertions(+), 27 deletions(-)
>>>>>>>> >>>> > >>
>>>>>>>> >>>> > >> diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
>>>>>>>> >>>> > >> index 8a6dd0d..3f89dea 100644
>>>>>>>> >>>> > >> --- a/drivers/vhost/net.c
>>>>>>>> >>>> > >> +++ b/drivers/vhost/net.c
>>>>>>>> >>>> > >> @@ -404,43 +404,36 @@ static void handle_tx(struct vhost_net *net)
>>>>>>>> >>>> > >>  			       iov_length(nvq->hdr, s), hdr_size);
>>>>>>>> >>>> > >>  			break;
>>>>>>>> >>>> > >>  		}
>>>>>>>> >>>> > >> -		zcopy_used = zcopy && (len >= VHOST_GOODCOPY_LEN ||
>>>>>>>> >>>> > >> -				       nvq->upend_idx != nvq->done_idx);
>>>>>>>> >>>> > >> +
>>>>>>>> >>>> > >> +		zcopy_used = zcopy && len >= VHOST_GOODCOPY_LEN
>>>>>>>> >>>> > >> +				   && (nvq->upend_idx + 1) % UIO_MAXIOV !=
>>>>>>>> >>>> > >> +				      nvq->done_idx
>>>>>> >>> > > Thinking about this, this looks strange.
>>>>>> >>> > > The original idea was that once we start doing zcopy, we keep
>>>>>> >>> > > using the heads ring even for short packets until no zcopy is outstanding.
>>>> >> > 
>>>> >> > What's the reason for keep using the heads ring?
>> > To keep completions in order.
> Ok, I will do some test to see the impact.

Since the our of order completion will happen when switching between
zero copy and non zero copy. I test this by using two sessions of
netperf in burst mode, one with 1 byte TCP_RR another with 512 bytes of
TCP_RR. There's no difference with the patch applied.

^ permalink raw reply

* RE: [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
From: Dmitry Kravkov @ 2013-09-29  9:03 UTC (permalink / raw)
  To: Eric Dumazet, David Miller; +Cc: netdev, Eilon Greenstein
In-Reply-To: <1380442892.3596.22.camel@edumazet-glaptop.roam.corp.google.com>

> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Eric Dumazet
> Sent: Sunday, September 29, 2013 11:22 AM
> To: David Miller
> Cc: netdev; Eilon Greenstein
> Subject: [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
> 
> From: Eric Dumazet <edumazet@google.com>
> 
> bnx2x makes a dangerous use of skb_is_gso_v6().
> 
> It should first make sure skb is a gso packet
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Eilon Greenstein <eilong@broadcom.com>

Thanks, Eric!

Acked-by: Dmitry Kravkov <dmitry@broadcom.com>

^ permalink raw reply

* RE: ADMINISTRATOR HELP DESK
From: Greene, Tanya @ 2013-09-29  8:26 UTC (permalink / raw)
  To: Greene, Tanya
In-Reply-To: <0FF456CBAA9B324987CEA675439D7EF801151DF4AE@EXCH10-MB3.paterson.k12.nj.us>

Your Mailbox Has Exceeded It Storage Limit As Set By Your Administrator from the server, And You Will Not Be  Able To Receive New Mails until You Re-Validate It click Re-validate<http://postmaster-supportit.coffeecup.com/forms/HELP%20DESK/> .Thank you courtesy © 2013 by Intellectual Reserve, Inc. All rights reserved

^ permalink raw reply

* [PATCH net-next] net: skb_is_gso_v6() requires skb_is_gso()
From: Eric Dumazet @ 2013-09-29  8:21 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eilon Greenstein

From: Eric Dumazet <edumazet@google.com>

bnx2x makes a dangerous use of skb_is_gso_v6().

It should first make sure skb is a gso packet

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Eilon Greenstein <eilong@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |   18 +++++++-------
 include/linux/skbuff.h                          |    1 
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 61726af..0c7fb1e 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -3256,14 +3256,16 @@ static u32 bnx2x_xmit_type(struct bnx2x *bp, struct sk_buff *skb)
 	if (prot == IPPROTO_TCP)
 		rc |= XMIT_CSUM_TCP;
 
-	if (skb_is_gso_v6(skb)) {
-		rc |= (XMIT_GSO_V6 | XMIT_CSUM_TCP);
-		if (rc & XMIT_CSUM_ENC)
-			rc |= XMIT_GSO_ENC_V6;
-	} else if (skb_is_gso(skb)) {
-		rc |= (XMIT_GSO_V4 | XMIT_CSUM_TCP);
-		if (rc & XMIT_CSUM_ENC)
-			rc |= XMIT_GSO_ENC_V4;
+	if (skb_is_gso(skb)) {
+		if (skb_is_gso_v6(skb)) {
+			rc |= (XMIT_GSO_V6 | XMIT_CSUM_TCP);
+			if (rc & XMIT_CSUM_ENC)
+				rc |= XMIT_GSO_ENC_V6;
+		} else {
+			rc |= (XMIT_GSO_V4 | XMIT_CSUM_TCP);
+			if (rc & XMIT_CSUM_ENC)
+				rc |= XMIT_GSO_ENC_V4;
+		}
 	}
 
 	return rc;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 2ddb48d..d48fde30 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2783,6 +2783,7 @@ static inline bool skb_is_gso(const struct sk_buff *skb)
 	return skb_shinfo(skb)->gso_size;
 }
 
+/* Note: Should be called only if skb_is_gso(skb) is true */
 static inline bool skb_is_gso_v6(const struct sk_buff *skb)
 {
 	return skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6;

^ permalink raw reply related

* [PATCH net-next] net: add missing sk_max_pacing_rate doc
From: Eric Dumazet @ 2013-09-29  8:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20130928.153610.2280100769720851440.davem@davemloft.net>

From: Eric Dumazet <edumazet@google.com>

Warning(include/net/sock.h:411): No description found for parameter
'sk_max_pacing_rate'

Lets please "make htmldocs" and kbuild bot.

Reported-by: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/net/sock.h |    1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/sock.h b/include/net/sock.h
index 240aa3f..cf91c8e 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -233,6 +233,7 @@ struct cg_proto;
   *	@sk_ll_usec: usecs to busypoll when there is no data
   *	@sk_allocation: allocation mode
   *	@sk_pacing_rate: Pacing rate (if supported by transport/packet scheduler)
+  *	@sk_max_pacing_rate: Maximum pacing rate (%SO_MAX_PACING_RATE)
   *	@sk_sndbuf: size of send buffer in bytes
   *	@sk_flags: %SO_LINGER (l_onoff), %SO_BROADCAST, %SO_KEEPALIVE,
   *		   %SO_OOBINLINE settings, %SO_TIMESTAMPING settings

^ permalink raw reply related

* Re: [PATCH] ipv6: gre: correct calculation of max_headroom
From: Eric Dumazet @ 2013-09-29  8:05 UTC (permalink / raw)
  To: Hannes Frederic Sowa; +Cc: netdev, xeb
In-Reply-To: <20130929034050.GB8602@order.stressinduktion.org>

On Sun, 2013-09-29 at 05:40 +0200, Hannes Frederic Sowa wrote:
> gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header,
> so initialize max_headroom to zero. Otherwise the
> 
> 	if (encap_limit >= 0) {
> 		max_headroom += 8;
> 		mtu -= 8;
> 	}
> 
> increments an uninitialized variable before max_headroom was reset.
> 
> Found with coverity: 728539
> 
> Cc: Dmitry Kozlov <xeb@mail.ru>
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH] ipv6: sit: ensure prefixlen <= 128 before calling ipv6_addr_prefix
From: Hannes Frederic Sowa @ 2013-09-29  3:53 UTC (permalink / raw)
  To: netdev
In-Reply-To: <20130929031535.GA8602@order.stressinduktion.org>

On Sun, Sep 29, 2013 at 05:15:35AM +0200, Hannes Frederic Sowa wrote:
> Found with coverity: CID 139492
> 
> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>

Disregard, I was fooled by the report. This is not necessary.

^ permalink raw reply

* [PATCH] ipv6: gre: correct calculation of max_headroom
From: Hannes Frederic Sowa @ 2013-09-29  3:40 UTC (permalink / raw)
  To: netdev; +Cc: xeb

gre_hlen already accounts for sizeof(struct ipv6_hdr) + gre header,
so initialize max_headroom to zero. Otherwise the

	if (encap_limit >= 0) {
		max_headroom += 8;
		mtu -= 8;
	}

increments an uninitialized variable before max_headroom was reset.

Found with coverity: 728539

Cc: Dmitry Kozlov <xeb@mail.ru>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/ip6_gre.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 6b26e9f..7bb5446 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -618,7 +618,7 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb,
 	struct ip6_tnl *tunnel = netdev_priv(dev);
 	struct net_device *tdev;    /* Device to other host */
 	struct ipv6hdr  *ipv6h;     /* Our new IP header */
-	unsigned int max_headroom;  /* The extra header space needed */
+	unsigned int max_headroom = 0; /* The extra header space needed */
 	int    gre_hlen;
 	struct ipv6_tel_txoption opt;
 	int    mtu;
@@ -693,7 +693,7 @@ static netdev_tx_t ip6gre_xmit2(struct sk_buff *skb,
 
 	skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(dev)));
 
-	max_headroom = LL_RESERVED_SPACE(tdev) + gre_hlen + dst->header_len;
+	max_headroom += LL_RESERVED_SPACE(tdev) + gre_hlen + dst->header_len;
 
 	if (skb_headroom(skb) < max_headroom || skb_shared(skb) ||
 	    (skb_cloned(skb) && !skb_clone_writable(skb, 0))) {
-- 
1.8.3.1

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox