Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next] openvswitch: Create right mask with disabled megaflows
From: David Miller @ 2014-10-17 20:20 UTC (permalink / raw)
  To: pshelar; +Cc: netdev, ddiproietto, azhou
In-Reply-To: <1413521745-1492-1-git-send-email-pshelar@nicira.com>

From: Pravin B Shelar <pshelar@nicira.com>
Date: Thu, 16 Oct 2014 21:55:45 -0700

> If megaflows are disabled, the userspace does not send the netlink attribute
> OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask.
> 
> sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the
> bytes that represent padding for struct sw_flow, or the bytes that represent
> fields that may not be set during ovs_flow_extract().
> This is a problem, because when we extract a flow from a packet,
> we do not memset() anymore the struct sw_flow to 0.
> 
> This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(),
> which operates on the netlink attributes rather than on the mask key. Using
> this approach we are sure that only the bytes that the user provided in the
> flow are matched.
> 
> Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we
> now return with an error.
> 
> This bug is introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
> ("openvswitch: Eliminate memset() from flow_extract").
> 
> Reported-by: Alex Wang <alexw@nicira.com>
> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
> Signed-off-by: Andy Zhou <azhou@nicira.com>
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

Why are you targetting a bug fix like this to net-next?

That makes no sense at all.

^ permalink raw reply

* Re: [PATCH] openvswitch: fix a use after free
From: David Miller @ 2014-10-17 20:23 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev, edumazet, jesse
In-Reply-To: <1413525788-2272-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 17 Oct 2014 14:03:08 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp
> setting after arphdr_ok to avoid the use the freed memory
> 
> Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.")
> Cc: Jesse Gross <jesse@nicira.com>
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v2] vxlan: fix a free after use
From: David Miller @ 2014-10-17 20:24 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev, edumazet
In-Reply-To: <1413525976-2624-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 17 Oct 2014 14:06:16 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> pskb_may_pull maybe change skb->data and make eth pointer oboslete,
> so eth needs to reload
> 
> Fixes: 91269e390d062 ("vxlan: using pskb_may_pull as early as possible")
> Cc: Eric Dumazet <edumazet@google.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v2] vxlan: remove the dead codes
From: David Miller @ 2014-10-17 20:24 UTC (permalink / raw)
  To: cwang; +Cc: roy.qing.li, netdev, stephen
In-Reply-To: <CAHA+R7O7Yyd2XMkMnv5b_TLFekPCRRtOGG7bHgHs2Qd+YRu3Hg@mail.gmail.com>

From: Cong Wang <cwang@twopensource.com>
Date: Fri, 17 Oct 2014 10:33:08 -0700

> On Thu, Oct 16, 2014 at 11:04 PM,  <roy.qing.li@gmail.com> wrote:
>> From: Li RongQing <roy.qing.li@gmail.com>
>>
>> remove the dead codes, no user uses vxlan_salt
>>
>> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
>> ---
>>  drivers/net/vxlan.c |    4 ----
>>  1 file changed, 4 deletions(-)
>>
>> diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
>> index 77ab844..855a81d 100644
>> --- a/drivers/net/vxlan.c
>> +++ b/drivers/net/vxlan.c
>> @@ -152,8 +152,6 @@ struct vxlan_dev {
>>         struct hlist_head fdb_head[FDB_HASH_SIZE];
>>  };
>>
>> -/* salt for hash table */
>> -static u32 vxlan_salt __read_mostly;
> 
> Hmm, it has never been used since it was born. Maybe we should
> really use for vxlan hash table?

Yes, agreed, and I am very sure that this was Stephen Hemminger's
original intention.

^ permalink raw reply

* Re: [PATCH net-next] openvswitch: Create right mask with disabled megaflows
From: Pravin Shelar @ 2014-10-17 20:40 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Daniele Di Proietto, Andy Zhou
In-Reply-To: <20141017.162058.9617975607356013.davem@davemloft.net>

On Fri, Oct 17, 2014 at 1:20 PM, David Miller <davem@davemloft.net> wrote:
> From: Pravin B Shelar <pshelar@nicira.com>
> Date: Thu, 16 Oct 2014 21:55:45 -0700
>
>> If megaflows are disabled, the userspace does not send the netlink attribute
>> OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask.
>>
>> sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the
>> bytes that represent padding for struct sw_flow, or the bytes that represent
>> fields that may not be set during ovs_flow_extract().
>> This is a problem, because when we extract a flow from a packet,
>> we do not memset() anymore the struct sw_flow to 0.
>>
>> This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(),
>> which operates on the netlink attributes rather than on the mask key. Using
>> this approach we are sure that only the bytes that the user provided in the
>> flow are matched.
>>
>> Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we
>> now return with an error.
>>
>> This bug is introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
>> ("openvswitch: Eliminate memset() from flow_extract").
>>
>> Reported-by: Alex Wang <alexw@nicira.com>
>> Signed-off-by: Daniele Di Proietto <ddiproietto@vmware.com>
>> Signed-off-by: Andy Zhou <azhou@nicira.com>
>> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
>
> Why are you targetting a bug fix like this to net-next?
>
I meant it for net tree. Can you take this patch for net?

Thanks.

^ permalink raw reply

* Re: [PATCH] Move rlb table dump to procfs
From: David Miller @ 2014-10-17 20:48 UTC (permalink / raw)
  To: emunson; +Cc: netdev, andy, vfalico, j.vosburgh, linux-kernel
In-Reply-To: <1413571313-10137-1-git-send-email-emunson@akamai.com>

From: Eric B Munson <emunson@akamai.com>
Date: Fri, 17 Oct 2014 14:41:52 -0400

> Because debugfs is not net namespace aware, it is currently not
> possible to dump the RLB hash table when network namespaces are enabled.
> This patch creates a second proc file called IFNAME_rlb_table for
> bonding interfaces that use adaptive load balancing which will expose
> this table to userspace.  It also removes all the debugfs code from the
> bonding module as the RLB hash was the only entry in debugfs.
> 
> Signed-off-by: Eric B Munson <emunson@akamai.com>

It's meant to be a debugging facility, which has no stable API
and is not something the user can depend upon it.

Once you move it to procfs, it's basically over and we are
stuck with this forever.

I think the intention of this meaning to be a debug facility for
developers is very clear.

I'm not applying this, sorry.

^ permalink raw reply

* Re: [PATCH net-next] openvswitch: Create right mask with disabled megaflows
From: David Miller @ 2014-10-17 20:49 UTC (permalink / raw)
  To: pshelar; +Cc: netdev, ddiproietto, azhou
In-Reply-To: <CALnjE+qehC2StaqpznbDBAVivoJ+-ej7KFZSXZr5GzVQ+FKYgg@mail.gmail.com>

From: Pravin Shelar <pshelar@nicira.com>
Date: Fri, 17 Oct 2014 13:40:24 -0700

> I meant it for net tree. Can you take this patch for net?

Yep, done.

^ permalink raw reply

* Re: Regarding tx-nocache-copy in the Sheevaplug
From: Benjamin Poirier @ 2014-10-17 20:55 UTC (permalink / raw)
  To: Lluís Batlle i Rossell
  Cc: Eric Dumazet, linux-kernel, netdev, Carles Pagès,
	linux-arm-kernel
In-Reply-To: <20141016174628.GL1972@vicerveza.homeunix.net>

On 2014/10/16 19:46, Lluís Batlle i Rossell wrote:
[...]
> 
> Hello all,
> 
> it seems I was a bit wrong - although enabling back tx-nocache-copy makes the
> tx-errors happen much less often (ssh complaining about HMAC), they still
> happen. It seems that something was introduced in some recent kernels that broke
> the tx offload.
> 
> I have no idea what it can be, but since 2.6 until at least 3.10 the network
> driver worked fine with tx offload in this sheevaplug board.

It's not the most pleasant alternative but if you can be sure enough
whether the problem is occurring or not, you could try bisecting,
possibly limiting the bisection to mv643xx

$ git bisect start v3.16.3 v3.10 -- drivers/net/ethernet/marvell/mv643xx_eth.c
Bisecting: 16 revisions left to test after this (roughly 4 steps)

The problem might be outside of the driver though.

^ permalink raw reply

* Re: [PATCH] net: use hardware buffer pool to allocate skb
From: Alexander Duyck @ 2014-10-17 22:13 UTC (permalink / raw)
  To: Eric Dumazet, Alexander Duyck
  Cc: David Laight, Jiafei.Pan@freescale.com, David Miller,
	jkosina@suse.cz, netdev@vger.kernel.org, LeoLi@freescale.com,
	linux-doc@vger.kernel.org
In-Reply-To: <1413575515.27176.10.camel@edumazet-glaptop2.roam.corp.google.com>

On 10/17/2014 12:51 PM, Eric Dumazet wrote:
> On Fri, 2014-10-17 at 12:38 -0700, Alexander Duyck wrote:
>
>> That's not what I am saying, but there is a trade-off we always have to
>> take into account.  Cutting memory overhead will likely have an impact
>> on performance.  I would like to make the best informed trade-off in
>> that regard rather than just assuming worst case always for the driver.
> It seems you misunderstood me. You believe I suggested doing another
> allocation strategy in the drivers.
>
> This was not the case.
>
> This allocation strategy is wonderful. I repeat : This is wonderful.

No, I think I understand you.  I'm just not sure listing this as a 4K 
allocation in truesize makes sense.  The problem is the actual 
allocation can be either 2K or 4K, and my concern is that by setting it 
to 4K we are going to be hurting the case where the actual allocation to 
the socket is only 2K for the half page w/ reuse.

I was brining up the other allocation strategy to prove a point. From my 
perspective it wouldn't make any more sense to assign 32K to the 
truesize for an allocated fragment using __netdev_alloc_frag, but it can 
suffer the same type of issues only to a greater extent due to the use 
of the compound page.  Just because it is shared among many more uses 
doesn't mean it couldn't end up in a scenario where one socket somehow 
keeps queueing up the 32K pages and sitting on them.  I would think all 
it would take is 1 bad acting flow interleaved in ~20 active flows to 
suddenly gobble up a ton of memory without it being accounted for.

> We only have to make sure we do not fool memory management layers, when
> they do not understand where the memory is.
>
> Apparently you think it is hard, while it really is not.

I think you are over simplifying it.  By setting it to 4K there are 
situations where a socket will be double charged for getting two halves 
of the same page.  In these cases there will be a negative impact on 
performance as the number of frames that can be queued is reduced.  What 
you are proposing is possibly overreporting memory use by a factor of 2 
instead of possibly under-reporting it by a factor of 2.

I would be more moved by data than just conjecture on what the driver is 
or isn't doing.  My theory is that most of the time the page is reused 
so 2K is the correct value to report, and very seldom would 4K ever be 
the correct value.  This is what I have seen historically with igb/ixgbe 
using the page reuse.  If you have cases that show that the page isn't 
being reused then we can explore the 4K truesize change, but until then 
I think the page is likely being reused and we should probably just 
stick with the 2K value as we should be getting at least 2 uses per page.

Thanks,

Alex

^ permalink raw reply

* Re: [PATCH] dsa: Fix conversion from host device to mii bus
From: Alexander Duyck @ 2014-10-17 23:02 UTC (permalink / raw)
  To: Guenter Roeck, David S. Miller
  Cc: Florian Fainelli, Alexander Duyck, netdev, linux-kernel
In-Reply-To: <1413574258-21766-1-git-send-email-linux@roeck-us.net>

On 10/17/2014 12:30 PM, Guenter Roeck wrote:
> Commit b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> replaces mii_bus with a generic host_dev, and introduces
> dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus.
> However, in some cases it uses to_mii_bus to perform that conversion.
> Since host_dev is not the phy bus device but typically a platform device,
> this fails and results in a crash with the affected drivers.
>
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffff81781d35>] __mutex_lock_slowpath+0x75/0x100
> PGD 406783067 PUD 406784067 PMD 0
> Oops: 0002 [#1] SMP
> ...
> Call Trace:
> [<ffffffff810a538b>] ? pick_next_task_fair+0x61b/0x880
> [<ffffffff81781de3>] mutex_lock+0x23/0x37
> [<ffffffff81533244>] mdiobus_read+0x34/0x60
> [<ffffffff8153b95a>] __mv88e6xxx_reg_read+0x8a/0xa0
> [<ffffffff8153b9bc>] mv88e6xxx_reg_read+0x4c/0xa0
>
> Fixes: b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---

The fix looks good to me.

Acked-by: Alexander Duyck <alexander.h.duyck@redhat.com>

^ permalink raw reply

* [PATCH net v2] net: dsa: add includes for ethtool and phy_fixed definitions
From: Florian Fainelli @ 2014-10-17 23:02 UTC (permalink / raw)
  To: netdev; +Cc: davem, Florian Fainelli

net/dsa/slave.c uses functions and structures declared in phy_fixed.h
but does not explicitely include it, while dsa.h needs structure
declarations for 'struct ethtool_wolinfo' and 'struct ethtool_eee', fix
those by including the correct header files.

Fixes: ec9436baedb6 ("net: dsa: allow drivers to do link adjustment")
Fixes: ce31b31c68e7 ("net: dsa: allow updating fixed PHY link information")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
Changes in v2:
- actually include the headers instead of using forward declarations

 include/net/dsa.h | 1 +
 net/dsa/slave.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/include/net/dsa.h b/include/net/dsa.h
index 58ad8c6492db..b76559293535 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -18,6 +18,7 @@
 #include <linux/of.h>
 #include <linux/phy.h>
 #include <linux/phy_fixed.h>
+#include <linux/ethtool.h>
 
 enum dsa_tag_protocol {
 	DSA_TAG_PROTO_NONE = 0,
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 8030489d9cbe..a851e9f14118 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -11,6 +11,7 @@
 #include <linux/list.h>
 #include <linux/etherdevice.h>
 #include <linux/phy.h>
+#include <linux/phy_fixed.h>
 #include <linux/of_net.h>
 #include <linux/of_mdio.h>
 #include "dsa_priv.h"
-- 
1.9.1

^ permalink raw reply related

* [PATCH net] openvswitch: Set flow-key members.
From: Pravin B Shelar @ 2014-10-17 20:56 UTC (permalink / raw)
  To: davem; +Cc: netdev, Pravin B Shelar

This patch adds missing memset which are required to initialize
flow key member. For example for IP flow we need to initialize
ip.frag for all cases.

Found by inspection.

This bug is introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac
("openvswitch: Eliminate memset() from flow_extract").

Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
 net/openvswitch/flow.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index c5cfc72..2b78789 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -274,6 +274,8 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key)
 			key->ip.frag = OVS_FRAG_TYPE_LATER;
 		else
 			key->ip.frag = OVS_FRAG_TYPE_FIRST;
+	} else {
+		key->ip.frag = OVS_FRAG_TYPE_NONE;
 	}
 
 	nh_len = payload_ofs - nh_ofs;
@@ -358,6 +360,7 @@ static int parse_icmpv6(struct sk_buff *skb, struct sw_flow_key *key,
 	 */
 	key->tp.src = htons(icmp->icmp6_type);
 	key->tp.dst = htons(icmp->icmp6_code);
+	memset(&key->ipv6.nd, 0, sizeof(key->ipv6.nd));
 
 	if (icmp->icmp6_code == 0 &&
 	    (icmp->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION ||
@@ -674,9 +677,6 @@ int ovs_flow_key_extract(struct ovs_tunnel_info *tun_info,
 	key->ovs_flow_hash = 0;
 	key->recirc_id = 0;
 
-	/* Flags are always used as part of stats */
-	key->tp.flags = 0;
-
 	return key_extract(skb, key);
 }
 
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH] dsa: Fix conversion from host device to mii bus
From: Florian Fainelli @ 2014-10-17 23:40 UTC (permalink / raw)
  To: Guenter Roeck, David S. Miller; +Cc: Alexander Duyck, netdev, linux-kernel
In-Reply-To: <1413574258-21766-1-git-send-email-linux@roeck-us.net>

On 10/17/2014 12:30 PM, Guenter Roeck wrote:
> Commit b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> replaces mii_bus with a generic host_dev, and introduces
> dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus.
> However, in some cases it uses to_mii_bus to perform that conversion.
> Since host_dev is not the phy bus device but typically a platform device,
> this fails and results in a crash with the affected drivers.
> 
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffff81781d35>] __mutex_lock_slowpath+0x75/0x100
> PGD 406783067 PUD 406784067 PMD 0
> Oops: 0002 [#1] SMP
> ...
> Call Trace:
> [<ffffffff810a538b>] ? pick_next_task_fair+0x61b/0x880
> [<ffffffff81781de3>] mutex_lock+0x23/0x37
> [<ffffffff81533244>] mdiobus_read+0x34/0x60
> [<ffffffff8153b95a>] __mv88e6xxx_reg_read+0x8a/0xa0
> [<ffffffff8153b9bc>] mv88e6xxx_reg_read+0x4c/0xa0
> 
> Fixes: b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>

> ---
>  drivers/net/dsa/mv88e6060.c | 16 ++++++++++++----
>  drivers/net/dsa/mv88e6xxx.c | 14 ++++++++++----
>  2 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c
> index 776e965..05b0ca3 100644
> --- a/drivers/net/dsa/mv88e6060.c
> +++ b/drivers/net/dsa/mv88e6060.c
> @@ -21,8 +21,12 @@
>  
>  static int reg_read(struct dsa_switch *ds, int addr, int reg)
>  {
> -	return mdiobus_read(to_mii_bus(ds->master_dev),
> -			    ds->pd->sw_addr + addr, reg);
> +	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
> +
> +	if (bus == NULL)
> +		return -EINVAL;
> +
> +	return mdiobus_read(bus, ds->pd->sw_addr + addr, reg);
>  }
>  
>  #define REG_READ(addr, reg)					\
> @@ -38,8 +42,12 @@ static int reg_read(struct dsa_switch *ds, int addr, int reg)
>  
>  static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
>  {
> -	return mdiobus_write(to_mii_bus(ds->master_dev),
> -			     ds->pd->sw_addr + addr, reg, val);
> +	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
> +
> +	if (bus == NULL)
> +		return -EINVAL;
> +
> +	return mdiobus_write(bus, ds->pd->sw_addr + addr, reg, val);
>  }
>  
>  #define REG_WRITE(addr, reg, val)				\
> diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
> index d6f6428..a6c90cf 100644
> --- a/drivers/net/dsa/mv88e6xxx.c
> +++ b/drivers/net/dsa/mv88e6xxx.c
> @@ -75,11 +75,14 @@ int __mv88e6xxx_reg_read(struct mii_bus *bus, int sw_addr, int addr, int reg)
>  int mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg)
>  {
>  	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> +	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
>  	int ret;
>  
> +	if (bus == NULL)
> +		return -EINVAL;
> +
>  	mutex_lock(&ps->smi_mutex);
> -	ret = __mv88e6xxx_reg_read(to_mii_bus(ds->master_dev),
> -				   ds->pd->sw_addr, addr, reg);
> +	ret = __mv88e6xxx_reg_read(bus, ds->pd->sw_addr, addr, reg);
>  	mutex_unlock(&ps->smi_mutex);
>  
>  	return ret;
> @@ -119,11 +122,14 @@ int __mv88e6xxx_reg_write(struct mii_bus *bus, int sw_addr, int addr,
>  int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
>  {
>  	struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> +	struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev);
>  	int ret;
>  
> +	if (bus == NULL)
> +		return -EINVAL;
> +
>  	mutex_lock(&ps->smi_mutex);
> -	ret = __mv88e6xxx_reg_write(to_mii_bus(ds->master_dev),
> -				    ds->pd->sw_addr, addr, reg, val);
> +	ret = __mv88e6xxx_reg_write(bus, ds->pd->sw_addr, addr, reg, val);
>  	mutex_unlock(&ps->smi_mutex);
>  
>  	return ret;
> 

^ permalink raw reply

* Re: [PATCH net] ipv6: introduce tcp_v6_iif()
From: Cong Wang @ 2014-10-18  0:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1413569996.25949.6.camel@edumazet-glaptop2.roam.corp.google.com>

On Fri, Oct 17, 2014 at 11:19 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Fri, 2014-10-17 at 09:58 -0700, Cong Wang wrote:
>
>> I doubt we still need to store iif in IP6CB() since we have skb->skb_iif,
>> we can probably just make inet6_iif() be like inet_iif() so that IP6CB()->iif
>> can be just removed? Does this make any sense to you?
>>
>
> This makes sense and would gain 4 bytes in skb->cb[], so definitely
> worth it.
>


Hmm, after testing my patch, I found that some IPv6 code (scope_id) calls
inet6_iif() at socket layer too, where skb dst is already dropped. So we can't
simply use skb_dst() for all inet6_iif(). This is also why IPv4 is different.

Therefore, I think your patch is fine at least for net.

^ permalink raw reply

* Re: [PATCH net] ipv6: introduce tcp_v6_iif()
From: Cong Wang @ 2014-10-18  0:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1413562640.24953.26.camel@edumazet-glaptop2.roam.corp.google.com>

On Fri, Oct 17, 2014 at 9:17 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Commit 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line
> misses") added a regression for SO_BINDTODEVICE on IPv6.
>
> This is because we still use inet6_iif() which expects that IP6 control
> block is still at the beginning of skb->cb[]
>
> This patch adds tcp_v6_iif() helper and uses it where necessary.
>
> Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif
> parameter to it.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")

As I already explained in previous reply, this is probably the best solution
for net, so:

Acked-by: Cong Wang <cwang@twopensource.com>

^ permalink raw reply

* Re: [PATCH] net: use hardware buffer pool to allocate skb
From: Eric Dumazet @ 2014-10-18  0:26 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Laight, Jiafei.Pan@freescale.com, David Miller,
	jkosina@suse.cz, netdev@vger.kernel.org, LeoLi@freescale.com,
	linux-doc@vger.kernel.org
In-Reply-To: <1413571980.25949.9.camel@edumazet-glaptop2.roam.corp.google.com>

On Fri, 2014-10-17 at 11:53 -0700, Eric Dumazet wrote:

> Don't worry, I will work on it.

A very simple head drop strategy instead of tail drop on the socket
backlog seems to do the trick :

TCP behaves way better when there are head drops, as fast retransmits
can usually close the gap without any added delay. Sender automatically
adjusts its cwnd (or rate) according to the receiver skb->truesize /
skb->len ratio

^ permalink raw reply

* Re: [PATCH net-next,v3] hyperv: Add handling of IP header with option field in netvsc_set_hash()
From: David Miller @ 2014-10-18  3:43 UTC (permalink / raw)
  To: haiyangz; +Cc: olaf, netdev, jasowang, driverdev-devel, linux-kernel
In-Reply-To: <1413496078-25066-1-git-send-email-haiyangz@microsoft.com>

From: Haiyang Zhang <haiyangz@microsoft.com>
Date: Thu, 16 Oct 2014 14:47:58 -0700

> In case that the IP header has optional field at the end, this patch will
> get the port numbers after that field, and compute the hash. The general
> parser skb_flow_dissect() is used here.
> 
> Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] ipv4: fix a potential use after free
From: David Miller @ 2014-10-18  3:46 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev, pshelar
In-Reply-To: <1413536003-15646-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 17 Oct 2014 16:53:23 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> pskb_may_pull() maybe change skb->data and make eth pointer oboslete,
> so set eth after pskb_may_pull()
> 
> Fixes:3d7b46cd("ip_tunnel: push generic protocol handling to ip_tunnel module")
> Cc: Pravin B Shelar <pshelar@nicira.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] ipv4: fix a potential use after free in fou.c
From: David Miller @ 2014-10-18  3:46 UTC (permalink / raw)
  To: roy.qing.li; +Cc: netdev, therbert
In-Reply-To: <1413536027-15700-1-git-send-email-roy.qing.li@gmail.com>

From: roy.qing.li@gmail.com
Date: Fri, 17 Oct 2014 16:53:47 +0800

> From: Li RongQing <roy.qing.li@gmail.com>
> 
> pskb_may_pull() maybe change skb->data and make uh pointer oboslete,
> so reload uh and guehdr
> 
> Fixes: 37dd0247 ("gue: Receive side for Generic UDP Encapsulation")
> Cc: Tom Herbert <therbert@google.com>
> Signed-off-by: Li RongQing <roy.qing.li@gmail.com>

Applied.

^ permalink raw reply

* Re: [PATCH net] r8152: return -EBUSY for runtime suspend
From: David Miller @ 2014-10-18  3:46 UTC (permalink / raw)
  To: hayeswang; +Cc: netdev, nic_swsd, linux-kernel, linux-usb
In-Reply-To: <1394712342-15778-65-Taiwan-albertk@realtek.com>

From: Hayes Wang <hayeswang@realtek.com>
Date: Fri, 17 Oct 2014 16:55:08 +0800

> Remove calling cancel_delayed_work_sync() for runtime suspend,
> because it would cause dead lock. Instead, return -EBUSY to
> avoid the device enters suspending if the net is running and
> the delayed work is pending or running. The delayed work would
> try to wake up the device later, so the suspending is not
> necessary.
> 
> Signed-off-by: Hayes Wang <hayeswang@realtek.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] sfc: add support for skb->xmit_more
From: David Miller @ 2014-10-18  3:47 UTC (permalink / raw)
  To: ecree
  Cc: rstonehouse, dborkman, nikolay, netdev, sshah, jcooper,
	linux-net-drivers
In-Reply-To: <alpine.LFD.2.03.1410171512070.3218@solarflare.com>

From: Edward Cree <ecree@solarflare.com>
Date: Fri, 17 Oct 2014 15:32:25 +0100

> Don't ring the doorbell, and don't do PIO.  This will also prevent
>  TX Push, because there will be more than one buffer waiting when
>  the doorbell is rung.
> 
> Signed-off-by: Edward Cree <ecree@solarflare.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] ipv6: introduce tcp_v6_iif()
From: David Miller @ 2014-10-18  3:48 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1413562640.24953.26.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 17 Oct 2014 09:17:20 -0700

> From: Eric Dumazet <edumazet@google.com>
> 
> Commit 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line
> misses") added a regression for SO_BINDTODEVICE on IPv6.
> 
> This is because we still use inet6_iif() which expects that IP6 control
> block is still at the beginning of skb->cb[]
> 
> This patch adds tcp_v6_iif() helper and uses it where necessary.
> 
> Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif
> parameter to it.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses")

Applied.

^ permalink raw reply

* Re: [PATCH net 1/1] tipc: fix bug in bundled buffer reception
From: David Miller @ 2014-10-18  3:51 UTC (permalink / raw)
  To: jon.maloy
  Cc: netdev, paul.gortmaker, erik.hugne, ying.xue, maloy,
	tipc-discussion
In-Reply-To: <1413573928-17954-1-git-send-email-jon.maloy@ericsson.com>

From: Jon Maloy <jon.maloy@ericsson.com>
Date: Fri, 17 Oct 2014 15:25:28 -0400

> In commit ec8a2e5621db2da24badb3969eda7fd359e1869f ("tipc: same receive
> code path for connection protocol and data messages") we omitted the
> the possiblilty that an arriving message extracted from a bundle buffer
> may be a multicast message. Such messages need to be to be delivered to
> the socket via a separate function, tipc_sk_mcast_rcv(). As a result,
> small multicast messages arriving as members of a bundle buffer will be
> silently dropped.
> 
> This commit corrects the error by considering this case in the function
> tipc_link_bundle_rcv().
> 
> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH] dsa: Fix conversion from host device to mii bus
From: David Miller @ 2014-10-18  3:52 UTC (permalink / raw)
  To: linux; +Cc: f.fainelli, alexander.h.duyck, netdev, linux-kernel
In-Reply-To: <1413574258-21766-1-git-send-email-linux@roeck-us.net>

From: Guenter Roeck <linux@roeck-us.net>
Date: Fri, 17 Oct 2014 12:30:58 -0700

> Commit b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> replaces mii_bus with a generic host_dev, and introduces
> dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus.
> However, in some cases it uses to_mii_bus to perform that conversion.
> Since host_dev is not the phy bus device but typically a platform device,
> this fails and results in a crash with the affected drivers.
> 
> BUG: unable to handle kernel NULL pointer dereference at           (null)
> IP: [<ffffffff81781d35>] __mutex_lock_slowpath+0x75/0x100
> PGD 406783067 PUD 406784067 PMD 0
> Oops: 0002 [#1] SMP
> ...
> Call Trace:
> [<ffffffff810a538b>] ? pick_next_task_fair+0x61b/0x880
> [<ffffffff81781de3>] mutex_lock+0x23/0x37
> [<ffffffff81533244>] mdiobus_read+0x34/0x60
> [<ffffffff8153b95a>] __mv88e6xxx_reg_read+0x8a/0xa0
> [<ffffffff8153b9bc>] mv88e6xxx_reg_read+0x4c/0xa0
> 
> Fixes: b4d2394d01bc ("dsa: Replace mii_bus with a generic host device")
> Cc: Alexander Duyck <alexander.h.duyck@intel.com>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 1/1 net-next] netrom: use linux/uaccess.h
From: David Miller @ 2014-10-18  3:53 UTC (permalink / raw)
  To: fabf; +Cc: linux-kernel, ralf, linux-hams, netdev
In-Reply-To: <1413576023-26717-1-git-send-email-fabf@skynet.be>

From: Fabian Frederick <fabf@skynet.be>
Date: Fri, 17 Oct 2014 22:00:22 +0200

> replace asm/uaccess.h by linux/uaccess.h
> 
> Signed-off-by: Fabian Frederick <fabf@skynet.be>

Applied.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox