Netdev List
* [PATCH 2/2 net-next] sunvnet: Return from vnet_napi_event() if no packets to read
From: Sowmini Varadhan @ 2014-11-06 19:51 UTC (permalink / raw)
  To: davem, sowmini.varadhan, david.stevens; +Cc: netdev


vnet_event_napi() may be called as part of the NAPI ->poll,
to resume reading descriptor rings. When no data is available,
descriptor ring state (e.g., rcv_nxt) needs to be reset
carefully to stay in lock-step with ldc_read(). In the interest
of simplicity, the best way to do this is to return from 
vnet_event_napi() when there are no more packets to read.
The next trip through ldc_rx will correctly set up the dring state.

Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Tested-by: David Stevens <david.stevens@oracle.com>
---
 drivers/net/ethernet/sun/sunvnet.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/sun/sunvnet.c b/drivers/net/ethernet/sun/sunvnet.c
index 2688b19..5c5fb59 100644
--- a/drivers/net/ethernet/sun/sunvnet.c
+++ b/drivers/net/ethernet/sun/sunvnet.c
@@ -691,7 +691,6 @@ ldc_ctrl:
 			pkt->end_idx = -1;
 			goto napi_resume;
 		}
-ldc_read:
 		err = ldc_read(vio->lp, &msgbuf, sizeof(msgbuf));
 		if (unlikely(err < 0)) {
 			if (err == -ECONNRESET)
@@ -722,8 +721,8 @@ napi_resume:
 				err = vnet_rx(port, &msgbuf, &npkts, budget);
 				if (npkts >= budget)
 					break;
-				if (npkts == 0 && err != -ECONNRESET)
-					goto ldc_read;
+				if (npkts == 0)
+					break;
 			} else if (msgbuf.tag.stype == VIO_SUBTYPE_ACK) {
 				err = vnet_ack(port, &msgbuf);
 				if (err > 0)
-- 
1.8.4.2


* [PATCH 0/2 net-next] sunvnet: bug fixes
From: Sowmini Varadhan @ 2014-11-06 19:50 UTC (permalink / raw)
  To: davem, sowmini.varadhan, david.stevens, ben; +Cc: netdev


This patch series has a coding-style fix and a bug fix.

The coding style fix (patch 1) is the extra indentation flagged by
Ben Hutchings:
  http://marc.info/?l=linux-netdev&m=141529243409594&w=2

The bugfix (patch 2) is the following:
when vnet_event_napi() is called as part of napi_resume
(i.e., continuation of a previous NAPI read that was truncated
due to budget constraints), and then finds no more packets to read,
the code was trying to avoid an additional trip through ldc_rx
as an optimization. However, when this corner case happens, we would
need to reset a number of dring state bits such as rcv_nxt carefully,
which quickly becomes complex and hacky.  The cleaner solution
is to just roll back to vnet_poll, re-enable interrupts and set up
dring state as was done in the pre-NAPI version of the driver.
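
The control flow described above can be modeled roughly as follows. This
is a minimal, self-contained sketch and not driver code: toy_ring,
toy_rx() and toy_poll() are hypothetical names, and the real driver keeps
far more state. It only shows a poll routine that stops when a pass reads
fewer packets than the budget (including zero) and re-enables interrupts
rather than patching ring state in place.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a descriptor ring; not the sunvnet structures. */
struct toy_ring {
	int pending;       /* packets waiting to be read */
	bool irq_enabled;  /* true when the device may raise an event */
};

/* Read up to 'budget' packets; returns how many were consumed. */
static int toy_rx(struct toy_ring *r, int budget)
{
	int npkts = 0;

	while (r->pending > 0 && npkts < budget) {
		r->pending--;
		npkts++;
	}
	return npkts;
}

/*
 * NAPI-style poll: if fewer than 'budget' packets were processed
 * (including zero), stop polling and re-enable interrupts instead of
 * trying to re-read and fix up ring state in place.
 */
static int toy_poll(struct toy_ring *r, int budget)
{
	int processed;

	assert(budget > 0);
	r->irq_enabled = false;
	processed = toy_rx(r, budget);
	if (processed < budget)
		r->irq_enabled = true;
	return processed;
}
```

With exactly budget-many packets pending, the first toy_poll() call uses
the whole budget and leaves interrupts off; the second call finds nothing
and simply returns with interrupts back on, which is the corner case the
cover letter describes.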


Sowmini Varadhan (2):
  Fix indentation in maybe_tx_wakeup()
  Return from vnet_napi_event() if no packets to read

 drivers/net/ethernet/sun/sunvnet.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

-- 
1.8.4.2


* Re: [PATCH net-next 0/3] sfc: Clean up Siena SR-IOV support in preparation for EF10 SR-IOV support
From: David Miller @ 2014-11-06 19:43 UTC (permalink / raw)
  To: sshah; +Cc: netdev, linux-net-drivers
In-Reply-To: <545A14CD.6040809@solarflare.com>

From: Shradha Shah <sshah@solarflare.com>
Date: Wed, 5 Nov 2014 12:15:09 +0000

> This patch series provides a base and clean up for the upcoming
> EF10 SRIOV patches.

Series applied, thanks.


* Re: [PATCHv2 net-next] xen-netback: remove unconditional __pskb_pull_tail() in guest Tx path
From: David Miller @ 2014-11-06 19:40 UTC (permalink / raw)
  To: david.vrabel; +Cc: netdev, xen-devel, ian.campbell, wei.liu2, malcolm.crossley
In-Reply-To: <1415184622-19421-1-git-send-email-david.vrabel@citrix.com>

From: David Vrabel <david.vrabel@citrix.com>
Date: Wed, 5 Nov 2014 10:50:22 +0000

> From: Malcolm Crossley <malcolm.crossley@citrix.com>
> 
> Unconditionally pulling 128 bytes into the linear area is not required
> for:
> 
> - security: Every protocol demux starts with pskb_may_pull() to pull
>   frag data into the linear area, if necessary, before looking at
>   headers.
> 
> - performance: Netback has already grant copied up to 128 bytes from
>   the first slot of a packet into the linear area. The first slot
>   normally contains all the IPv4/IPv6 and TCP/UDP headers.
> 
> The unconditional pull would often copy frag data unnecessarily.  This
> is a performance problem when running on a version of Xen where grant
> unmap avoids TLB flushes for pages which are not accessed.  TLB
> flushes can now be avoided for > 99% of unmaps (it was 0% before).
> 
> Grant unmap TLB flush avoidance will be available in a future version
> of Xen (probably 4.6).
> 
> Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com>
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>

Applied, thanks.
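
The "pull only when needed" idea from the quoted commit message can be
sketched in plain C. This is an illustrative model only: toy_pkt and
toy_may_pull() are hypothetical and much simpler than the kernel's
sk_buff and pskb_may_pull(). It shows that a demux step asking for header
bytes that are already linear costs no copy, while an unconditional pull
would copy regardless.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_LINEAR_MAX 256

/* Hypothetical packet: a linear area plus one fragment of trailing data. */
struct toy_pkt {
	unsigned char linear[TOY_LINEAR_MAX];
	size_t linear_len;          /* bytes currently in the linear area */
	const unsigned char *frag;  /* remaining data outside the linear area */
	size_t frag_len;
	size_t bytes_copied;        /* bookkeeping: bytes moved by pulls */
};

/*
 * Make sure at least 'len' bytes are in the linear area, copying from the
 * fragment only when needed. Returns false if the packet is too short or
 * the request does not fit in the toy linear area.
 */
static bool toy_may_pull(struct toy_pkt *p, size_t len)
{
	size_t need;

	if (len <= p->linear_len)
		return true;            /* headers already linear: no copy */
	need = len - p->linear_len;
	if (need > p->frag_len || len > TOY_LINEAR_MAX)
		return false;
	memcpy(p->linear + p->linear_len, p->frag, need);
	p->linear_len += need;
	p->frag += need;
	p->frag_len -= need;
	p->bytes_copied += need;
	return true;
}
```

With 128 bytes already linear, asking for a 54-byte Ethernet + IPv4 + TCP
header prefix copies nothing; only a request beyond 128 bytes touches the
fragment.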


* Re: [PATCH v3 0/4] stmmac: pci: various cleanups and fixes
From: David Miller @ 2014-11-06 19:39 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: peppe.cavallaro, netdev, hock.leong.kweh, vbridgers2013
In-Reply-To: <1415183249-9231-1-git-send-email-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Wed,  5 Nov 2014 12:27:25 +0200

> There are a few cleanups and fixes for the stmmac PCI driver.
> This has been tested on an Intel Galileo board with a recent net-next tree.
> 
> Since v2:
> - drop patch 5/5 since it will be part of a big change across entire subsystem
> 
> Since v1:
> - remove already applied patch
> - append patch 1/5
> - rework patch 3/5 to be functionally compatible with the original code

These look fine, series applied to net-next, thanks.


* Re: [PATCH v2] stmmac: fix sparse warnings
From: David Miller @ 2014-11-06 19:35 UTC (permalink / raw)
  To: andriy.shevchenko; +Cc: peppe.cavallaro, netdev, vbridgers2013
In-Reply-To: <1415180732-8011-1-git-send-email-andriy.shevchenko@linux.intel.com>

From: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Date: Wed,  5 Nov 2014 11:45:32 +0200

> This patch fixes the following sparse warnings.
> 
> drivers/net/ethernet/stmicro/stmmac/enh_desc.c:381:30: warning: symbol 'enh_desc_ops' was not declared. Should it be static?
> drivers/net/ethernet/stmicro/stmmac/norm_desc.c:253:30: warning: symbol 'ndesc_ops' was not declared. Should it be static?
> drivers/net/ethernet/stmicro/stmmac/stmmac_hwtstamp.c:141:33: warning: symbol 'stmmac_ptp' was not declared. Should it be static?
> 
> There is no functional change.
> 
> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> Acked-by: Giuseppe CAVALLARO <peppe.cavallaro@st.com>
> ---
> Since v1:
> - redone as Giuseppe suggested

Applied, thanks.
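
For readers unfamiliar with this sparse warning: it is reported when a
non-static global is defined without a prior declaration in scope. The
thread does not say which remedy the patch used, so the following is only
a generic sketch of one common fix (declare the symbol before defining
it); toy_desc_ops and toy_get_len() are hypothetical names.

```c
#include <assert.h>

/* Normally this declaration would live in a shared header. */
struct toy_desc_ops {
	int (*get_len)(int raw);
};
extern const struct toy_desc_ops toy_desc_ops;

/* File-local helper: 'static' because nothing else needs it. */
static int toy_get_len(int raw)
{
	return raw & 0x7ff;
}

/* The definition now follows a declaration, so the symbol is "declared". */
const struct toy_desc_ops toy_desc_ops = {
	.get_len = toy_get_len,
};
```

The other usual remedy is to mark the symbol static when no other
translation unit refers to it.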


* Re: [PATCH RFC net] ip_tunnel: Respect the IP_DF bit of the inner packet.
From: David Miller @ 2014-11-06 19:33 UTC (permalink / raw)
  To: steffen.klassert; +Cc: netdev
In-Reply-To: <20141105080930.GE6390@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 5 Nov 2014 09:09:30 +0100

> The pmtu calculation depends on the IP_DF bit in tnl_update_pmtu().
> If the IP_DF bit is set, the pmtu calculation is based on the outer
> packet size. Otherwise it is based on the inner packet size.
> If xfrm is used after tunneling through an ipip device, the mtu of
> the outer device can be lower than the mtu of the ipip device.
> Reporting the mtu of the ipip device is wrong in this case. So
> respect the IP_DF bit of the inner packet on ipv4 to report the
> calculated mtu of the outer device.
> 
> Fixes: fd58156e456d ("IPIP: Use ip-tunneling code.")
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
> 
> I marked this as RFC because it affects the mtu calculation of
> gre tunnels too. I think it should be ok, but I have no testcase
> to confirm the correctness for gre tunnels. So it would be good if
> someone with gre knowledge could look at this.
> 
> If it turns out that we can't do that for gre, we need to
> split this code back into a gre and an ipip version.

Looking quickly at this, the don't-frag handling in the
pre-ip-tunneling GRE code conversion used different conditions
wrt. calculating 'df'.

It takes the frag off from skb->data's IPH when skb->protocol
is GRE, for example.

So we may have to do this split.
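
To make the two cases in the quoted description concrete, here is a
deliberately simplified sketch. It is not the tnl_update_pmtu() code:
toy_tunnel and toy_report_mtu() are hypothetical, and the real function
derives its inputs from the route, the device and the tunnel header
length. The sketch only encodes the stated rule that with DF set the
reported value follows the outer path, and otherwise the tunnel device's
own MTU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical inputs; the real code reads these from the route and device. */
struct toy_tunnel {
	int outer_path_mtu;  /* MTU of the path the encapsulated packet takes */
	int encap_overhead;  /* bytes added by the outer headers */
	int tunnel_dev_mtu;  /* MTU configured on the tunnel device */
};

/* Pick the MTU to report, depending on whether DF is honoured. */
static int toy_report_mtu(const struct toy_tunnel *t, bool df)
{
	if (df)
		return t->outer_path_mtu - t->encap_overhead;
	return t->tunnel_dev_mtu;
}
```

If the outer path is narrower than the tunnel device expects (for
example because of a further transformation such as xfrm), only the DF
branch reflects that.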


* Re: am335x: cpsw: phy ignores max-speed setting
From: Lennart Sorensen @ 2014-11-06 19:20 UTC (permalink / raw)
  To: Yegor Yefremov; +Cc: netdev, N, Mugunthan V, mpa, Daniel Mack
In-Reply-To: <CAGm1_ktWK5ai85PZJTkq8Q1mAFH6JZ5XM1mDOHO3K_N2iGNLWg@mail.gmail.com>

On Thu, Nov 06, 2014 at 05:25:13PM +0100, Yegor Yefremov wrote:
> I'm trying to override the max-speed setting for both CPSW-connected
> PHYs. This is my DTS section for configuring CPSW:
> 
> &mac {
>         pinctrl-names = "default", "sleep";
>         pinctrl-0 = <&cpsw_default>;
>         pinctrl-1 = <&cpsw_sleep>;
>         dual_emac = <1>;
> 
>         status = "okay";
> };
> 
> &davinci_mdio {
>         pinctrl-names = "default", "sleep";
>         pinctrl-0 = <&davinci_mdio_default>;
>         pinctrl-1 = <&davinci_mdio_sleep>;
> 
>         status = "okay";
> };
> 
> &cpsw_emac0 {
>         phy_id = <&davinci_mdio>, <0>;
>         phy-mode = "rgmii-id";
>         dual_emac_res_vlan = <1>;
>         max-speed = <100>;
> };
> 
> &cpsw_emac1 {
>         phy_id = <&davinci_mdio>, <1>;
>         phy-mode = "rgmii-id";
>         dual_emac_res_vlan = <2>;
>         max-speed = <100>;
> };
> 
> But in the of_set_phy_supported() routine in drivers/net/phy/phy_device.c
> I don't get past the node check, i.e. node == NULL. Any idea why?
> 
> static void of_set_phy_supported(struct phy_device *phydev)
> {
>         struct device_node *node = phydev->dev.of_node;
>         u32 max_speed;

Did you try adding a printk here to make sure it is actually called?

>         if (!IS_ENABLED(CONFIG_OF_MDIO))
>                 return;

Do you have CONFIG_OF_MDIO on?  I would think so.

>         if (!node)
>                 return;
> 
>         if (!of_property_read_u32(node, "max-speed", &max_speed)) {
>                 /* The default values for phydev->supported are provided by the PHY
>                  * driver "features" member, we want to reset to sane defaults first
>                  * before supporting higher speeds.
>                  */
>                 phydev->supported &= PHY_DEFAULT_FEATURES;
> 
>                 switch (max_speed) {
>                 default:
>                         return;
> 
>                 case SPEED_1000:
>                         phydev->supported |= PHY_1000BT_FEATURES;
>                 case SPEED_100:
>                         phydev->supported |= PHY_100BT_FEATURES;
>                 case SPEED_10:
>                         phydev->supported |= PHY_10BT_FEATURES;
>                 }
>         }
> }

-- 
Len Sorensen
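
As a side note on the quoted function, its switch relies on intentional
fall-through so that a higher max-speed also enables every lower speed.
A minimal standalone model of that pattern follows; the TOY_* bits and
toy_speed_features() are hypothetical and do not correspond to the
kernel's PHY_*_FEATURES masks. Unlike the quoted code, unknown values
here simply yield no speed bits.

```c
#include <assert.h>

/* Hypothetical feature bits; the kernel's PHY_*_FEATURES masks differ. */
#define TOY_10BT    0x1u
#define TOY_100BT   0x2u
#define TOY_1000BT  0x4u

/*
 * Return the speed features allowed for 'max_speed' (in Mbit/s).
 * Each case deliberately falls through so that a higher limit also
 * enables every lower speed. Unknown values yield no speed features.
 */
static unsigned int toy_speed_features(unsigned int max_speed)
{
	unsigned int supported = 0;

	switch (max_speed) {
	default:
		return 0;
	case 1000:
		supported |= TOY_1000BT;
		/* fall through */
	case 100:
		supported |= TOY_100BT;
		/* fall through */
	case 10:
		supported |= TOY_10BT;
	}
	return supported;
}
```

So a max-speed of 100 should leave the 10 and 100 Mbit/s bits set and the
1000 Mbit/s bit clear, provided the function is reached with a non-NULL
node.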


* Re: am335x: cpsw: phy ignores max-speed setting
From: Joe Perches @ 2014-11-06 19:19 UTC (permalink / raw)
  To: Dave Taht
  Cc: Yegor Yefremov, netdev, N, Mugunthan V, mpa, lsorense,
	Daniel Mack
In-Reply-To: <CAA93jw5=LDirktyC+rvpLi-kywUSosj6QV8-na5p3-f=PxKcWQ@mail.gmail.com>

On Thu, 2014-11-06 at 08:51 -0800, Dave Taht wrote:
> ooh! ooh! I have a BQL enablement patch for the cpsw that I have no
> means of testing against multiple phys. Could
> you give the attached very small patch a shot along the way?

One trivial bit and another possible patch below it

this
+       dev_info(priv->dev, "BQL enabled\n");
might be better as:
+	cpsw_info(priv, link, "BQL enabled\n");

Is this the change that matters most?

-#define CPSW_POLL_WEIGHT       64
+#define CPSW_POLL_WEIGHT       16

If so, maybe this could be limited by a sysctl:

Something like:

 Documentation/sysctl/net.txt | 9 +++++++++
 include/linux/netdevice.h    | 1 +
 net/core/dev.c               | 7 +++++++
 net/core/sysctl_net_core.c   | 7 +++++++
 4 files changed, 24 insertions(+)

diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
index 04892b8..1fe0ebd 100644
--- a/Documentation/sysctl/net.txt
+++ b/Documentation/sysctl/net.txt
@@ -50,6 +50,15 @@ The maximum number of packets that kernel can handle on a NAPI interrupt,
 it's a Per-CPU variable.
 Default: 64
 
+napi_add_weight_max
+-------------------
+
+Limit the maximum number of packets that a device can register in a
+call to netif_napi_add.  This is disabled by default so the value in the
+specific device call is used, but it may be useful in throughput and
+latency testing.
+Default: 0 (off)
+
 default_qdisc
 --------------
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 68fe8a0..31857de 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3380,6 +3380,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
 extern int		netdev_max_backlog;
 extern int		netdev_tstamp_prequeue;
 extern int		weight_p;
+extern int		sysctl_napi_add_weight_max;
 extern int		bpf_jit_enable;
 
 bool netdev_has_upper_dev(struct net_device *dev, struct net_device *upper_dev);
diff --git a/net/core/dev.c b/net/core/dev.c
index c934680..aa9bd8d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3016,6 +3016,7 @@ EXPORT_SYMBOL(netdev_max_backlog);
 int netdev_tstamp_prequeue __read_mostly = 1;
 int netdev_budget __read_mostly = 300;
 int weight_p __read_mostly = 64;            /* old backlog weight */
+int sysctl_napi_add_weight_max __read_mostly = 0; /* disabled by default */
 
 /* Called with irq disabled */
 static inline void ____napi_schedule(struct softnet_data *sd,
@@ -4506,6 +4507,12 @@ void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
 	if (weight > NAPI_POLL_WEIGHT)
 		pr_err_once("netif_napi_add() called with weight %d on device %s\n",
 			    weight, dev->name);
+	if (sysctl_napi_add_weight_max > 0 &&
+	    weight > sysctl_napi_add_weight_max) {
+		pr_notice("netif_napi_add() requested weight %d reduced to sysctl napi_add_weight_max limit %d on device %s\n",
+			  weight, sysctl_napi_add_weight_max, dev->name);
+		weight = sysctl_napi_add_weight_max;
+	}
 	napi->weight = weight;
 	list_add(&napi->dev_list, &dev->napi_list);
 	napi->dev = dev;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cf9cd13..c90e524 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -257,6 +257,13 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler	= proc_dointvec
 	},
 	{
+		.procname	= "napi_add_weight_max",
+		.data		= &sysctl_napi_add_weight_max,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec
+	},
+	{
 		.procname	= "netdev_max_backlog",
 		.data		= &netdev_max_backlog,
 		.maxlen		= sizeof(int),
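
The clamping rule in the netif_napi_add() hunk above reduces to a small
pure function. The sketch below restates it outside the kernel so the
"0 means disabled" behaviour is easy to check; toy_clamp_weight() is a
hypothetical name and is not part of the proposed patch.

```c
#include <assert.h>

/*
 * Clamp a requested NAPI weight to an administrator limit.
 * A limit of 0 (or less) means "disabled": the request is used as-is.
 */
static int toy_clamp_weight(int weight, int weight_max)
{
	if (weight_max > 0 && weight > weight_max)
		return weight_max;
	return weight;
}
```

For example, a driver asking for 64 with the limit set to 16 would get
16, while the default limit of 0 leaves the driver's value untouched.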


* Re: [PATCH net-next 2/2] ip6_tunnel: Add support for wildcard tunnel endpoints.
From: David Miller @ 2014-11-06 19:19 UTC (permalink / raw)
  To: steffen.klassert; +Cc: netdev
In-Reply-To: <20141105070350.GD6390@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 5 Nov 2014 08:03:50 +0100

> This patch adds support for tunnels with local or
> remote wildcard endpoints. With this we get a
> NBMA tunnel mode like we have it for ipv4 and
> sit tunnels.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Applied.


* Re: [PATCH net-next 1/2] ipv6: Allow sending packets through tunnels with wildcard endpoints
From: David Miller @ 2014-11-06 19:19 UTC (permalink / raw)
  To: steffen.klassert; +Cc: netdev
In-Reply-To: <20141105070248.GC6390@secunet.com>

From: Steffen Klassert <steffen.klassert@secunet.com>
Date: Wed, 5 Nov 2014 08:02:48 +0100

> Currently we need the IP6_TNL_F_CAP_XMIT capability to transmit
> packets through an ipv6 tunnel. This capability is set when the
> tunnel gets configured, based on the tunnel endpoint addresses.
> 
> On tunnels with wildcard tunnel endpoints, we need to do the
> capability checking on a per packet basis like it is done in
> the receive path.
> 
> This patch extends ip6_tnl_xmit_ctl() to take local and remote
> addresses as parameters to allow for per packet capability
> checking.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

Applied.


* Re: [PATCH 00/13] net_sched: misc cleanups and improvements
From: David Miller @ 2014-11-06 19:03 UTC (permalink / raw)
  To: eric.dumazet; +Cc: xiyou.wangcong, netdev, jhs
In-Reply-To: <1415297873.13896.77.camel@edumazet-glaptop2.roam.corp.google.com>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 06 Nov 2014 10:17:53 -0800

> There is a difference between newbies and you.
> 
> As a community, we welcome newcomers and encourage them,
> but after a while, people sending mostly cleanups shift into a
> category which doesn't fit you.
> 
> We expect from you more interesting stuff. You can do it.

+1

> If you want to send cleanups, do this once in a while. Do not send 13
> patches and expect us to be happy with that. We are not.

+1


* Re: [PATCH net] dcbnl : Fix lock initialization
From: John Fastabend @ 2014-11-06 19:03 UTC (permalink / raw)
  To: Anish Bhatt
  Cc: netdev, davem, john.r.fastabend, ying.xue, jeffrey.t.kirsher,
	ebiederm
In-Reply-To: <1415297355-27282-1-git-send-email-anish@chelsio.com>

On 11/06/2014 10:09 AM, Anish Bhatt wrote:
> dcb_lock was being used uninitialized in dcbnl and is in fact missing
> initialization code. Fixed.
>

Are you trying to resolve a bug? It is initialized with

static DEFINE_SPINLOCK(dcb_lock);

and if you follow the code far enough you get to this in
spinlock_types.h:


  #ifdef CONFIG_DEBUG_SPINLOCK
  # define SPIN_DEBUG_INIT(lockname)      \
      .magic = SPINLOCK_MAGIC,        \
      .owner_cpu = -1,            \
      .owner = SPINLOCK_OWNER_INIT,
  #else
  # define SPIN_DEBUG_INIT(lockname)
  #endif

  #define __RAW_SPIN_LOCK_INITIALIZER(lockname)   \
      {                   \
      .raw_lock = __ARCH_SPIN_LOCK_UNLOCKED,  \
      SPIN_DEBUG_INIT(lockname)       \
      SPIN_DEP_MAP_INIT(lockname) }

[...]



-- 
John Fastabend         Intel Corporation
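
The point about static initialization can be illustrated outside the
kernel. The sketch below mimics the shape of a DEFINE_*-style macro with
a designated initializer; toy_lock, TOY_LOCK_MAGIC and DEFINE_TOY_LOCK
are hypothetical and are not the kernel's definitions. It only shows that
an object defined this way is fully initialized at compile time, with no
runtime init call required.

```c
#include <assert.h>

#define TOY_LOCK_MAGIC 0x5ca1ab1eu  /* arbitrary debug marker, not the kernel's */

struct toy_lock {
	int locked;
	unsigned int magic;
	int owner_cpu;
};

/* Compile-time initializer, analogous in spirit to a static lock initializer. */
#define TOY_LOCK_INITIALIZER \
	{ .locked = 0, .magic = TOY_LOCK_MAGIC, .owner_cpu = -1 }

/* Define a lock that is fully initialized before any code runs. */
#define DEFINE_TOY_LOCK(name) struct toy_lock name = TOY_LOCK_INITIALIZER

static DEFINE_TOY_LOCK(toy_dcb_lock);
```

Because the initializer is part of the definition, adding a separate
runtime initialization for such an object would be redundant.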


* Re: [PATCH 00/13] net_sched: misc cleanups and improvements
From: David Miller @ 2014-11-06 19:02 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: eric.dumazet, netdev, jhs
In-Reply-To: <CAM_iQpWw4UMKZcdZfpp5D-tDfj954fbptyXJUzydgFCero6xNw@mail.gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Thu, 6 Nov 2014 10:05:41 -0800

> On Tue, Nov 4, 2014 at 5:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Tue, 2014-11-04 at 17:25 -0800, Cong Wang wrote:
>>
>>> Seriously, think about why it should when it's just cleanups, be practical.
>>
>> I seriously ask you to not do cleanups then.
> 
> Apparently you didn't say this when the following commits got accepted:

I very strongly encourage you to not go arguing down this road.

The issues with your submissions are the amount of churn as well as the
terseness of explanations.


* [PATCH] man: ip-link: fix a typo
From: Masatake YAMATO @ 2014-11-06 18:57 UTC (permalink / raw)
  To: netdev; +Cc: yamato

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
---
 man/man8/ip-link.8.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/man/man8/ip-link.8.in b/man/man8/ip-link.8.in
index 4ee1d62..6d32f5e 100644
--- a/man/man8/ip-link.8.in
+++ b/man/man8/ip-link.8.in
@@ -54,7 +54,7 @@ ip-link \- network device configuration
 .ti -8
 .IR TYPE " := [ "
 .BR bridge " | "
-.BR bond " ]"
+.BR bond " | "
 .BR can " | "
 .BR dummy " | "
 .BR hsr " | "
-- 
1.9.3


* Re: [patch net-next 07/10] bridge: call netdev_sw_port_stp_update when bridge port STP status changes
From: Scott Feldman @ 2014-11-06 18:57 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Jiri Pirko, netdev, davem, nhorman, andy, tgraf, dborkman,
	ogerlitz, jesse, pshelar, azhou, ben, stephen, jeffrey.t.kirsher,
	vyasevic, xiyou.wangcong, john.r.fastabend, edumazet, jhs,
	sfeldma, roopa, linville, jasowang, ebiederm, nicolas.dichtel,
	ryazanov.s.a, buytenh, aviadr, nbd, alexei.starovoitov,
	Neil.Jerram, ronye, simon.horman, alexander.h.duyck, john.ronciak,
	mleitner, shrijeet, gospo
In-Reply-To: <545BA8EF.4060601@gmail.com>



On Thu, 6 Nov 2014, Florian Fainelli wrote:

> On 11/06/2014 01:20 AM, Jiri Pirko wrote:
>> From: Scott Feldman <sfeldma@gmail.com>
>>
>> To notify switch driver of change in STP state of bridge port, add new
>> .ndo op and provide swdev wrapper func to call ndo op. Use it in bridge
>> code then.
>>
>> Signed-off-by: Scott Feldman <sfeldma@gmail.com>
>> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
>> ---
>
> [snip]
>
>>  #endif /* _LINUX_SWITCHDEV_H_ */
>> diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
>> index 86c239b..13fecf1 100644
>> --- a/net/bridge/br_netlink.c
>> +++ b/net/bridge/br_netlink.c
>> @@ -17,6 +17,7 @@
>>  #include <net/net_namespace.h>
>>  #include <net/sock.h>
>>  #include <uapi/linux/if_bridge.h>
>> +#include <net/switchdev.h>
>>
>>  #include "br_private.h"
>>  #include "br_private_stp.h"
>> @@ -304,6 +305,7 @@ static int br_set_port_state(struct net_bridge_port *p, u8 state)
>>
>>  	br_set_state(p, state);
>>  	br_log_state(p);
>> +	netdev_sw_port_stp_update(p->dev, p->state);
>
> Is there a reason netdev_sw_port_stp_update() is not folded in
> br_set_state()? Are we missing calls to br_set_state() in some locations?

I put the netdev_sw call at the same level as br_log_state() and 
br_ifinfo_notify(), but now that you bring up the question, I agree it 
would be cleaner/safer if netdev_sw call was from br_set_state().
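
A small standalone model of the "fold it into the setter" option, for
illustration only: toy_port, toy_hw_stp_update() and toy_set_state() are
hypothetical stand-ins, not the bridge code. The idea is that when the
notification lives inside the setter, no call site can change the state
without the switch driver hearing about it.

```c
#include <assert.h>

/* Hypothetical port with a counter standing in for the driver callback. */
struct toy_port {
	int state;
	int hw_updates;   /* how many times the "switch driver" was told */
	int hw_state;     /* last state pushed to the "switch driver" */
};

/* Stand-in for notifying the switch driver of a new STP state. */
static void toy_hw_stp_update(struct toy_port *p, int state)
{
	p->hw_state = state;
	p->hw_updates++;
}

/*
 * Setter with the notification folded in: every caller that changes the
 * state also updates the hardware, so no call site can forget to do so.
 */
static void toy_set_state(struct toy_port *p, int state)
{
	p->state = state;
	toy_hw_stp_update(p, state);
}
```

The trade-off is that the setter then always incurs the driver call, even
for callers that previously did not notify.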

> --
> Florian
>


* Re: [PATCH net 3/5] fm10k: Implement ndo_gso_check()
From: Joe Stringer @ 2014-11-06 18:41 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: netdev, sathya.perla, jeffrey.t.kirsher, linux.nics, amirv,
	shahed.shaikh, Dept-GELinuxNICDev, therbert, linux-kernel
In-Reply-To: <545AE2C8.3070705@gmail.com>

On Wed, Nov 05, 2014 at 06:54:00PM -0800, Alexander Duyck wrote:
> On 11/04/2014 01:56 PM, Joe Stringer wrote:
> > ndo_gso_check() was recently introduced to allow NICs to report the
> > offloading support that they have on a per-skb basis. Add an
> > implementation for this driver which checks for something that looks
> > like VXLAN.
> >
> > Implementation shamelessly stolen from Tom Herbert:
> > http://thread.gmane.org/gmane.linux.network/332428/focus=333111
> >
> > Signed-off-by: Joe Stringer <joestringer@nicira.com>
> > ---
> > Should this driver report support for GSO on packets with tunnel headers
> > up to 64B like the i40e driver does?
> > ---
> >  drivers/net/ethernet/intel/fm10k/fm10k_netdev.c |   12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
> > index 8811364..b9ef622 100644
> > --- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
> > +++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c
> > @@ -1350,6 +1350,17 @@ static void fm10k_dfwd_del_station(struct net_device *dev, void *priv)
> >  	}
> >  }
> >  
> > +static bool fm10k_gso_check(struct sk_buff *skb, struct net_device *dev)
> > +{
> > +	if ((skb_shinfo(skb)->gso_type & SKB_GSO_UDP_TUNNEL) &&
> > +	    (skb->inner_protocol_type != ENCAP_TYPE_ETHER ||
> > +	     skb->inner_protocol != htons(ETH_P_TEB) ||
> > +	     skb_inner_mac_header(skb) - skb_transport_header(skb) != 16))
> > +		return false;
> > +
> > +	return true;
> > +}
> > +
> >  static const struct net_device_ops fm10k_netdev_ops = {
> >  	.ndo_open		= fm10k_open,
> >  	.ndo_stop		= fm10k_close,
> > @@ -1372,6 +1383,7 @@ static const struct net_device_ops fm10k_netdev_ops = {
> >  	.ndo_do_ioctl		= fm10k_ioctl,
> >  	.ndo_dfwd_add_station	= fm10k_dfwd_add_station,
> >  	.ndo_dfwd_del_station	= fm10k_dfwd_del_station,
> > +	.ndo_gso_check		= fm10k_gso_check,
> >  };
> >  
> >  #define DEFAULT_DEBUG_LEVEL_SHIFT 3
> 
> I'm thinking this check is far too simplistic.  If you look, the fm10k
> driver already has fm10k_tx_encap_offload() in the TSO function for
> verifying if it can support offloading tunnels or not.  I would
> recommend starting there or possibly even just adapting that function to
> suit your purpose.
> 
> Thanks,
> 
> Alex

Would it be enough to just call fm10k_tx_encap_offload() in a way that echoes fm10k_tso()?

+static bool fm10k_gso_check(struct sk_buff *skb, struct net_device *dev)
+{
+       if (skb->encapsulation && !fm10k_tx_encap_offload(skb))
+               return false;
+
+       return true;
+}

Thanks,
Joe
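
For context on the constant 16 in the quoted first version of the check:
it equals an 8-byte UDP header plus an 8-byte VXLAN header, i.e. the gap
between the outer transport header and the inner MAC header for a
VXLAN-encapsulated frame. The following standalone sketch expresses that
test on plain byte offsets; toy_looks_like_vxlan() and the TOY_* names
are hypothetical and this is not the driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_UDP_HLEN   8  /* UDP header length in bytes */
#define TOY_VXLAN_HLEN 8  /* VXLAN header length in bytes */

/*
 * Given byte offsets of the outer transport (UDP) header and the inner
 * MAC header, report whether the gap matches UDP + VXLAN exactly.
 */
static bool toy_looks_like_vxlan(size_t transport_off, size_t inner_mac_off)
{
	if (inner_mac_off < transport_off)
		return false;
	return inner_mac_off - transport_off == TOY_UDP_HLEN + TOY_VXLAN_HLEN;
}
```

A tunnel format with a longer or shorter header between UDP and the inner
Ethernet frame would fail this test, which is why a length-only check is
a coarse filter.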


* [PATCH v4 net-next] udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts
From: Rick Jones @ 2014-11-06 18:37 UTC (permalink / raw)
  To: netdev; +Cc: davem


From: Rick Jones <rick.jones2@hp.com>

As NIC multicast filtering isn't perfect, and some platforms are
quite content to spew broadcasts, we should not trigger an event
for skb:kfree_skb when we do not have a match for such an incoming
datagram.  We do though want to avoid sweeping the matter under the
rug entirely, so increment a suitable statistic.

This incorporates feedback from David L. Stevens, Karl Neiss and Eric
Dumazet.

V3 - use bool per David Miller

Signed-off-by: Rick Jones <rick.jones2@hp.com>

---

Noticed __udp4_lib_mcast_deliver showing up in a perf dropped packet
profile on a system sitting on a network with a bunch of Windows boxes
sending what they are fond of sending.

Verified that the new UDP_MIB_IGNOREDMULTI increments when ignored
datagrams are encountered, but was unable to cross the i's and dot
the t's of perf because the perf built from the tree at the time
wasn't happy in general.  Also hit a test system with some netperf
multicast UDP_STREAM and UDP_RR testing but that is the extent of 
the testing performed.
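
The counting rule this patch introduces can be modeled in a few lines of
standalone C. This is a simplified sketch, not the kernel function:
toy_stats, toy_mcast_deliver() and TOY_STACK_SIZE are hypothetical, and
socket lookup is reduced to a match count. It shows why the flag is
needed: a datagram counts as ignored only if neither an intermediate
flush nor the final flush delivered anything.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_STACK_SIZE 4  /* small on purpose; the kernel's stack is larger */

struct toy_stats {
	int delivered;      /* copies handed to matching sockets */
	int ignored_multi;  /* datagrams that matched no socket at all */
};

/*
 * Deliver one multicast datagram to 'nmatches' sockets, batching through
 * a fixed-size stack. The counter is bumped only if nothing was delivered,
 * either in an intermediate flush or in the final one.
 */
static void toy_mcast_deliver(struct toy_stats *st, int nmatches)
{
	int count = 0;
	bool inner_flushed = false;
	int i;

	for (i = 0; i < nmatches; i++) {
		if (count == TOY_STACK_SIZE) {
			st->delivered += count;   /* intermediate flush */
			inner_flushed = true;
			count = 0;
		}
		count++;
	}
	if (count)
		st->delivered += count;           /* final flush */
	else if (!inner_flushed)
		st->ignored_multi++;
}
```

A datagram with zero matching sockets increments the ignored counter
once; any datagram with at least one match leaves it unchanged.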

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index df40137..30f541b 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -156,6 +156,7 @@ enum
 	UDP_MIB_RCVBUFERRORS,			/* RcvbufErrors */
 	UDP_MIB_SNDBUFERRORS,			/* SndbufErrors */
 	UDP_MIB_CSUMERRORS,			/* InCsumErrors */
+	UDP_MIB_IGNOREDMULTI,			/* IgnoredMulti */
 	__UDP_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 8e3eb39..5c5450c 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -181,6 +181,7 @@ static const struct snmp_mib snmp4_udp_list[] = {
 	SNMP_MIB_ITEM("RcvbufErrors", UDP_MIB_RCVBUFERRORS),
 	SNMP_MIB_ITEM("SndbufErrors", UDP_MIB_SNDBUFERRORS),
 	SNMP_MIB_ITEM("InCsumErrors", UDP_MIB_CSUMERRORS),
+	SNMP_MIB_ITEM("IgnoredMulti", UDP_MIB_IGNOREDMULTI),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd0db54..ebee9af 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1647,7 +1647,8 @@ static void udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst)
 static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 				    struct udphdr  *uh,
 				    __be32 saddr, __be32 daddr,
-				    struct udp_table *udptable)
+				    struct udp_table *udptable,
+				    int proto)
 {
 	struct sock *sk, *stack[256 / sizeof(struct sock *)];
 	struct hlist_nulls_node *node;
@@ -1656,6 +1657,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	int dif = skb->dev->ifindex;
 	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
 	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
+	bool inner_flushed = false;
 
 	if (use_hash2) {
 		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
@@ -1674,6 +1676,7 @@ start_lookup:
 					dif, hnum)) {
 			if (unlikely(count == ARRAY_SIZE(stack))) {
 				flush_stack(stack, count, skb, ~0);
+				inner_flushed = true;
 				count = 0;
 			}
 			stack[count++] = sk;
@@ -1695,7 +1698,10 @@ start_lookup:
 	if (count) {
 		flush_stack(stack, count, skb, count - 1);
 	} else {
-		kfree_skb(skb);
+		if (!inner_flushed)
+			UDP_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI,
+					 proto == IPPROTO_UDPLITE);
+		consume_skb(skb);
 	}
 	return 0;
 }
@@ -1780,7 +1786,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	} else {
 		if (rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST))
 			return __udp4_lib_mcast_deliver(net, skb, uh,
-					saddr, daddr, udptable);
+					saddr, daddr, udptable, proto);
 
 		sk = __udp4_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
 	}
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 1752cd0..679253d0 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -136,6 +136,7 @@ static const struct snmp_mib snmp6_udp6_list[] = {
 	SNMP_MIB_ITEM("Udp6RcvbufErrors", UDP_MIB_RCVBUFERRORS),
 	SNMP_MIB_ITEM("Udp6SndbufErrors", UDP_MIB_SNDBUFERRORS),
 	SNMP_MIB_ITEM("Udp6InCsumErrors", UDP_MIB_CSUMERRORS),
+	SNMP_MIB_ITEM("Udp6IgnoredMulti", UDP_MIB_IGNOREDMULTI),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f6ba535..5bee6d2 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -771,7 +771,7 @@ static void udp6_csum_zero_error(struct sk_buff *skb)
  */
 static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
-		struct udp_table *udptable)
+		struct udp_table *udptable, int proto)
 {
 	struct sock *sk, *stack[256 / sizeof(struct sock *)];
 	const struct udphdr *uh = udp_hdr(skb);
@@ -781,6 +781,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	int dif = inet6_iif(skb);
 	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
 	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
+	bool inner_flushed = false;
 
 	if (use_hash2) {
 		hash2_any = udp6_portaddr_hash(net, &in6addr_any, hnum) &
@@ -803,6 +804,7 @@ start_lookup:
 		    (uh->check || udp_sk(sk)->no_check6_rx)) {
 			if (unlikely(count == ARRAY_SIZE(stack))) {
 				flush_stack(stack, count, skb, ~0);
+				inner_flushed = true;
 				count = 0;
 			}
 			stack[count++] = sk;
@@ -821,7 +823,10 @@ start_lookup:
 	if (count) {
 		flush_stack(stack, count, skb, count - 1);
 	} else {
-		kfree_skb(skb);
+		if (!inner_flushed)
+			UDP_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI,
+					 proto == IPPROTO_UDPLITE);
+		consume_skb(skb);
 	}
 	return 0;
 }
@@ -873,7 +878,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	 */
 	if (ipv6_addr_is_multicast(daddr))
 		return __udp6_lib_mcast_deliver(net, skb,
-				saddr, daddr, udptable);
+				saddr, daddr, udptable, proto);
 
 	/* Unicast */
 


* [PATCH v3 net-next] udp: Increment UDP_MIB_IGNOREDMULTI for arriving unmatched multicasts
From: Rick Jones @ 2014-11-06 18:36 UTC (permalink / raw)
  To: netdev; +Cc: davem


From: Rick Jones <rick.jones2@hp.com>

As NIC multicast filtering isn't perfect, and some platforms are
quite content to spew broadcasts, we should not trigger an event
for skb:kfree_skb when we do not have a match for such an incoming
datagram.  We do though want to avoid sweeping the matter under the
rug entirely, so increment a suitable statistic.

This incorporates feedback from David L. Stevens, Karl Neiss and Eric
Dumazet.

Signed-off-by: Rick Jones <rick.jones2@hp.com>

---

Noticed __udp4_lib_mcast_deliver showing up in a perf dropped packet
profile on a system sitting on a network with a bunch of Windows boxes
sending what they are fond of sending.

Verified that the new UDP_MIB_IGNOREDMULTI increments when ignored
datagrams are encountered, but was unable to cross the i's and dot
the t's of perf because the perf built from the tree at the time
wasn't happy in general.  Also hit a test system with some netperf
multicast UDP_STREAM and UDP_RR testing but that is the extent of 
the testing performed.

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index df40137..30f541b 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -156,6 +156,7 @@ enum
 	UDP_MIB_RCVBUFERRORS,			/* RcvbufErrors */
 	UDP_MIB_SNDBUFERRORS,			/* SndbufErrors */
 	UDP_MIB_CSUMERRORS,			/* InCsumErrors */
+	UDP_MIB_IGNOREDMULTI,			/* IgnoredMulti */
 	__UDP_MIB_MAX
 };
 
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index 8e3eb39..5c5450c 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -181,6 +181,7 @@ static const struct snmp_mib snmp4_udp_list[] = {
 	SNMP_MIB_ITEM("RcvbufErrors", UDP_MIB_RCVBUFERRORS),
 	SNMP_MIB_ITEM("SndbufErrors", UDP_MIB_SNDBUFERRORS),
 	SNMP_MIB_ITEM("InCsumErrors", UDP_MIB_CSUMERRORS),
+	SNMP_MIB_ITEM("IgnoredMulti", UDP_MIB_IGNOREDMULTI),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd0db54..1215f89 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1647,7 +1647,8 @@ static void udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst)
 static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 				    struct udphdr  *uh,
 				    __be32 saddr, __be32 daddr,
-				    struct udp_table *udptable)
+				    struct udp_table *udptable,
+				    int proto)
 {
 	struct sock *sk, *stack[256 / sizeof(struct sock *)];
 	struct hlist_nulls_node *node;
@@ -1656,6 +1657,7 @@ static int __udp4_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	int dif = skb->dev->ifindex;
 	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
 	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
+	unsigned int inner_flushed = 0;
 
 	if (use_hash2) {
 		hash2_any = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum) &
@@ -1674,6 +1676,7 @@ start_lookup:
 					dif, hnum)) {
 			if (unlikely(count == ARRAY_SIZE(stack))) {
 				flush_stack(stack, count, skb, ~0);
+				inner_flushed = 1;
 				count = 0;
 			}
 			stack[count++] = sk;
@@ -1695,7 +1698,10 @@ start_lookup:
 	if (count) {
 		flush_stack(stack, count, skb, count - 1);
 	} else {
-		kfree_skb(skb);
+		if (!inner_flushed)
+			UDP_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI,
+					 proto == IPPROTO_UDPLITE);
+		consume_skb(skb);
 	}
 	return 0;
 }
@@ -1780,7 +1786,7 @@ int __udp4_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	} else {
 		if (rt->rt_flags & (RTCF_BROADCAST|RTCF_MULTICAST))
 			return __udp4_lib_mcast_deliver(net, skb, uh,
-					saddr, daddr, udptable);
+					saddr, daddr, udptable, proto);
 
 		sk = __udp4_lib_lookup_skb(skb, uh->source, uh->dest, udptable);
 	}
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index 1752cd0..679253d0 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -136,6 +136,7 @@ static const struct snmp_mib snmp6_udp6_list[] = {
 	SNMP_MIB_ITEM("Udp6RcvbufErrors", UDP_MIB_RCVBUFERRORS),
 	SNMP_MIB_ITEM("Udp6SndbufErrors", UDP_MIB_SNDBUFERRORS),
 	SNMP_MIB_ITEM("Udp6InCsumErrors", UDP_MIB_CSUMERRORS),
+	SNMP_MIB_ITEM("Udp6IgnoredMulti", UDP_MIB_IGNOREDMULTI),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f6ba535..d80f21e 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -771,7 +771,7 @@ static void udp6_csum_zero_error(struct sk_buff *skb)
  */
 static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 		const struct in6_addr *saddr, const struct in6_addr *daddr,
-		struct udp_table *udptable)
+		struct udp_table *udptable, int proto)
 {
 	struct sock *sk, *stack[256 / sizeof(struct sock *)];
 	const struct udphdr *uh = udp_hdr(skb);
@@ -781,6 +781,7 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	int dif = inet6_iif(skb);
 	unsigned int count = 0, offset = offsetof(typeof(*sk), sk_nulls_node);
 	unsigned int hash2 = 0, hash2_any = 0, use_hash2 = (hslot->count > 10);
+	int inner_flushed = 0;
 
 	if (use_hash2) {
 		hash2_any = udp6_portaddr_hash(net, &in6addr_any, hnum) &
@@ -803,6 +804,7 @@ start_lookup:
 		    (uh->check || udp_sk(sk)->no_check6_rx)) {
 			if (unlikely(count == ARRAY_SIZE(stack))) {
 				flush_stack(stack, count, skb, ~0);
+				inner_flushed = 1;
 				count = 0;
 			}
 			stack[count++] = sk;
@@ -821,7 +823,10 @@ start_lookup:
 	if (count) {
 		flush_stack(stack, count, skb, count - 1);
 	} else {
-		kfree_skb(skb);
+		if (!inner_flushed)
+			UDP_INC_STATS_BH(net, UDP_MIB_IGNOREDMULTI,
+					 proto == IPPROTO_UDPLITE);
+		consume_skb(skb);
 	}
 	return 0;
 }
@@ -873,7 +878,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 	 */
 	if (ipv6_addr_is_multicast(daddr))
 		return __udp6_lib_mcast_deliver(net, skb,
-				saddr, daddr, udptable);
+				saddr, daddr, udptable, proto);
 
 	/* Unicast */
 


* Re: [PATCH 00/13] net_sched: misc cleanups and improvements
From: Eric Dumazet @ 2014-11-06 18:21 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <CAM_iQpWw4UMKZcdZfpp5D-tDfj954fbptyXJUzydgFCero6xNw@mail.gmail.com>

On Thu, 2014-11-06 at 10:05 -0800, Cong Wang wrote:

> 
> Who works on what? Does he/she at least announce it on netdev?
> (If you meant John, I already waited for his RCU work in the last
> merge window;
> I assumed his work was almost done and therefore sent this patchset.)
> 
> Since when has it been a rule that we should yield to something not merged,
> not even announced? If so, why not add it to netdev-FAQ?

You really don't get it. You can't understand how it really works.

Most probably I am one of the contributors, and my work depends on the
knowledge I got from studying the code. If you constantly change it, my
knowledge is reduced to useless bits.

Clearly you have to understand how _other_ people work, not assume
everybody is as smart as you are.


* Re: [PATCH 00/13] net_sched: misc cleanups and improvements
From: Eric Dumazet @ 2014-11-06 18:17 UTC (permalink / raw)
  To: Cong Wang; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <CAM_iQpWw4UMKZcdZfpp5D-tDfj954fbptyXJUzydgFCero6xNw@mail.gmail.com>

On Thu, 2014-11-06 at 10:05 -0800, Cong Wang wrote:
> On Tue, Nov 4, 2014 at 5:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2014-11-04 at 17:25 -0800, Cong Wang wrote:
> >
> >> Seriously, think about why it should when it's just cleanup's, be practical.
> >
> > I seriously ask you to not do cleanups then.
> 
> Apparently you didn't say this when the following commits got accepted:

There is a difference between newbies and you.

As a community, we welcome newcomers and encourage them,
but after a while, people sending mostly cleanups shift into a
category which doesn't fit you.

We expect more interesting stuff from you. You can do it.

I understand you want to fully rewrite net/sched to your ideas of
how the code _should_ be.

Doing so forces other people already knowing all this code to spend time
to understand how things changed. And this is really not nice.

If you want to send cleanups, do this once in a while. Do not send 13
patches and expect us to be happy with that. We are not.


* [PATCH net] dcbnl : Fix lock initialization
From: Anish Bhatt @ 2014-11-06 18:09 UTC (permalink / raw)
  To: netdev
  Cc: davem, john.r.fastabend, ying.xue, jeffrey.t.kirsher, ebiederm,
	Anish Bhatt

dcb_lock was being used uninitialized in dcbnl and is in fact missing
its initialization code. Add the missing spin_lock_init() call.

Signed-off-by: Anish Bhatt <anish@chelsio.com>
---
 net/dcb/dcbnl.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index ca11d28..7bc44e1 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -1914,6 +1914,8 @@ static int __init dcbnl_init(void)
 {
 	INIT_LIST_HEAD(&dcb_app_list);
 
+	spin_lock_init(&dcb_lock);
+
 	rtnl_register(PF_UNSPEC, RTM_GETDCB, dcb_doit, NULL, NULL);
 	rtnl_register(PF_UNSPEC, RTM_SETDCB, dcb_doit, NULL, NULL);
 
-- 
2.1.3


* Re: [PATCH 00/13] net_sched: misc cleanups and improvements
From: Cong Wang @ 2014-11-06 18:05 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Linux Kernel Network Developers, Jamal Hadi Salim
In-Reply-To: <1415152068.1458.2.camel@edumazet-glaptop2.roam.corp.google.com>

On Tue, Nov 4, 2014 at 5:47 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2014-11-04 at 17:25 -0800, Cong Wang wrote:
>
>> Seriously, think about why it should when it's just cleanup's, be practical.
>
> I seriously ask you to not do cleanups then.

Apparently you didn't say this when the following commits got accepted:

commit 436f7c206860729d543a457aca5887e52039a5f4
Author: Fabian Frederick <fabf@skynet.be>
Date:   Tue Nov 4 20:52:14 2014 +0100

    igmp: remove camel case definitions

    use standard uppercase for definitions

    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

commit c18450a52a10a5c4cea3dc426c40447a7152290f
Author: Fabian Frederick <fabf@skynet.be>
Date:   Tue Nov 4 20:48:41 2014 +0100

    udp: remove else after return

commit aa1f731e52807077e9e13a86c0cad12d442c8fd4
Author: Fabian Frederick <fabf@skynet.be>
Date:   Tue Nov 4 20:44:04 2014 +0100

    inet: frags: remove inline on static in c file

    remove __inline__ / inline and let compiler decide what to do
    with static functions

    Inspired-by: "David S. Miller" <davem@davemloft.net>
    Signed-off-by: Fabian Frederick <fabf@skynet.be>
    Signed-off-by: David S. Miller <davem@davemloft.net>

>
> Some people are working adding real stuff here, this code changing every
> month is slowing them a lot.
>

Who works on what? Does he/she at least announce it on netdev?
(If you meant John, I already waited for his RCU work in the last
merge window;
I assumed his work was almost done and therefore sent this patchset.)

Since when has it been a rule that we should yield to something not merged,
not even announced? If so, why not add it to netdev-FAQ?


* Fw: [Bug 87701] New: hard cpu lockup during pppd initialization of vpn
From: Stephen Hemminger @ 2014-11-04 18:09 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Tue, 4 Nov 2014 08:35:39 -0800
From: "bugzilla-daemon@bugzilla.kernel.org" <bugzilla-daemon@bugzilla.kernel.org>
To: "stephen@networkplumber.org" <stephen@networkplumber.org>
Subject: [Bug 87701] New: hard cpu lockup during pppd initialization of vpn


https://bugzilla.kernel.org/show_bug.cgi?id=87701

            Bug ID: 87701
           Summary: hard cpu lockup during pppd initialization of vpn
           Product: Networking
           Version: 2.5
    Kernel Version: 3.18.0-rc3
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: blocking
          Priority: P1
         Component: Other
          Assignee: shemminger@linux-foundation.org
          Reporter: richcoe2@gmail.com
        Regression: No

I did not experience this issue in 3.16 or before.
I did not try 3.17.

I moved from kernel-3.15 to 3.16, and then to kernel 3.18.
When I start forticlientsslvpn on 3.18, the system locks up hard.  No mouse and
no keyboard. 

forticlient starts pppd to enable a vpn connection.
Since this is a laptop, I don't get a kernel traceback or OOPS message.

I'm enabling kdump to see if I can get a reliable traceback.
I was first on 3.18.0-rc2, and moved to 3.18.0-rc3 today, and still have the
issue.

-- 
You are receiving this mail because:
You are the assignee for the bug.


* Re: [PATCH net-next 1/7] bpf: add 'flags' attribute to BPF_MAP_UPDATE_ELEM command
From: Alexei Starovoitov @ 2014-11-06 17:39 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David S. Miller, Ingo Molnar, Andy Lutomirski,
	Hannes Frederic Sowa, Eric Dumazet, Linux API,
	Network Development, LKML
In-Reply-To: <545A3ACC.3080101@redhat.com>

On Wed, Nov 5, 2014 at 6:57 AM, Daniel Borkmann <dborkman@redhat.com> wrote:
> On 11/05/2014 12:04 AM, Alexei Starovoitov wrote:
>>
>> On Tue, Nov 4, 2014 at 1:25 AM, Daniel Borkmann <dborkman@redhat.com>
>> wrote:
>>>
>>> On 11/04/2014 03:54 AM, Alexei Starovoitov wrote:
>>>>
>>>>
>>>> the current meaning of BPF_MAP_UPDATE_ELEM syscall command is:
>>>> either update existing map element or create a new one.
>>>> Initially the plan was to add a new command to handle the case of
>>>> 'create new element if it didn't exist', but 'flags' style looks
>>>> cleaner and overall diff is much smaller (more code reused), so add
>>>> 'flags' attribute to BPF_MAP_UPDATE_ELEM command with the following
>>>> meaning:
>>>> enum {
>>>>     BPF_MAP_UPDATE_OR_CREATE = 0, /* add new element or update existing */
>>>>     BPF_MAP_CREATE_ONLY,          /* add new element if it didn't exist */
>>>>     BPF_MAP_UPDATE_ONLY           /* update existing element */
>>>> };
>>>
>>>
>>> From your commit message/code I currently don't see an explanation why
>>> it cannot be done in typical ``flags style'' as various syscalls do,
>>> i.e. BPF_MAP_UPDATE_OR_CREATE rather represented as ...
>>>
>>>    BPF_MAP_CREATE | BPF_MAP_UPDATE
>>>
>>> Do you expect more than 64 different flags to be passed from user space
>>> for BPF_MAP?
>>
>>
>> several reasons:
>> - preserve flags==0 as default behavior
>> - avoid holes and extra checks for invalid combinations, so
>>    if (flags > BPF_MAP_UPDATE_ONLY) goto err, is enough.
>> - it looks much neater when user space uses
>>    BPF_MAP_UPDATE_OR_CREATE instead of ORing bits.
>>
>> Note this choice doesn't prevent adding bit-like flags
>> in the future. Today I cannot think of any new flags
>> for the update() command, but if somebody comes up with
>> a new selector that can apply to all three combinations,
>> we can add it as 3rd bit that can be ORed.
>
>
> Hm, mixing enums together with bitfield-like flags seems
> kind of hacky ... :/ Or, do you mean to say you're adding
> a 2nd flag field, i.e. splitting the 64bits into a 32bit
> ``cmd enum'' and 32bit ``flag section''?

Something like this, or splitting the 64 bits into 2 and 62. We'll see.
The first two bits encode this 'type' of update and the rest,
whatever else.
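
A rough sketch of how such a 2/62 split could be validated, for
illustration only. The enum is the one from the patch description;
UPDATE_TYPE_BITS, UPDATE_TYPE_MASK, UPDATE_FLAG_EXAMPLE and
check_update_flags() are made-up names, and the single modifier bit is
purely hypothetical since no such flag exists today.

```c
#include <errno.h>
#include <stdint.h>

/* Update 'type' values as proposed in this patch. */
enum {
	BPF_MAP_UPDATE_OR_CREATE = 0,	/* add new element or update existing */
	BPF_MAP_CREATE_ONLY,		/* add new element if it didn't exist */
	BPF_MAP_UPDATE_ONLY		/* update existing element */
};

/* Hypothetical layout: low 2 bits carry the type, upper 62 are bit flags. */
#define UPDATE_TYPE_BITS	2
#define UPDATE_TYPE_MASK	((UINT64_C(1) << UPDATE_TYPE_BITS) - 1)
#define UPDATE_FLAG_EXAMPLE	(UINT64_C(1) << 2)	/* made-up modifier bit */

/* Return 0 if flags is a known type plus known modifier bits, else -EINVAL. */
static int check_update_flags(uint64_t flags)
{
	uint64_t type = flags & UPDATE_TYPE_MASK;
	uint64_t rest = flags & ~UPDATE_TYPE_MASK;

	if (type > BPF_MAP_UPDATE_ONLY)
		return -EINVAL;		/* value 3 is not a defined type */
	if (rest & ~UPDATE_FLAG_EXAMPLE)
		return -EINVAL;		/* unknown modifier bit set */
	return 0;
}
```

With this layout flags == 0 still means update-or-create, and the type
check stays a single comparison.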

> Hm, my concern is that we start to add many *_OR_* enum
> elements once we find that a flag might be useful in
> combination with many other flags ... even if we
> can only think of 3 flags /right now/.

Agree. Adding many *_OR_* would look bad, that's
why I said that future additions can be bits. Like in
paragraph above.

Also, we don't have 3 flags now. In this patch I'm
showing 3 types and you're suggesting to treat
them as 2 flags. To me that's incorrect, since 'no flags'
becomes an invalid combination, which is logically incorrect.
Therefore I cannot see them as 'flags'. This is a 'type'
or 'style' of update() command.

I think it actually matches how open() defines things
in a similar situation:
#define O_RDONLY        00000000
#define O_WRONLY        00000001
#define O_RDWR          00000002
We used to think of them as flags, but they're not
bit flags, though the rest of open() flags are bit-like.
If we apply your argument to open() then open()
should have defined O_RD as 1 and O_WR as 2
and forced everyone to mix and match them, but
then zero would be invalid. So I still think that
what I have is a cleaner API :)
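
To make the open() comparison concrete, here is a small user-space
sketch. It relies only on the standard O_ACCMODE mask from <fcntl.h>;
the function name open_flags_allow_write() is made up for illustration.

```c
#include <fcntl.h>
#include <stdbool.h>

/*
 * The access mode is a small enumerated field rather than a set of
 * independent bits, so it is masked out with O_ACCMODE and compared
 * against the possible values instead of being tested bit by bit.
 */
static bool open_flags_allow_write(int flags)
{
	int mode = flags & O_ACCMODE;

	return mode == O_WRONLY || mode == O_RDWR;
}
```

The real bit flags (O_APPEND, O_NONBLOCK and so on) are ORed on top and
do not disturb the mode field, which is the same arrangement as a 'type'
in the low bits of the update() flags with optional bits above it.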
