Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH 2/2] net: ip, ipv6: handle gso skbs in forwarding path
From: Florian Westphal @ 2014-01-29 10:53 UTC (permalink / raw)
  To: Eric Dumazet, Florian Westphal, netdev
In-Reply-To: <20140128002707.GA12308@order.stressinduktion.org>

Hannes Frederic Sowa <hannes@stressinduktion.org> wrote:
> > TCP stream not using DF flag are very unlikely to care if we adjust
> > their MTU (lowering gso_size) at this point ?
> 
> UDP shouldn't be a problem, too.

Sorry for late reply, but how can this be safe for UDP?
We should make sure that peer sees original, unchanged datagram?

And only solution for UDP that I can see is to do sw segmentation (i.e.
create ip fragments).

Thanks,
Florian

^ permalink raw reply

* Re: [PATCH] ipv6: default route for link local address is not added while assigning a address
From: Nicolas Dichtel @ 2014-01-29 10:38 UTC (permalink / raw)
  To: Sohny Thomas, netdev, linux-kernel, yoshfuji, davem, kumuda
In-Reply-To: <52E8A2AA.3090507@linux.vnet.ibm.com>

Le 29/01/2014 07:41, Sohny Thomas a écrit :
> Resending this on netdev mailing list:
> Default route for link local address is configured automatically if
> NETWORKING_IPV6=yes is in ifcfg-eth*.
> When the route table for the interface is flushed and a new address is added to
> the same device with out removing linklocal addr, default route for link local
> address has to added by default.
>
> I have found the issue to be caused by this checkin
>
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/net/ipv6?id=62b54dd91567686a1cb118f76a72d5f4764a86dd
>
>
> According to this change :
> He removes adding a link local route if any other address is added , applicable
> across all interfaces though there's mentioned only lo interface
> So below patch fixes for other devices
>
> Signed-off-by: Sohny THomas <sohthoma@linux.vnet.ibm.com>
Your email client has corrupted the patch, it cannot be applied.
Please read Documentation/email-clients.txt

About the patch, I still think that the flush is too agressive. Link local
routes are marked as 'proto kernel', removing them without the link local
address is wrong.

With this patch, you will add a link local route even if you don't have a link
local address.

^ permalink raw reply

* [PATCH] net: eth: cpsw: Correctly attach to GPIO bitbang MDIO driver
From: Stefan Roese @ 2014-01-29 10:32 UTC (permalink / raw)
  To: netdev; +Cc: Lukas Stockmann, Mugunthan V N

When the GPIO bitbang MDIO driver is used instead of the Davinci MDIO driver
we need to configure the phy_id string differently. Otherwise this string
looks like this "gpio.6" instead of "gpio-0" and the PHY is not found when
phy_connect() is called.

Signed-off-by: Stefan Roese <sr@denx.de>
Cc: Lukas Stockmann <lukas.stockmann@siemens.com>
Cc: Mugunthan V N <mugunthanvnm@ti.com>
---
 drivers/net/ethernet/ti/cpsw.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c
index a3ca3dd..43e6591 100644
--- a/drivers/net/ethernet/ti/cpsw.c
+++ b/drivers/net/ethernet/ti/cpsw.c
@@ -1858,8 +1858,18 @@ static int cpsw_probe_dt(struct cpsw_platform_data *data,
 		mdio_node = of_find_node_by_phandle(be32_to_cpup(parp));
 		phyid = be32_to_cpup(parp+1);
 		mdio = of_find_device_by_node(mdio_node);
-		snprintf(slave_data->phy_id, sizeof(slave_data->phy_id),
-			 PHY_ID_FMT, mdio->name, phyid);
+
+		if (strncmp(mdio->name, "gpio", 4) == 0) {
+			/* GPIO bitbang MDIO driver attached */
+			struct mii_bus *bus = dev_get_drvdata(&mdio->dev);
+
+			snprintf(slave_data->phy_id, sizeof(slave_data->phy_id),
+				 PHY_ID_FMT, bus->id, phyid);
+		} else {
+			/* davinci MDIO driver attached */
+			snprintf(slave_data->phy_id, sizeof(slave_data->phy_id),
+				 PHY_ID_FMT, mdio->name, phyid);
+		}
 
 		mac_addr = of_get_mac_address(slave_node);
 		if (mac_addr)
-- 
1.8.5.3

^ permalink raw reply related

* ethtool 3.13 released
From: Ben Hutchings @ 2014-01-28 22:00 UTC (permalink / raw)
  To: netdev

[-- Attachment #1: Type: text/plain, Size: 528 bytes --]

ethtool version 3.13 has been released.

Home page: https://ftp.kernel.org/pub/software/network/ethtool/
Download link:
https://ftp.kernel.org/pub/software/network/ethtool/ethtool-3.13.tar.xz

Release notes:

	* Doc: Update GPL text to include current address of the FSF
	* Fix: Spelling fixes
	* Doc: Fix advertising flag values in manual page for 20G link modes,
	  and add missing 1G, 10G and 40G link modes

Ben.
-- 
Ben Hutchings
It is a miracle that curiosity survives formal education. - Albert Einstein

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply

* Re: [PATCH] xfrm: avoid creating temporary SA when there are no listeners
From: Horia Geantă @ 2014-01-29  9:17 UTC (permalink / raw)
  To: Steffen Klassert, David S. Miller; +Cc: netdev
In-Reply-To: <1390986731-12861-1-git-send-email-horia.geanta@freescale.com>

On 1/29/2014 11:12 AM, Horia Geanta wrote:
> In the case when KMs have no listeners, km_query() will fail and
> temporary SAs are garbage collected immediately after their allocation.
> This causes strain on memory allocation, leading even to OOM since
> temporary SA alloc/free cycle is performed for every packet
> and garbage collection does not keep up the pace.
>
> The sane thing to do is to make sure we have audience before
> temporary SA allocation.
>
> Signed-off-by: Horia Geanta<horia.geanta@freescale.com>
> ---
> Resending - initially posted as RFC:
> http://www.spinics.net/lists/netdev/msg268454.html
>
> Please apply.

This is for ipsec-next, sorry for not mentioning in the first place.

Regards,
Horia

^ permalink raw reply

* [PATCH] xfrm: avoid creating temporary SA when there are no listeners
From: Horia Geanta @ 2014-01-29  9:12 UTC (permalink / raw)
  To: Steffen Klassert, David S. Miller; +Cc: netdev

In the case when KMs have no listeners, km_query() will fail and
temporary SAs are garbage collected immediately after their allocation.
This causes strain on memory allocation, leading even to OOM since
temporary SA alloc/free cycle is performed for every packet
and garbage collection does not keep up the pace.

The sane thing to do is to make sure we have audience before
temporary SA allocation.

Signed-off-by: Horia Geanta <horia.geanta@freescale.com>
---
Resending - initially posted as RFC:
http://www.spinics.net/lists/netdev/msg268454.html

Please apply.

 include/net/xfrm.h    | 15 +++++++++++++++
 net/key/af_key.c      | 20 ++++++++++++++++++++
 net/xfrm/xfrm_state.c | 31 +++++++++++++++++++++++++++++++
 net/xfrm/xfrm_user.c  |  6 ++++++
 4 files changed, 72 insertions(+)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index cd7c46f..449a867 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -594,6 +594,7 @@ struct xfrm_mgr {
 					   const struct xfrm_migrate *m,
 					   int num_bundles,
 					   const struct xfrm_kmaddress *k);
+	bool			(*is_alive)(const struct km_event *c);
 };
 
 int xfrm_register_km(struct xfrm_mgr *km);
@@ -1646,6 +1647,20 @@ static inline int xfrm_aevent_is_on(struct net *net)
 	rcu_read_unlock();
 	return ret;
 }
+
+static inline int xfrm_acquire_is_on(struct net *net)
+{
+	struct sock *nlsk;
+	int ret = 0;
+
+	rcu_read_lock();
+	nlsk = rcu_dereference(net->xfrm.nlsk);
+	if (nlsk)
+		ret = netlink_has_listeners(nlsk, XFRMNLGRP_ACQUIRE);
+	rcu_read_unlock();
+
+	return ret;
+}
 #endif
 
 static inline int xfrm_alg_len(const struct xfrm_algo *alg)
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 1a04c13..12eb0ad 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3059,6 +3059,25 @@ static u32 get_acqseq(void)
 	return res;
 }
 
+static bool pfkey_is_alive(const struct km_event *c)
+{
+	struct netns_pfkey *net_pfkey = net_generic(c->net, pfkey_net_id);
+	struct sock *sk;
+	struct hlist_node *node;
+	bool is_alive = false;
+
+	rcu_read_lock();
+	sk_for_each_rcu(sk, node, &net_pfkey->table) {
+		if (pfkey_sk(sk)->registered) {
+			is_alive = true;
+			break;
+		}
+	}
+	rcu_read_unlock();
+
+	return is_alive;
+}
+
 static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct xfrm_policy *xp)
 {
 	struct sk_buff *skb;
@@ -3784,6 +3803,7 @@ static struct xfrm_mgr pfkeyv2_mgr =
 	.new_mapping	= pfkey_send_new_mapping,
 	.notify_policy	= pfkey_send_policy_notify,
 	.migrate	= pfkey_send_migrate,
+	.is_alive	= pfkey_is_alive,
 };
 
 static int __net_init pfkey_net_init(struct net *net)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 8d11d28..e79f376 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -161,6 +161,7 @@ static DEFINE_SPINLOCK(xfrm_state_gc_lock);
 int __xfrm_state_delete(struct xfrm_state *x);
 
 int km_query(struct xfrm_state *x, struct xfrm_tmpl *t, struct xfrm_policy *pol);
+bool km_is_alive(const struct km_event *c);
 void km_state_expired(struct xfrm_state *x, int hard, u32 portid);
 
 static DEFINE_SPINLOCK(xfrm_type_lock);
@@ -788,6 +789,7 @@ xfrm_state_find(const xfrm_address_t *daddr, const xfrm_address_t *saddr,
 	struct xfrm_state *best = NULL;
 	u32 mark = pol->mark.v & pol->mark.m;
 	unsigned short encap_family = tmpl->encap_family;
+	struct km_event c;
 
 	to_put = NULL;
 
@@ -832,6 +834,17 @@ found:
 			error = -EEXIST;
 			goto out;
 		}
+
+		c.net = net;
+		/* If the KMs have no listeners (yet...), avoid allocating an SA
+		 * for each and every packet - garbage collection might not
+		 * handle the flood.
+		 */
+		if (!km_is_alive(&c)) {
+			error = -ESRCH;
+			goto out;
+		}
+
 		x = xfrm_state_alloc(net);
 		if (x == NULL) {
 			error = -ENOMEM;
@@ -1793,6 +1806,24 @@ int km_report(struct net *net, u8 proto, struct xfrm_selector *sel, xfrm_address
 }
 EXPORT_SYMBOL(km_report);
 
+bool km_is_alive(const struct km_event *c)
+{
+	struct xfrm_mgr *km;
+	bool is_alive = false;
+
+	read_lock(&xfrm_km_lock);
+	list_for_each_entry(km, &xfrm_km_list, list) {
+		if (km->is_alive && km->is_alive(c)) {
+			is_alive = true;
+			break;
+		}
+	}
+	read_unlock(&xfrm_km_lock);
+
+	return is_alive;
+}
+EXPORT_SYMBOL(km_is_alive);
+
 int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen)
 {
 	int err;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 3348566..b53a489 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2981,6 +2981,11 @@ static int xfrm_send_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr,
 	return nlmsg_multicast(net->xfrm.nlsk, skb, 0, XFRMNLGRP_MAPPING, GFP_ATOMIC);
 }
 
+static bool xfrm_is_alive(const struct km_event *c)
+{
+	return (bool)xfrm_acquire_is_on(c->net);
+}
+
 static struct xfrm_mgr netlink_mgr = {
 	.id		= "netlink",
 	.notify		= xfrm_send_state_notify,
@@ -2990,6 +2995,7 @@ static struct xfrm_mgr netlink_mgr = {
 	.report		= xfrm_send_report,
 	.migrate	= xfrm_send_migrate,
 	.new_mapping	= xfrm_send_mapping,
+	.is_alive	= xfrm_is_alive,
 };
 
 static int __net_init xfrm_user_net_init(struct net *net)
-- 
1.8.3.1

^ permalink raw reply related

* [GIT] Networking
From: David Miller @ 2014-01-29  8:55 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


Several fixups, of note:

1) Fix unlock of not held spinlock in RXRPC code, from Alexey
   Khoroshilov.

2) Call pci_disable_device() from the correct shutdown path in
   bnx2x driver, from Yuval Mintz.

3) Fix qeth build on s390 for some configurations, from Eugene
   Crosser.

4) Cure locking bugs in bond_loadbalance_arp_mon(), from Ding
   Tianhong.

5) Must do netif_napi_add() before registering netdevice in sky2
   driver, from Stanislaw Gruszka.

6) Fix lost bug fix during merge due to code movement in ieee802154,
   noticed and fixed by the eagle eyed Stephen Rothwell.

7) Get rid of resource leak in xen-netfront driver, from Annie Li.

8) Bounds checks in qlcnic driver are off by one, from Manish Chopra.

9) TPROXY can leak sockets when TCP early demux is enabled, fix from
   Holger Eitzenberger.

Please pull, thanks a lot!

The following changes since commit 77d143de75812596a58d126606f42d1214e09dde:

  Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml (2014-01-26 11:06:16 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to c044dc2132d19d8c643cdd340f21afcec177c046:

  qeth: fix build of s390 allmodconfig (2014-01-29 00:43:33 -0800)

----------------------------------------------------------------
Alexey Khoroshilov (1):
      RxRPC: do not unlock unheld spinlock in rxrpc_connect_exclusive()

Annie Li (1):
      xen-netfront: fix resource leak in netfront

Dave Jones (2):
      i40e: Add missing braces to i40e_dcb_need_reconfig()
      llc: remove noisy WARN from llc_mac_hdr_init

David S. Miller (4):
      Merge branch 'bonding'
      Merge branch 'qlcnic'
      Merge tag 'rxrpc-20140126' of git://git.kernel.org/.../dhowells/linux-fs
      Merge branch 'DT'

Ding Tianhong (1):
      bonding: fix locking in bond_loadbalance_arp_mon()

Duan Jiong (1):
      net: gre: use icmp_hdr() to get inner ip header

Eugene Crosser (1):
      qeth: fix build of s390 allmodconfig

Florian Westphal (1):
      net: add and use skb_gso_transport_seglen()

Geert Uytterhoeven (1):
      net/apne: Remove unused variable ei_local

Haiyang Zhang (1):
      hyperv: Add support for physically discontinuous receive buffer

Hans de Goede (2):
      net: stmmac: Silence PTP init errors on hw without PTP
      net: stmmac: Log MAC address only once

Holger Eitzenberger (1):
      net: Fix memory leak if TPROXY used with TCP early demux

Manish Chopra (1):
      qlcnic: Correct off-by-one errors in bounds checks

Martin Schwenke (1):
      net: Document promote_secondaries

Masanari Iida (1):
      net: Fix warning on make htmldocs caused by skbuff.c

Masatake YAMATO (1):
      tun: add device name(iff) field to proc fdinfo entry

Neil Horman (1):
      AF_PACKET: Add documentation for queue mapping fanout mode

Rajesh Borundia (2):
      qlcnic: Fix initialization of vlan list.
      qlcnic: Fix tx timeout.

Sachin Kamat (1):
      net: ipv4: Use PTR_ERR_OR_ZERO

Sergei Shtylyov (2):
      DT: net: davinci_emac: "ti, davinci-rmii-en" property is actually optional
      DT: net: davinci_emac: "ti, davinci-no-bd-ram" property is actually optional

Shahed Shaikh (1):
      qlcnic: Fix loopback test failure

Stanislaw Gruszka (1):
      sky2: initialize napi before registering device

Stephen Rothwell (1):
      net: 6lowpan: fixup for code movement

Tim Smith (2):
      af_rxrpc: Avoid setting up double-free on checksum error
      af_rxrpc: Handle frames delivered from another VM

Veaceslav Falico (2):
      bonding: RCUify bond_ab_arp_probe
      bonding: restructure locking of bond_ab_arp_probe()

Yaniv Rosner (1):
      bnx2x: Fix generic option settings

Yuval Mintz (1):
      bnx2x: More Shutdown revisions

 Documentation/devicetree/bindings/net/davinci_emac.txt   |  4 +-
 Documentation/networking/ip-sysctl.txt                   |  6 +++
 Documentation/networking/packet_mmap.txt                 |  1 +
 drivers/hv/channel.c                                     | 14 +++---
 drivers/net/bonding/bond_main.c                          | 96 ++++++++++++++++++++++++------------------
 drivers/net/bonding/bonding.h                            | 13 ++++++
 drivers/net/ethernet/8390/apne.c                         |  1 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c      | 78 +++++++++++++++++-----------------
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c         |  4 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c              |  3 +-
 drivers/net/ethernet/marvell/sky2.c                      |  4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_io.c           | 19 ++++++---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c         |  9 +---
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_sriov_common.c | 11 +++--
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c        |  6 +--
 drivers/net/hyperv/hyperv_net.h                          |  2 +-
 drivers/net/hyperv/netvsc.c                              |  7 +--
 drivers/net/tun.c                                        | 27 +++++++++++-
 drivers/net/xen-netfront.c                               | 88 ++++++++++++--------------------------
 drivers/s390/net/qeth_core.h                             |  5 +--
 drivers/s390/net/qeth_core_main.c                        | 18 +++-----
 drivers/s390/net/qeth_l2_main.c                          | 41 +++++++++++++++---
 drivers/s390/net/qeth_l3_main.c                          |  8 ++++
 include/linux/skbuff.h                                   |  1 +
 net/core/skbuff.c                                        | 27 +++++++++++-
 net/ieee802154/6lowpan_iphc.c                            |  2 +-
 net/ipv4/ip_gre.c                                        |  2 +-
 net/ipv4/ip_input.c                                      |  2 +-
 net/ipv4/ip_tunnel.c                                     |  3 +-
 net/ipv6/ip6_input.c                                     |  2 +-
 net/llc/llc_output.c                                     |  2 +-
 net/rxrpc/ar-connection.c                                |  2 +
 net/rxrpc/ar-recvmsg.c                                   |  7 ++-
 net/sched/sch_tbf.c                                      | 13 ++----
 34 files changed, 303 insertions(+), 225 deletions(-)

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: bcm: add skb destructor
From: Andre Naujoks @ 2014-01-29  8:47 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, socketcan, netdev
In-Reply-To: <20140128.234630.1768378245126172951.davem@davemloft.net>

On 29.01.2014 08:46, schrieb David Miller:
> From: Andre Naujoks <nautsch2@gmail.com>
> Date: Wed, 29 Jan 2014 08:40:03 +0100
> 
>> Even if this is a bug in the CAN BCM implementation. Your "fix" just
>> enabled a user space application to shut down any machine with a kernel
>> containing the BUG_ON patch.
> 
> Rather, he detected a potential stray pointer reference to freed data
> that was caused by the CAN code which would difficult if not
> impossible to detect otherwise.
> 
> That's even more dangerous, and you should be thanking him.

"potential" is the keyword here. But its a definite kernel crash as it
is right now with a standard use case for the BCM.

Don't get me wrong. If there are bugs in the code, they should be fixed,
but I don't think breaking a working (even if flawed) part of the kernel
is the right thing to do here.

Regards
  Andre

> 

^ permalink raw reply

* Re: [PATCH net] qeth: fix build of s390 allmodconfig
From: David Miller @ 2014-01-29  8:43 UTC (permalink / raw)
  To: blaschka; +Cc: netdev, linux-s390
In-Reply-To: <20140129082348.GA34455@tuxmaker.boeblingen.de.ibm.com>

From: Frank Blaschka <blaschka@linux.vnet.ibm.com>
Date: Wed, 29 Jan 2014 09:23:48 +0100

> From: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
> 
> commit 949efd1c "qeth: bridgeport support - basic control" broke
> s390 allmodconfig. This patch fixes this by eliminating one of the
> cross-module calls, and by making two other calls via function
> pointers in the qeth_discipline structure.
> 
> Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
> Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
> Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>

Applied, thanks.

^ permalink raw reply

* [PATCH net] qeth: fix build of s390 allmodconfig
From: Frank Blaschka @ 2014-01-29  8:23 UTC (permalink / raw)
  To: davem; +Cc: netdev, linux-s390

From: Eugene Crosser <Eugene.Crosser@ru.ibm.com>

commit 949efd1c "qeth: bridgeport support - basic control" broke
s390 allmodconfig. This patch fixes this by eliminating one of the
cross-module calls, and by making two other calls via function
pointers in the qeth_discipline structure.

Signed-off-by: Eugene Crosser <Eugene.Crosser@ru.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 drivers/s390/net/qeth_core.h      |    5 +---
 drivers/s390/net/qeth_core_main.c |   18 +++++-----------
 drivers/s390/net/qeth_l2_main.c   |   41 ++++++++++++++++++++++++++++++++------
 drivers/s390/net/qeth_l3_main.c   |    8 +++++++
 4 files changed, 51 insertions(+), 21 deletions(-)

--- a/drivers/s390/net/qeth_core.h
+++ b/drivers/s390/net/qeth_core.h
@@ -738,6 +738,8 @@ struct qeth_discipline {
 	int (*freeze)(struct ccwgroup_device *);
 	int (*thaw) (struct ccwgroup_device *);
 	int (*restore)(struct ccwgroup_device *);
+	int (*control_event_handler)(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd);
 };
 
 struct qeth_vlan_vid {
@@ -948,13 +950,10 @@ int qeth_query_card_info(struct qeth_car
 int qeth_send_control_data(struct qeth_card *, int, struct qeth_cmd_buffer *,
 	int (*reply_cb)(struct qeth_card *, struct qeth_reply*, unsigned long),
 	void *reply_param);
-void qeth_bridge_state_change(struct qeth_card *card, struct qeth_ipa_cmd *cmd);
-void qeth_bridgeport_query_support(struct qeth_card *card);
 int qeth_bridgeport_query_ports(struct qeth_card *card,
 	enum qeth_sbp_roles *role, enum qeth_sbp_states *state);
 int qeth_bridgeport_setrole(struct qeth_card *card, enum qeth_sbp_roles role);
 int qeth_bridgeport_an_set(struct qeth_card *card, int enable);
-void qeth_bridge_host_event(struct qeth_card *card, struct qeth_ipa_cmd *cmd);
 int qeth_get_priority_queue(struct qeth_card *, struct sk_buff *, int, int);
 int qeth_get_elements_no(struct qeth_card *, struct sk_buff *, int);
 int qeth_get_elements_for_frags(struct sk_buff *);
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -69,6 +69,7 @@ static void qeth_clear_output_buffer(str
 static int qeth_init_qdio_out_buf(struct qeth_qdio_out_q *, int);
 
 struct workqueue_struct *qeth_wq;
+EXPORT_SYMBOL_GPL(qeth_wq);
 
 static void qeth_close_dev_handler(struct work_struct *work)
 {
@@ -616,15 +617,12 @@ static struct qeth_ipa_cmd *qeth_check_i
 				qeth_schedule_recovery(card);
 				return NULL;
 			case IPA_CMD_SETBRIDGEPORT:
-				if (cmd->data.sbp.hdr.command_code ==
-					IPA_SBP_BRIDGE_PORT_STATE_CHANGE) {
-					qeth_bridge_state_change(card, cmd);
-					return NULL;
-				} else
-					return cmd;
 			case IPA_CMD_ADDRESS_CHANGE_NOTIF:
-				qeth_bridge_host_event(card, cmd);
-				return NULL;
+				if (card->discipline->control_event_handler
+								(card, cmd))
+					return cmd;
+				else
+					return NULL;
 			case IPA_CMD_MODCCID:
 				return cmd;
 			case IPA_CMD_REGISTER_LOCAL_ADDR:
@@ -4973,10 +4971,6 @@ retriable:
 		qeth_query_setadapterparms(card);
 	if (qeth_adp_supported(card, IPA_SETADP_SET_DIAG_ASSIST))
 		qeth_query_setdiagass(card);
-	qeth_bridgeport_query_support(card);
-	if (card->options.sbp.supported_funcs)
-		dev_info(&card->gdev->dev,
-		"The device represents a HiperSockets Bridge Capable Port\n");
 	return 0;
 out:
 	dev_warn(&card->gdev->dev, "The qeth device driver failed to recover "
--- a/drivers/s390/net/qeth_l2_main.c
+++ b/drivers/s390/net/qeth_l2_main.c
@@ -33,6 +33,11 @@ static int qeth_l2_send_setdelmac(struct
 					    unsigned long));
 static void qeth_l2_set_multicast_list(struct net_device *);
 static int qeth_l2_recover(void *);
+static void qeth_bridgeport_query_support(struct qeth_card *card);
+static void qeth_bridge_state_change(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd);
+static void qeth_bridge_host_event(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd);
 
 static int qeth_l2_do_ioctl(struct net_device *dev, struct ifreq *rq, int cmd)
 {
@@ -989,6 +994,10 @@ static int __qeth_l2_set_online(struct c
 		rc = -ENODEV;
 		goto out_remove;
 	}
+	qeth_bridgeport_query_support(card);
+	if (card->options.sbp.supported_funcs)
+		dev_info(&card->gdev->dev,
+		"The device represents a HiperSockets Bridge Capable Port\n");
 	qeth_trace_features(card);
 
 	if (!card->dev && qeth_l2_setup_netdev(card)) {
@@ -1233,6 +1242,26 @@ out:
 	return rc;
 }
 
+/* Returns zero if the command is successfully "consumed" */
+static int qeth_l2_control_event(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd)
+{
+	switch (cmd->hdr.command) {
+	case IPA_CMD_SETBRIDGEPORT:
+		if (cmd->data.sbp.hdr.command_code ==
+				IPA_SBP_BRIDGE_PORT_STATE_CHANGE) {
+			qeth_bridge_state_change(card, cmd);
+			return 0;
+		} else
+			return 1;
+	case IPA_CMD_ADDRESS_CHANGE_NOTIF:
+		qeth_bridge_host_event(card, cmd);
+		return 0;
+	default:
+		return 1;
+	}
+}
+
 struct qeth_discipline qeth_l2_discipline = {
 	.start_poll = qeth_qdio_start_poll,
 	.input_handler = (qdio_handler_t *) qeth_qdio_input_handler,
@@ -1246,6 +1275,7 @@ struct qeth_discipline qeth_l2_disciplin
 	.freeze = qeth_l2_pm_suspend,
 	.thaw = qeth_l2_pm_resume,
 	.restore = qeth_l2_pm_resume,
+	.control_event_handler = qeth_l2_control_event,
 };
 EXPORT_SYMBOL_GPL(qeth_l2_discipline);
 
@@ -1463,7 +1493,8 @@ static void qeth_bridge_state_change_wor
 	kfree(data);
 }
 
-void qeth_bridge_state_change(struct qeth_card *card, struct qeth_ipa_cmd *cmd)
+static void qeth_bridge_state_change(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd)
 {
 	struct qeth_sbp_state_change *qports =
 		 &cmd->data.sbp.data.state_change;
@@ -1488,7 +1519,6 @@ void qeth_bridge_state_change(struct qet
 			sizeof(struct qeth_sbp_state_change) + extrasize);
 	queue_work(qeth_wq, &data->worker);
 }
-EXPORT_SYMBOL(qeth_bridge_state_change);
 
 struct qeth_bridge_host_data {
 	struct work_struct worker;
@@ -1528,7 +1558,8 @@ static void qeth_bridge_host_event_worke
 	kfree(data);
 }
 
-void qeth_bridge_host_event(struct qeth_card *card, struct qeth_ipa_cmd *cmd)
+static void qeth_bridge_host_event(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd)
 {
 	struct qeth_ipacmd_addr_change *hostevs =
 		 &cmd->data.addrchange;
@@ -1560,7 +1591,6 @@ void qeth_bridge_host_event(struct qeth_
 			sizeof(struct qeth_ipacmd_addr_change) + extrasize);
 	queue_work(qeth_wq, &data->worker);
 }
-EXPORT_SYMBOL(qeth_bridge_host_event);
 
 /* SETBRIDGEPORT support; sending commands */
 
@@ -1683,7 +1713,7 @@ static int qeth_bridgeport_query_support
  * Sets bitmask of supported setbridgeport subfunctions in the qeth_card
  * strucutre: card->options.sbp.supported_funcs.
  */
-void qeth_bridgeport_query_support(struct qeth_card *card)
+static void qeth_bridgeport_query_support(struct qeth_card *card)
 {
 	struct qeth_cmd_buffer *iob;
 	struct qeth_ipa_cmd *cmd;
@@ -1709,7 +1739,6 @@ void qeth_bridgeport_query_support(struc
 	}
 	card->options.sbp.supported_funcs = cbctl.data.supported;
 }
-EXPORT_SYMBOL_GPL(qeth_bridgeport_query_support);
 
 static int qeth_bridgeport_query_ports_cb(struct qeth_card *card,
 	struct qeth_reply *reply, unsigned long data)
--- a/drivers/s390/net/qeth_l3_main.c
+++ b/drivers/s390/net/qeth_l3_main.c
@@ -3593,6 +3593,13 @@ out:
 	return rc;
 }
 
+/* Returns zero if the command is successfully "consumed" */
+static int qeth_l3_control_event(struct qeth_card *card,
+					struct qeth_ipa_cmd *cmd)
+{
+	return 1;
+}
+
 struct qeth_discipline qeth_l3_discipline = {
 	.start_poll = qeth_qdio_start_poll,
 	.input_handler = (qdio_handler_t *) qeth_qdio_input_handler,
@@ -3606,6 +3613,7 @@ struct qeth_discipline qeth_l3_disciplin
 	.freeze = qeth_l3_pm_suspend,
 	.thaw = qeth_l3_pm_resume,
 	.restore = qeth_l3_pm_resume,
+	.control_event_handler = qeth_l3_control_event,
 };
 EXPORT_SYMBOL_GPL(qeth_l3_discipline);
 

^ permalink raw reply

* Question about skb_segment()
From: Arnaud Ebalard @ 2014-01-29  8:12 UTC (permalink / raw)
  To: Herbert Xu; +Cc: David Miller, Eric Dumazet, Willy Tarreau, netdev

Hi Herbert,

I wonder if you could share some knowledge on the behaviour of
skb_segment() as it is implemented in 3.13.0: when passed a GSO
packet to be segmented, can the skb result have skb->next == NULL?

One would expect the number of segments of the result to usually match
tcp_skb_pcount() of passed packet and hence having skb->next != NULL: 
AFAICT, this what usually happens when skb_segment() is called in
tcp_gso_segment() but I noticed some cases where the number of segments
(length of chained sk_buff) is lower than the expected value and also
have two backtraces resulting from a skb delivered with a NULL skb->next. 

The whole thread leading to my question is here, in case you want to
take a look:

 http://thread.gmane.org/gmane.linux.network/301587

Thanks in advance,

a+

^ permalink raw reply

* Re: 8% performance improved by change tap interact with kernel stack
From: Michael S. Tsirkin @ 2014-01-29  7:56 UTC (permalink / raw)
  To: Qin Chuanyu; +Cc: jasowang, Anthony Liguori, KVM list, netdev
In-Reply-To: <52E8B0A4.6030806@huawei.com>

On Wed, Jan 29, 2014 at 03:41:24PM +0800, Qin Chuanyu wrote:
> On 2014/1/28 18:33, Michael S. Tsirkin wrote:
> 
> >>>Nice.
> >>>What about CPU utilization?
> >>>It's trivially easy to speed up networking by
> >>>burning up a lot of CPU so we must make sure it's
> >>>not doing that.
> >>>And I think we should see some tests with TCP as well, and
> >>>try several message sizes.
> >>>
> >>>
> >>Yes, by burning up more CPU we could get better performance easily.
> >>So I have bond vhost thread and interrupt of nic on CPU1 while testing.
> >>
> >>modified before, the idle of CPU1 is 0%-1% while testing.
> >>and after modify, the idle of CPU1 is 2%-3% while testing
> >>
> >>TCP also could gain from this, but pps is less than UDP, so I think
> >>the improvement would be not so obviously.
> >
> >Still need to test this doesn't regress but overall looks convincing to me.
> >Could you send a patch, accompanied by testing results for
> >throughput latency and cpu utilization for tcp and udp
> >with various message sizes?
> >
> >Thanks!
> >
> because of spring festival of china, the test result would be given
> two week later.
> throughput would be test by netperf, and latency would be tested by
> qperf. Is that OK?

For testing - sounds good. Run vmstat in host to check host cpu utilization.
Pls don't forget to address all issues raised in this thread and in
the old one Eric mentioned:
http://patchwork.ozlabs.org/patch/52963/
either address in code or address in commit log why it doesn't apply
anymore.

-- 
MST

^ permalink raw reply

* Re: [PATCH 0/2] [BUG FIXES - 3.10.27] sit: More backports
From: David Miller @ 2014-01-29  7:52 UTC (permalink / raw)
  To: rostedt
  Cc: linux-kernel, netdev, nicolas.dichtel, stable, williams, lclaudio,
	jkacur
In-Reply-To: <20140128205756.074448668@goodmis.org>

From: Steven Rostedt <rostedt@goodmis.org>
Date: Tue, 28 Jan 2014 15:57:56 -0500

> At Red Hat we base our real-time kernel off of 3.10.27 and do lots of
> stress testing on that kernel. This has discovered some bugs that we
> can hit with the vanilla 3.10.27 kernel (no -rt patches applied).
> 
> I sent out a bug fix that can cause a crash with the current 3.10.27
> when you add and then remove the sit module. That patch is obsoleted by
> these patches, as that patch was not enough.
> 
> A previous patch that was backported:
> 
>   Upstream commit 205983c43700ac3a81e7625273a3fa83cd2759b5
>   sit: allow to use rtnl ops on fb tunnel
> 
> Had a depenency on commit 5e6700b3bf98 ("sit: add support of x-netns")
> which was not backported. The dependency was only on part of that
> commit which is what I backported.
> 
> The other upstream commit 9434266f2c645d4fcf62a03a8e36ad8075e37943
> sit: fix use after free of fb_tunnel_dev
> 
> fixes another bug we encountered, it also fixes the 3.10.27 bug
> where removing the sit module cause the crash. This is the patch
> that obsoletes my previous patch.

Greg, these have my blessing, please apply these to 3.10 -stable,
thanks a lot!

^ permalink raw reply

* Re: [PATCH net-next] bonding: fix locking in bond_loadbalance_arp_mon()
From: David Miller @ 2014-01-29  7:48 UTC (permalink / raw)
  To: dingtianhong; +Cc: fubar, vfalico, netdev, andy
In-Reply-To: <52E728A5.5070701@huawei.com>

From: Ding Tianhong <dingtianhong@huawei.com>
Date: Tue, 28 Jan 2014 11:48:53 +0800

> The commit 1d3ee88ae0d605629bf369
> (bonding: add netlink attributes to slave link dev)
> has add rtmsg_ifinfo() in bond_set_active_slave() and
> bond_set_backup_slave(), so the two function need to
> called in RTNL lock, but bond_loadbalance_arp_mon()
> only calling these functions in RCU, warning message
> will occurs.
> 
> fix this by add a new function bond_slave_state_change(),
> which will reset the slave's state after slave link check,
> so remove the bond_set_xxx_slave() from the cycle and only
> record the slave_state_changed, this will call the new
> function to set all slaves to new state in RTNL later.
> 
> Cc: Jay Vosburgh <fubar@us.ibm.com>
> Cc: Veaceslav Falico <vfalico@redhat.com>
> Cc: Andy Gospodarek <andy@greyhouse.net>
> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v3] tun: add device name(iff) field to proc fdinfo entry
From: David Miller @ 2014-01-29  7:47 UTC (permalink / raw)
  To: yamato; +Cc: netdev
In-Reply-To: <1390981411-16146-1-git-send-email-yamato@redhat.com>

From: Masatake YAMATO <yamato@redhat.com>
Date: Wed, 29 Jan 2014 16:43:31 +0900

> A file descriptor opened for /dev/net/tun and a tun device are
> connected with ioctl.  Though understanding the connection is
> important for trouble shooting, no way is given to a user to know
> the connected device for a given file descriptor at userland.
> 
> This patch adds a new fdinfo field for the device name connected to
> a file descriptor opened for /dev/net/tun.
> 
> Here is an example of the field:
...
> Signed-off-by: Masatake YAMATO <yamato@redhat.com>

This looks a lot better, applied, thanks.

^ permalink raw reply

* Re: [PATCH stable 3.11+] can: bcm: add skb destructor
From: David Miller @ 2014-01-29  7:46 UTC (permalink / raw)
  To: nautsch2; +Cc: eric.dumazet, socketcan, netdev
In-Reply-To: <52E8B053.2030808@gmail.com>

From: Andre Naujoks <nautsch2@gmail.com>
Date: Wed, 29 Jan 2014 08:40:03 +0100

> Even if this is a bug in the CAN BCM implementation. Your "fix" just
> enabled a user space application to shut down any machine with a kernel
> containing the BUG_ON patch.

Rather, he detected a potential stray pointer reference to freed data
that was caused by the CAN code which would difficult if not
impossible to detect otherwise.

That's even more dangerous, and you should be thanking him.

^ permalink raw reply

* [PATCH v3] tun: add device name(iff) field to proc fdinfo entry
From: Masatake YAMATO @ 2014-01-29  7:43 UTC (permalink / raw)
  To: davem; +Cc: netdev, yamato
In-Reply-To: <20140128.231947.1827551289841225189.davem@davemloft.net>

A file descriptor opened for /dev/net/tun and a tun device are
connected with ioctl.  Though understanding the connection is
important for trouble shooting, no way is given to a user to know
the connected device for a given file descriptor at userland.

This patch adds a new fdinfo field for the device name connected to
a file descriptor opened for /dev/net/tun.

Here is an example of the field:

    # lsof | grep tun
    qemu-syst 4565         qemu   25u      CHR             10,200       0t138      12921 /dev/net/tun
    ...

    # cat /proc/4565/fdinfo/25
    pos:	138
    flags:	0104002
    iff:	vnet0

    # ip link show dev vnet0
    8: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 ...

changelog:

    v2: indent iff just like the other fdinfo fields are.
    v3: remove unused variable.
        Both are suggested by David Miller <davem@davemloft.net>.

Signed-off-by: Masatake YAMATO <yamato@redhat.com>
---
 drivers/net/tun.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index bcf01af..44c4db8 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -69,6 +69,7 @@
 #include <net/netns/generic.h>
 #include <net/rtnetlink.h>
 #include <net/sock.h>
+#include <linux/seq_file.h>
 
 #include <asm/uaccess.h>
 
@@ -2228,6 +2229,27 @@ static int tun_chr_close(struct inode *inode, struct file *file)
 	return 0;
 }
 
+#ifdef CONFIG_PROC_FS
+static int tun_chr_show_fdinfo(struct seq_file *m, struct file *f)
+{
+	struct tun_struct *tun;
+	struct ifreq ifr;
+
+	memset(&ifr, 0, sizeof(ifr));
+
+	rtnl_lock();
+	tun = tun_get(f);
+	if (tun)
+		tun_get_iff(current->nsproxy->net_ns, tun, &ifr);
+	rtnl_unlock();
+
+	if (tun)
+		tun_put(tun);
+
+	return seq_printf(m, "iff:\t%s\n", ifr.ifr_name);
+}
+#endif
+
 static const struct file_operations tun_fops = {
 	.owner	= THIS_MODULE,
 	.llseek = no_llseek,
@@ -2242,7 +2264,10 @@ static const struct file_operations tun_fops = {
 #endif
 	.open	= tun_chr_open,
 	.release = tun_chr_close,
-	.fasync = tun_chr_fasync
+	.fasync = tun_chr_fasync,
+#ifdef CONFIG_PROC_FS
+	.show_fdinfo = tun_chr_show_fdinfo,
+#endif
 };
 
 static struct miscdevice tun_miscdev = {
-- 
1.8.5.3

^ permalink raw reply related

* Re: [PATCH 0/2] DT: net: davinci_emac: couple more properties actually optional
From: David Miller @ 2014-01-29  7:43 UTC (permalink / raw)
  To: sergei.shtylyov
  Cc: netdev, robh+dt, pawel.moll, mark.rutland, ijc+devicetree, galak,
	rob, devicetree, linux-doc, davinci-linux-open-source
In-Reply-To: <201401280245.35745.sergei.shtylyov@cogentembedded.com>

From: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Date: Tue, 28 Jan 2014 02:45:34 +0300

>    Though described as required, couple more properties in the DaVinci EMAC
> binding are actually optional, as the driver will happily function without them.
> The patchset is against DaveM's 'net.git' tree this time.
> 
> [1/2] DT: net: davinci_emac: "ti,davinci-rmii-en" property is actually optional
> [2/2] DT: net: davinci_emac: "ti,davinci-no-bd-ram" property is actually optional

Series applied with the "has/have" thing fixed.

Thanks.

^ permalink raw reply

* Re: 8% performance improved by change tap interact with kernel stack
From: Qin Chuanyu @ 2014-01-29  7:41 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: jasowang, Anthony Liguori, KVM list, netdev
In-Reply-To: <20140128103325.GA17794@redhat.com>

On 2014/1/28 18:33, Michael S. Tsirkin wrote:

>>> Nice.
>>> What about CPU utilization?
>>> It's trivially easy to speed up networking by
>>> burning up a lot of CPU so we must make sure it's
>>> not doing that.
>>> And I think we should see some tests with TCP as well, and
>>> try several message sizes.
>>>
>>>
>> Yes, by burning up more CPU we could get better performance easily.
>> So I have bond vhost thread and interrupt of nic on CPU1 while testing.
>>
>> modified before, the idle of CPU1 is 0%-1% while testing.
>> and after modify, the idle of CPU1 is 2%-3% while testing
>>
>> TCP also could gain from this, but pps is less than UDP, so I think
>> the improvement would be not so obviously.
>
> Still need to test this doesn't regress but overall looks convincing to me.
> Could you send a patch, accompanied by testing results for
> throughput latency and cpu utilization for tcp and udp
> with various message sizes?
>
> Thanks!
>
because of spring festival of china, the test result would be given two 
week later.
throughput would be test by netperf, and latency would be tested by 
qperf. Is that OK?


^ permalink raw reply

* Re: [PATCH stable 3.11+] can: bcm: add skb destructor
From: Andre Naujoks @ 2014-01-29  7:40 UTC (permalink / raw)
  To: Eric Dumazet, Oliver Hartkopp; +Cc: David Miller, Linux Netdev List
In-Reply-To: <1390953066.28432.26.camel@edumazet-glaptop2.roam.corp.google.com>

On 29.01.2014 00:51, schrieb Eric Dumazet:
> On Tue, 2014-01-28 at 23:49 +0100, Oliver Hartkopp wrote:
> 
>> The sbk->sk reference is used to make sure in AF_CAN to identify the
>> originating socket (if any) to not deliver echoed CAN frames to the
>> originating application.
>>
>> See first check in raw_rcv() in net/can/raw.c
> 
> Nice, this is buggy.
> 
>>
>>>
>>> Instead of understanding the issue, it seems this patch exactly shutup
>>> the useful warning.
>>
>> I would have been happy to have this a warning and not a bug as you
>> implemented it.
>>
> 
> Yes, I understand you are not happy of our work to discover CAN bugs.
> 
>> I don't need this warning as I'm using skb_alloc in the cases where CAN frames
>> are generated autonomously. They are not triggered through a direct socket
>> write operation nor do they need to take case about any sock wmem.
>>
>> The useful warning/bug might be nice for common use cases. I'm using plain
>> skb_alloc here for fire-and-forget skbs.
>>
>> So I need to shutup the useful warning or revert the two commits at
>> skb_orphan(). I would prefer the latter.
>>
>>>
>>> If you set skb->sk, then you expect a future reader of skb->sk to be
>>> 100% sure the socket did not disappear.
>>
>> It's a fire-and-forget skb. I don't need to care if the socket disappears.
>> If it disappears no new traffic is generated. That's enough.
>>
>>>
>>> I do not see this explained in the changelog.
>>>
>>
>> I hopefully was able to make it more clearly.
>> See Documentation/networking/can.txt
>>
> 
> 
> Just take a reference on the damn socket, and we do not have to worry.
> 
> bcm_tx_send() suffers from the same problem
> 
> can_send() is buggy as well :
> 
> newskb->sk = skb->sk; // line 293
> 
> dev_queue_xmit() can queue a packet a long time, and some packet qdisc
> even look at skb->sk.
> 
> So this is really wrong to assume only net/can can assume things about
> skb->sk, and not care of net/core or net/sched users.
> 
> I absolutely disagree with your patch. You need quite different _real_
> fixes.

Hi.

Even if this is a bug in the CAN BCM implementation. Your "fix" just
enabled a user space application to shut down any machine with a kernel
containing the BUG_ON patch.

If the BCM implementation is broken, it needs to be fixed. But this is a
regression that causes Kernel crashes, where there were none before.

As I am using the BCM, I would rather have a flawed but working
implementation than an unusable one. If the empty socket destructor
enables the system to work again, then I would like to see it. But, like
Oliver, I would prefer the BUG_ON patch reverted at least until this
issue is resolved.

Regards
  Andre

> 
> 
> 

^ permalink raw reply

* Re: [PATCH v2] tun: add device name(iff) field to proc fdinfo entry
From: David Miller @ 2014-01-29  7:19 UTC (permalink / raw)
  To: yamato; +Cc: netdev
In-Reply-To: <1390973614-3929-1-git-send-email-yamato@redhat.com>

From: Masatake YAMATO <yamato@redhat.com>
Date: Wed, 29 Jan 2014 14:33:34 +0900

>     (v2): indent iff just like the other fdinfo fields are.
>           Suggested by David Miller <davem@davemloft.net>.

Please fix the build warnings generated by your changes:

drivers/net/tun.c: In function ‘tun_chr_show_fdinfo’:
drivers/net/tun.c:2237:6: warning: unused variable ‘ret’ [-Wunused-variable]

^ permalink raw reply

* Re: 8% performance improved by change tap interact with kernel stack
From: Qin Chuanyu @ 2014-01-29  7:12 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: jasowang, Michael S. Tsirkin, Anthony Liguori, KVM list, netdev,
	Peter Klausler
In-Reply-To: <1390920560.28432.8.camel@edumazet-glaptop2.roam.corp.google.com>

On 2014/1/28 22:49, Eric Dumazet wrote:
> On Tue, 2014-01-28 at 16:14 +0800, Qin Chuanyu wrote:
>> according perf test result，I found that there are 5%-8% cpu cost on
>> softirq by use netif_rx_ni called in tun_get_user.
>>
>> so I changed the function which cause skb transmitted more quickly.
>> from
>> 	tun_get_user	->
>> 		 netif_rx_ni(skb);
>> to
>> 	tun_get_user	->
>> 		rcu_read_lock_bh();
>> 		netif_receive_skb(skb);
>> 		rcu_read_unlock_bh();
>
> No idea why you use rcu here ?

In my first version, I forgot to add lock when called netif_receive_skb
then I met a dad spinlock when using tcpdump.

tcpdump receive skb in netif_receive_skb but also in dev_queue_xmit.
and I have notice dev_queue_xmit add rcu_read_lock_bh before 
transmitting skb, and this lock avoid race between softirq and transmit 
thread.
	/* Disable soft irqs for various locks below. Also
	 * stops preemption for RCU.
	 */
	rcu_read_lock_bh();
Now I try to xmit skb in vhost thread, so I did the same thing.

^ permalink raw reply

* Re: [PATCH v2 3/4] net: ethoc: set up MII management bus clock
From: Florian Fainelli @ 2014-01-29  7:01 UTC (permalink / raw)
  To: Max Filippov, linux-xtensa, netdev, linux-kernel
  Cc: Chris Zankel, Marc Gauthier, David S. Miller, Ben Hutchings
In-Reply-To: <1390975218-13863-4-git-send-email-jcmvbkbc@gmail.com>

Le 28/01/2014 22:00, Max Filippov a écrit :
> MII management bus clock is derived from the MAC clock by dividing it by
> MIIMODER register CLKDIV field value. This value may need to be set up
> in case it is undefined or its default value is too high (and
> communication with PHY is too slow) or too low (and communication with
> PHY is impossible). The value of CLKDIV is not specified directly, but
> is derived from the MAC clock for the default MII management bus frequency
> of 2.5MHz. The MAC clock may be specified in the platform data, or as
> either 'clock-frequency' or 'clocks' device tree attribute.
>
> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
> ---
> Changes v1->v2:
> - drop MDIO bus frequency configurability, always configure for standard
>    2.5MHz;
> - allow using common clock framework to provide ethoc clock.
>
>   drivers/net/ethernet/ethoc.c | 37 +++++++++++++++++++++++++++++++++++--
>   include/net/ethoc.h          |  1 +
>   2 files changed, 36 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
> index 5643b2d..5854d41 100644
> --- a/drivers/net/ethernet/ethoc.c
> +++ b/drivers/net/ethernet/ethoc.c
> @@ -13,6 +13,7 @@
>
>   #include <linux/dma-mapping.h>
>   #include <linux/etherdevice.h>
> +#include <linux/clk.h>
>   #include <linux/crc32.h>
>   #include <linux/interrupt.h>
>   #include <linux/io.h>
> @@ -216,6 +217,7 @@ struct ethoc {
>
>   	struct phy_device *phy;
>   	struct mii_bus *mdio;
> +	struct clk *clk;
>   	s8 phy_id;
>   };
>
> @@ -925,6 +927,8 @@ static int ethoc_probe(struct platform_device *pdev)
>   	int num_bd;
>   	int ret = 0;
>   	bool random_mac = false;
> +	u32 eth_clkfreq = 0;
> +	struct ethoc_platform_data *pdata = dev_get_platdata(&pdev->dev);
>
>   	/* allocate networking device */
>   	netdev = alloc_etherdev(sizeof(struct ethoc));
> @@ -1038,8 +1042,7 @@ static int ethoc_probe(struct platform_device *pdev)
>   	}
>
>   	/* Allow the platform setup code to pass in a MAC address. */
> -	if (dev_get_platdata(&pdev->dev)) {
> -		struct ethoc_platform_data *pdata = dev_get_platdata(&pdev->dev);
> +	if (pdata) {
>   		memcpy(netdev->dev_addr, pdata->hwaddr, IFHWADDRLEN);
>   		priv->phy_id = pdata->phy_id;
>   	} else {
> @@ -1077,6 +1080,32 @@ static int ethoc_probe(struct platform_device *pdev)
>   	if (random_mac)
>   		netdev->addr_assign_type = NET_ADDR_RANDOM;
>
> +	/* Allow the platform setup code to adjust MII management bus clock. */
> +	if (pdata)
> +		eth_clkfreq = pdata->eth_clkfreq;

Since this is a new member, why not make it a struct clk pointer 
directly so you could simplify the code path?

> +	else
> +		of_property_read_u32(pdev->dev.of_node,
> +				     "clock-frequency", &eth_clkfreq);
> +	if (!eth_clkfreq) {
> +		struct clk *clk = clk_get(&pdev->dev, NULL);
> +
> +		if (!IS_ERR(clk)) {
> +			priv->clk = clk;
> +			clk_prepare_enable(clk);
> +			eth_clkfreq = clk_get_rate(clk);
> +		}
> +	}
> +	if (eth_clkfreq) {
> +		u32 clkdiv = MIIMODER_CLKDIV(eth_clkfreq / 2500000 + 1);
> +
> +		if (!clkdiv)
> +			clkdiv = 2;
> +		dev_dbg(&pdev->dev, "setting MII clkdiv to %u\n", clkdiv);
> +		ethoc_write(priv, MIIMODER,
> +			    (ethoc_read(priv, MIIMODER) & MIIMODER_NOPRE) |
> +			    clkdiv);
> +	}

This does look a bit convoluted, and it looks like the clk_get() or 
getting the clock-frequency property should boil down to being the same 
thing with of_clk_get() as it should resolve all clocks phandles and 
fetch their frequencies appropriately.

> +
>   	/* register MII bus */
>   	priv->mdio = mdiobus_alloc();
>   	if (!priv->mdio) {
> @@ -1141,6 +1170,8 @@ free_mdio:
>   	kfree(priv->mdio->irq);
>   	mdiobus_free(priv->mdio);
>   free:
> +	if (priv->clk)
> +		clk_disable_unprepare(priv->clk);

This should probbaly be if (!IS_ERR(priv-clk)).

>   	free_netdev(netdev);
>   out:
>   	return ret;
> @@ -1165,6 +1196,8 @@ static int ethoc_remove(struct platform_device *pdev)
>   			kfree(priv->mdio->irq);
>   			mdiobus_free(priv->mdio);
>   		}
> +		if (priv->clk)
> +			clk_disable_unprepare(priv->clk);

Same here

>   		unregister_netdev(netdev);
>   		free_netdev(netdev);
>   	}
> diff --git a/include/net/ethoc.h b/include/net/ethoc.h
> index 96f3789..2a2d6bb 100644
> --- a/include/net/ethoc.h
> +++ b/include/net/ethoc.h
> @@ -16,6 +16,7 @@
>   struct ethoc_platform_data {
>   	u8 hwaddr[IFHWADDRLEN];
>   	s8 phy_id;
> +	u32 eth_clkfreq;
>   };
>
>   #endif /* !LINUX_NET_ETHOC_H */
>

^ permalink raw reply

* Re: [PATCH v2 2/4] net: ethoc: don't advertise gigabit speed on attached PHY
From: Max Filippov @ 2014-01-29  7:01 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: linux-xtensa@linux-xtensa.org, netdev, LKML, Chris Zankel,
	Marc Gauthier, David S. Miller, Ben Hutchings
In-Reply-To: <52E8A407.1020809@gmail.com>

On Wed, Jan 29, 2014 at 10:47 AM, Florian Fainelli <f.fainelli@gmail.com> wrote:
> Hi Max,
>
> Le 28/01/2014 22:00, Max Filippov a écrit :
>
>> OpenCores 10/100 Mbps MAC does not support speeds above 100 Mbps, but does
>> not disable advertisement when PHY supports them. This results in
>> non-functioning network when the MAC is connected to a gigabit PHY
>> connected
>> to a gigabit switch.
>>
>> The fix is to disable gigabit speed advertisement on attached PHY
>> unconditionally.
>>
>> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
>> ---
>> Changes v1->v2:
>> - disable both gigabit advertisement and support.
>>
>>   drivers/net/ethernet/ethoc.c | 8 ++++++++
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
>> index 4de8cfd..5643b2d 100644
>> --- a/drivers/net/ethernet/ethoc.c
>> +++ b/drivers/net/ethernet/ethoc.c
>> @@ -688,6 +688,14 @@ static int ethoc_mdio_probe(struct net_device *dev)
>>         }
>>
>>         priv->phy = phy;
>> +       phy_update_advert(phy,
>> +                         ADVERTISED_1000baseT_Full |
>> +                         ADVERTISED_1000baseT_Half, 0);
>> +       phy_start_aneg(phy);
>
>
> This does not look necessary, you should not have to call phy_start_aneg()
> because the PHY state machine is not yet started, at best this calls does
> nothing.

This call actually makes the whole thing work, because otherwise once gigabit
support is cleared from the supported mask genphy_config_advert does not
update gigabit advertisement register, leaving it enabled.

>> +       phy_update_supported(phy,
>> +                            SUPPORTED_1000baseT_Full |
>> +                            SUPPORTED_1000baseT_Half, 0);
>> +
>>         return 0;
>>   }
>>
>>
>

-- 
Thanks.
-- Max

^ permalink raw reply

* Re: [PATCH v2 4/4] net: ethoc: implement ethtool operations
From: Florian Fainelli @ 2014-01-29  6:52 UTC (permalink / raw)
  To: Max Filippov, linux-xtensa, netdev, linux-kernel
  Cc: Chris Zankel, Marc Gauthier, David S. Miller, Ben Hutchings
In-Reply-To: <1390975218-13863-5-git-send-email-jcmvbkbc@gmail.com>

Le 28/01/2014 22:00, Max Filippov a écrit :
> The following methods are implemented:
> - get/set settings;
> - get registers length/registers;
> - get link state (standard implementation);
> - get/set ring parameters;
> - get timestamping info (standard implementation).

Ideally you should have one patch per ethtool callback that you 
implement just in case something happens to break, only the specific 
patch can reverted/referenced.

>
> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
> ---
> Changes v1->v2:
> - new patch.
>
>   drivers/net/ethernet/ethoc.c | 85 ++++++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 85 insertions(+)
>
> diff --git a/drivers/net/ethernet/ethoc.c b/drivers/net/ethernet/ethoc.c
> index 5854d41..2f8b174 100644
> --- a/drivers/net/ethernet/ethoc.c
> +++ b/drivers/net/ethernet/ethoc.c
> @@ -52,6 +52,7 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
>   #define	ETH_HASH0	0x48
>   #define	ETH_HASH1	0x4c
>   #define	ETH_TXCTRL	0x50
> +#define	ETH_END		0x54
>
>   /* mode register */
>   #define	MODER_RXEN	(1 <<  0) /* receive enable */
> @@ -180,6 +181,7 @@ MODULE_PARM_DESC(buffer_size, "DMA buffer allocation size");
>    * @membase:	pointer to buffer memory region
>    * @dma_alloc:	dma allocated buffer size
>    * @io_region_size:	I/O memory region size
> + * @num_bd:	number of buffer descriptors
>    * @num_tx:	number of send buffers
>    * @cur_tx:	last send buffer written
>    * @dty_tx:	last buffer actually sent
> @@ -200,6 +202,7 @@ struct ethoc {
>   	int dma_alloc;
>   	resource_size_t io_region_size;
>
> +	unsigned int num_bd;
>   	unsigned int num_tx;
>   	unsigned int cur_tx;
>   	unsigned int dty_tx;
> @@ -900,6 +903,86 @@ out:
>   	return NETDEV_TX_OK;
>   }
>
> +static int ethoc_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
> +{
> +	struct ethoc *priv = netdev_priv(dev);
> +	struct phy_device *phydev = priv->phy;
> +
> +	if (!phydev)
> +		return -ENODEV;
> +
> +	return phy_ethtool_gset(phydev, cmd);
> +}
> +
> +static int ethoc_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
> +{
> +	struct ethoc *priv = netdev_priv(dev);
> +	struct phy_device *phydev = priv->phy;
> +
> +	if (!phydev)
> +		return -ENODEV;
> +
> +	return phy_ethtool_sset(phydev, cmd);
> +}
> +
> +static int ethoc_get_regs_len(struct net_device *netdev)
> +{
> +	return ETH_END;
> +}
> +
> +static void ethoc_get_regs(struct net_device *dev, struct ethtool_regs *regs,
> +			   void *p)
> +{
> +	struct ethoc *priv = netdev_priv(dev);
> +	u32 *regs_buff = p;
> +	unsigned i;
> +
> +	regs->version = 0;
> +	for (i = 0; i < ETH_END / sizeof(u32); ++i)
> +		regs_buff[i] = ethoc_read(priv, i * sizeof(u32));
> +}
> +
> +static void ethoc_get_ringparam(struct net_device *dev,
> +				struct ethtool_ringparam *ring)
> +{
> +	struct ethoc *priv = netdev_priv(dev);
> +
> +	ring->rx_max_pending = priv->num_bd;
> +	ring->rx_mini_max_pending = 0;
> +	ring->rx_jumbo_max_pending = 0;
> +	ring->tx_max_pending = priv->num_bd;
> +
> +	ring->rx_pending = priv->num_rx;
> +	ring->rx_mini_pending = 0;
> +	ring->rx_jumbo_pending = 0;
> +	ring->tx_pending = priv->num_tx;
> +}
> +
> +static int ethoc_set_ringparam(struct net_device *dev,
> +			       struct ethtool_ringparam *ring)
> +{
> +	struct ethoc *priv = netdev_priv(dev);
> +
> +	if (netif_running(dev))
> +		return -EBUSY;
> +	priv->num_tx = rounddown_pow_of_two(ring->tx_pending);
> +	priv->num_rx = priv->num_bd - priv->num_tx;
> +	if (priv->num_rx > ring->rx_pending)
> +		priv->num_rx = ring->rx_pending;
> +	return 0;
> +}
> +
> +const struct ethtool_ops ethoc_ethtool_ops = {
> +	.get_settings = ethoc_get_settings,
> +	.set_settings = ethoc_set_settings,
> +	.get_regs_len = ethoc_get_regs_len,
> +	.get_regs = ethoc_get_regs,
> +	.get_link = ethtool_op_get_link,
> +	.get_ringparam = ethoc_get_ringparam,
> +	.set_ringparam = ethoc_set_ringparam,
> +	.get_ts_info = ethtool_op_get_ts_info,
> +};
> +
>   static const struct net_device_ops ethoc_netdev_ops = {
>   	.ndo_open = ethoc_open,
>   	.ndo_stop = ethoc_stop,
> @@ -1028,6 +1111,7 @@ static int ethoc_probe(struct platform_device *pdev)
>   		ret = -ENODEV;
>   		goto error;
>   	}
> +	priv->num_bd = num_bd;
>   	/* num_tx must be a power of two */
>   	priv->num_tx = rounddown_pow_of_two(num_bd >> 1);
>   	priv->num_rx = num_bd - priv->num_tx;
> @@ -1148,6 +1232,7 @@ static int ethoc_probe(struct platform_device *pdev)
>   	netdev->netdev_ops = &ethoc_netdev_ops;
>   	netdev->watchdog_timeo = ETHOC_TIMEOUT;
>   	netdev->features |= 0;
> +	netdev->ethtool_ops = &ethoc_ethtool_ops;
>
>   	/* setup NAPI */
>   	netif_napi_add(netdev, &priv->napi, ethoc_poll, 64);
>

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox