netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding
@ 2016-02-23 12:53 Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom Paolo Abeni
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

Currently, while when an OVS or Linux bridge is used to forward frames towards
some tunnel device, a skb_head_copy() may occur if the ingress device do not
provide enough headroom for the tx encapsulation.

This patch series tries to address the issue implementing a new ndo operation to
allow the master device to control the headroom used when allocating the skb on
frame reception.

Said operation is used by the Linux bridge to notify the bridged ports of
needed_headroom changes, and similar bookkeeping and behaviour is also added to
openvswitch, on a per datapath basis.

Finally, the operation is implemented for veth and tun device, which give
performance improvement in the 6-12% range when forwarding frames from said
devices towards a vxlan tunnel.

Paolo Abeni (5):
  netdev: introduce ndo_set_rx_headroom
  bridge: notify ensabled devices of headroom changes
  ovs: propagate per dp max headroom to all vports
  net/tun: implement ndo_set_rx_headroom
  veth: implement ndo_set_rx_headroom

 drivers/net/tun.c                    | 16 +++++++++++-
 drivers/net/veth.c                   | 26 +++++++++++++++++++
 include/linux/netdevice.h            | 20 +++++++++++++++
 net/bridge/br_if.c                   | 36 +++++++++++++++++++++++++--
 net/openvswitch/datapath.c           | 48 ++++++++++++++++++++++++++++++++++++
 net/openvswitch/datapath.h           |  4 +++
 net/openvswitch/vport-internal_dev.c | 10 +++++++-
 7 files changed, 156 insertions(+), 4 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom
  2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
@ 2016-02-23 12:53 ` Paolo Abeni
  2016-02-23 19:20   ` pravin shelar
  2016-02-23 12:53 ` [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes Paolo Abeni
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

This method allows the controlling device (i.e. the bridge) to specify
additional headroom to be allocated for skb head on frame reception.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/linux/netdevice.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 47671ce0..fb74277 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1093,6 +1093,12 @@ struct tc_to_netdev {
  *	This function is used to get egress tunnel information for given skb.
  *	This is useful for retrieving outer tunnel header parameters while
  *	sampling packet.
+ * * int (*ndo_set_rx_headroom)(struct net_device *dev, int needed_headroom);
+ *	This function is used to specify the headroom that the skb must
+ *	consider when allocation skb during packet reception. Setting
+ *	appropriate rx headroom value allows avoiding skb head copy on
+ *	forward. Setting a negative value reset the rx headroom to the
+ *	default value.
  *
  */
 struct net_device_ops {
@@ -1278,6 +1284,8 @@ struct net_device_ops {
 							 bool proto_down);
 	int			(*ndo_fill_metadata_dst)(struct net_device *dev,
 						       struct sk_buff *skb);
+	void			(*ndo_set_rx_headroom)(struct net_device *dev,
+						       int needed_headroom);
 };
 
 /**
@@ -1315,6 +1323,8 @@ struct net_device_ops {
  * @IFF_L3MDEV_SLAVE: device is enslaved to an L3 master device
  * @IFF_TEAM: device is a team device
  * @IFF_RXFH_CONFIGURED: device has had Rx Flow indirection table configured
+ * @IFF_PHONY_HEADROOM: the headroom value is controlled by an external
+ *	entity (i.e. the master device for bridged veth)
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1343,6 +1353,7 @@ enum netdev_priv_flags {
 	IFF_L3MDEV_SLAVE		= 1<<23,
 	IFF_TEAM			= 1<<24,
 	IFF_RXFH_CONFIGURED		= 1<<25,
+	IFF_PHONY_HEADROOM		= 1<<26,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -1937,6 +1948,15 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
 				    struct sk_buff *skb,
 				    void *accel_priv);
 
+/* returns the headroom that the master device needs to take in account
+ * when forwarding to this dev
+ */
+static inline unsigned netdev_get_fwd_headroom(struct net_device *dev)
+{
+	return (dev->priv_flags && IFF_PHONY_HEADROOM) ? dev->needed_headroom :
+		0;
+}
+
 /*
  * Net namespace inlines
  */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes
  2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom Paolo Abeni
@ 2016-02-23 12:53 ` Paolo Abeni
  2016-02-23 19:20   ` pravin shelar
  2016-02-23 12:53 ` [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports Paolo Abeni
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

On bridge needed_headroom changes, the enslaved devices are
notified via the ndo_set_rx_headroom method

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/bridge/br_if.c | 36 ++++++++++++++++++++++++++++++++++--
 1 file changed, 34 insertions(+), 2 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index c367b3e..f42f1da 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -223,6 +223,31 @@ static void destroy_nbp_rcu(struct rcu_head *head)
 	destroy_nbp(p);
 }
 
+static unsigned get_max_headroom(struct net_bridge *br)
+{
+	unsigned max_headroom = 0;
+	struct net_bridge_port *p;
+
+	list_for_each_entry(p, &br->port_list, list) {
+		unsigned dev_headroom = netdev_get_fwd_headroom(p->dev);
+
+		if (dev_headroom > max_headroom)
+			max_headroom = dev_headroom;
+	}
+
+	return max_headroom;
+}
+
+static void update_headroom(struct net_bridge *br, int new_hr)
+{
+	struct net_bridge_port *p;
+
+	list_for_each_entry(p, &br->port_list, list)
+		if (p->dev->netdev_ops->ndo_set_rx_headroom)
+			p->dev->netdev_ops->ndo_set_rx_headroom(p->dev, new_hr);
+	br->dev->needed_headroom = new_hr;
+}
+
 /* Delete port(interface) from bridge is done in two steps.
  * via RCU. First step, marks device as down. That deletes
  * all the timers and stops new packets from flowing through.
@@ -248,6 +273,8 @@ static void del_nbp(struct net_bridge_port *p)
 	br_ifinfo_notify(RTM_DELLINK, p);
 
 	list_del_rcu(&p->list);
+	if (netdev_get_fwd_headroom(dev) == br->dev->needed_headroom)
+		update_headroom(br, get_max_headroom(br));
 
 	nbp_vlan_flush(p);
 	br_fdb_delete_by_port(br, p, 0, 1);
@@ -438,6 +465,7 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
 {
 	struct net_bridge_port *p;
 	int err = 0;
+	unsigned br_hr, dev_hr;
 	bool changed_addr;
 
 	/* Don't allow bridging non-ethernet like devices, or DSA-enabled
@@ -505,8 +533,12 @@ int br_add_if(struct net_bridge *br, struct net_device *dev)
 
 	netdev_update_features(br->dev);
 
-	if (br->dev->needed_headroom < dev->needed_headroom)
-		br->dev->needed_headroom = dev->needed_headroom;
+	br_hr = br->dev->needed_headroom;
+	dev_hr = netdev_get_fwd_headroom(dev);
+	if (br_hr < dev_hr)
+		update_headroom(br, dev_hr);
+	else if (dev->netdev_ops->ndo_set_rx_headroom)
+		dev->netdev_ops->ndo_set_rx_headroom(dev, br_hr);
 
 	if (br_fdb_insert(br, p, dev->dev_addr, 0))
 		netdev_err(dev, "failed insert local address bridge forwarding table\n");
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports
  2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes Paolo Abeni
@ 2016-02-23 12:53 ` Paolo Abeni
  2016-02-23 19:20   ` pravin shelar
  2016-02-23 12:53 ` [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 5/5] veth: " Paolo Abeni
  4 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

This patch implements bookkeeping support to compute the maximum
headroom for all the devices in each datapath. When said value
changes, the underlying devs are notified via the
ndo_set_rx_headroom method.

This also increases the internal vports xmit performance.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/openvswitch/datapath.c           | 48 ++++++++++++++++++++++++++++++++++++
 net/openvswitch/datapath.h           |  4 +++
 net/openvswitch/vport-internal_dev.c | 10 +++++++-
 3 files changed, 61 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index c4e8455..7b37288 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1908,6 +1908,34 @@ static struct vport *lookup_vport(struct net *net,
 		return ERR_PTR(-EINVAL);
 }
 
+/* Called with ovs_mutex */
+static void update_headroom(struct datapath *dp)
+{
+	int i;
+	struct vport *vport;
+	struct net_device *dev;
+	unsigned dev_headroom, max_headroom = 0;
+
+	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
+		hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
+			dev = vport->dev;
+			dev_headroom = netdev_get_fwd_headroom(dev);
+			if (dev_headroom > max_headroom)
+				max_headroom = dev_headroom;
+		}
+	}
+
+	dp->max_headroom = max_headroom;
+	for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
+		hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
+			dev = vport->dev;
+			if (dev->netdev_ops->ndo_set_rx_headroom)
+				dev->netdev_ops->ndo_set_rx_headroom(dev,
+								  max_headroom);
+		}
+	}
+}
+
 static int ovs_vport_cmd_new(struct sk_buff *skb, struct genl_info *info)
 {
 	struct nlattr **a = info->attrs;
@@ -1973,6 +2001,13 @@ restart:
 
 	err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
 				      info->snd_seq, 0, OVS_VPORT_CMD_NEW);
+
+	if (netdev_get_fwd_headroom(vport->dev) > dp->max_headroom)
+		update_headroom(dp);
+	else if (vport->dev->netdev_ops->ndo_set_rx_headroom)
+		vport->dev->netdev_ops->ndo_set_rx_headroom(vport->dev,
+							    dp->max_headroom);
+
 	BUG_ON(err < 0);
 	ovs_unlock();
 
@@ -2043,6 +2078,8 @@ static int ovs_vport_cmd_del(struct sk_buff *skb, struct genl_info *info)
 	struct sk_buff *reply;
 	struct vport *vport;
 	int err;
+	struct datapath *dp;
+	bool must_update_headroom = false;
 
 	reply = ovs_vport_cmd_alloc_info();
 	if (!reply)
@@ -2062,7 +2099,18 @@ static int ovs_vport_cmd_del(struct sk_buff *skb, struct genl_info *info)
 	err = ovs_vport_cmd_fill_info(vport, reply, info->snd_portid,
 				      info->snd_seq, 0, OVS_VPORT_CMD_DEL);
 	BUG_ON(err < 0);
+
+	/* check if the deletion of this port may change the dp max_headroom
+	 * before deleting the vport
+	 */
+	dp = vport->dp;
+	if (!(vport->dev->priv_flags & IFF_PHONY_HEADROOM) &&
+	    vport->dev->needed_headroom == dp->max_headroom)
+		must_update_headroom = true;
 	ovs_dp_detach_port(vport);
+
+	if (must_update_headroom)
+		update_headroom(dp);
 	ovs_unlock();
 
 	ovs_notify(&dp_vport_genl_family, reply, info);
diff --git a/net/openvswitch/datapath.h b/net/openvswitch/datapath.h
index 67bdecd..427e39a 100644
--- a/net/openvswitch/datapath.h
+++ b/net/openvswitch/datapath.h
@@ -68,6 +68,8 @@ struct dp_stats_percpu {
  * ovs_mutex and RCU.
  * @stats_percpu: Per-CPU datapath statistics.
  * @net: Reference to net namespace.
+ * @max_headroom: the maximum headroom of all vports in this datapath; it will
+ * be used by all the internal vports in this dp.
  *
  * Context: See the comment on locking at the top of datapath.c for additional
  * locking information.
@@ -89,6 +91,8 @@ struct datapath {
 	possible_net_t net;
 
 	u32 user_features;
+
+	u32 max_headroom;
 };
 
 /**
diff --git a/net/openvswitch/vport-internal_dev.c b/net/openvswitch/vport-internal_dev.c
index ec76398..83a5534 100644
--- a/net/openvswitch/vport-internal_dev.c
+++ b/net/openvswitch/vport-internal_dev.c
@@ -138,6 +138,11 @@ internal_get_stats(struct net_device *dev, struct rtnl_link_stats64 *stats)
 	return stats;
 }
 
+void internal_set_rx_headroom(struct net_device *dev, int new_hr)
+{
+	dev->needed_headroom = new_hr;
+}
+
 static const struct net_device_ops internal_dev_netdev_ops = {
 	.ndo_open = internal_dev_open,
 	.ndo_stop = internal_dev_stop,
@@ -145,6 +150,7 @@ static const struct net_device_ops internal_dev_netdev_ops = {
 	.ndo_set_mac_address = eth_mac_addr,
 	.ndo_change_mtu = internal_dev_change_mtu,
 	.ndo_get_stats64 = internal_get_stats,
+	.ndo_set_rx_headroom = internal_set_rx_headroom,
 };
 
 static struct rtnl_link_ops internal_dev_link_ops __read_mostly = {
@@ -158,7 +164,8 @@ static void do_setup(struct net_device *netdev)
 	netdev->netdev_ops = &internal_dev_netdev_ops;
 
 	netdev->priv_flags &= ~IFF_TX_SKB_SHARING;
-	netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH;
+	netdev->priv_flags |= IFF_LIVE_ADDR_CHANGE | IFF_OPENVSWITCH |
+			      IFF_PHONY_HEADROOM;
 	netdev->destructor = internal_dev_destructor;
 	netdev->ethtool_ops = &internal_dev_ethtool_ops;
 	netdev->rtnl_link_ops = &internal_dev_link_ops;
@@ -199,6 +206,7 @@ static struct vport *internal_dev_create(const struct vport_parms *parms)
 		err = -ENOMEM;
 		goto error_free_netdev;
 	}
+	vport->dev->needed_headroom = vport->dp->max_headroom;
 
 	dev_net_set(vport->dev, ovs_dp_get_net(vport->dp));
 	internal_dev = internal_dev_priv(vport->dev);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom
  2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
                   ` (2 preceding siblings ...)
  2016-02-23 12:53 ` [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports Paolo Abeni
@ 2016-02-23 12:53 ` Paolo Abeni
  2016-02-25 10:49   ` Paolo Abeni
  2016-02-23 12:53 ` [PATCH net-next 5/5] veth: " Paolo Abeni
  4 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

ndo_set_rx_headroom controls the align value used by tun devices to
allocate skbs on frame reception.
When the xmit device adds a large encapsulation, this avoids an skb
head reallocation on forwarding.

The measured improvement when forwarding towards a vxlan dev with
frame size below the egress device MTU is around 6% when tunneling over
ipv6.

In case of ipv4 tunnels there is no improvement, since the tun
device default alignment provides enough headroom to avoid the skb
head reallocation, at least on hosts with 64 bytes cacheline.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/tun.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 88bb8cc..5812693 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -187,6 +187,7 @@ struct tun_struct {
 #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
 			  NETIF_F_TSO6|NETIF_F_UFO)
 
+	int			align;
 	int			vnet_hdr_sz;
 	int			sndbuf;
 	struct tap_filter	txflt;
@@ -934,6 +935,17 @@ static void tun_poll_controller(struct net_device *dev)
 	return;
 }
 #endif
+
+static void tun_set_headroom(struct net_device *dev, int new_hr)
+{
+	struct tun_struct *tun = netdev_priv(dev);
+
+	if (new_hr < NET_SKB_PAD)
+		new_hr = NET_SKB_PAD;
+
+	tun->align = new_hr;
+}
+
 static const struct net_device_ops tun_netdev_ops = {
 	.ndo_uninit		= tun_net_uninit,
 	.ndo_open		= tun_net_open,
@@ -945,6 +957,7 @@ static const struct net_device_ops tun_netdev_ops = {
 #ifdef CONFIG_NET_POLL_CONTROLLER
 	.ndo_poll_controller	= tun_poll_controller,
 #endif
+	.ndo_set_rx_headroom	= tun_set_headroom,
 };
 
 static const struct net_device_ops tap_netdev_ops = {
@@ -962,6 +975,7 @@ static const struct net_device_ops tap_netdev_ops = {
 	.ndo_poll_controller	= tun_poll_controller,
 #endif
 	.ndo_features_check	= passthru_features_check,
+	.ndo_set_rx_headroom	= tun_set_headroom,
 };
 
 static void tun_flow_init(struct tun_struct *tun)
@@ -1086,7 +1100,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
 	struct sk_buff *skb;
 	size_t total_len = iov_iter_count(from);
-	size_t len = total_len, align = NET_SKB_PAD, linear;
+	size_t len = total_len, align = tun->align, linear;
 	struct virtio_net_hdr gso = { 0 };
 	int good_linear;
 	int copylen;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH net-next 5/5] veth: implement ndo_set_rx_headroom
  2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
                   ` (3 preceding siblings ...)
  2016-02-23 12:53 ` [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom Paolo Abeni
@ 2016-02-23 12:53 ` Paolo Abeni
  2016-02-23 19:21   ` pravin shelar
  4 siblings, 1 reply; 15+ messages in thread
From: Paolo Abeni @ 2016-02-23 12:53 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

The rx headroom for veth dev is the peer device needed_headroom.
Avoid ping-pong updates setting the private flag IFF_PHONY_HEADROOM.

This avoids skb head reallocation when forwarding from a veth dev
towards a device adding some kind of encapsulation.

When forwarding frames below the MTU size towards a vxlan device,
this gives about 10% performance speed-up when OVS is used to connect
the veth and the vxlan device and a little more when using a
plain Linux bridge.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/veth.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index ba21d07..4f30a6a 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -35,6 +35,7 @@ struct pcpu_vstats {
 struct veth_priv {
 	struct net_device __rcu	*peer;
 	atomic64_t		dropped;
+	unsigned		requested_headroom;
 };
 
 /*
@@ -271,6 +272,29 @@ static int veth_get_iflink(const struct net_device *dev)
 	return iflink;
 }
 
+static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
+{
+	struct veth_priv *peer_priv, *priv = netdev_priv(dev);
+	struct net_device *peer;
+
+	if (new_hr < 0)
+		new_hr = 0;
+
+	rcu_read_lock();
+	peer = rcu_dereference(priv->peer);
+	if (unlikely(!peer))
+		goto out;
+
+	peer_priv = netdev_priv(peer);
+	priv->requested_headroom = new_hr;
+	new_hr = max(priv->requested_headroom, peer_priv->requested_headroom);
+	dev->needed_headroom = new_hr;
+	peer->needed_headroom = new_hr;
+
+out:
+	rcu_read_unlock();
+}
+
 static const struct net_device_ops veth_netdev_ops = {
 	.ndo_init            = veth_dev_init,
 	.ndo_open            = veth_open,
@@ -285,6 +309,7 @@ static const struct net_device_ops veth_netdev_ops = {
 #endif
 	.ndo_get_iflink		= veth_get_iflink,
 	.ndo_features_check	= passthru_features_check,
+	.ndo_set_rx_headroom	= veth_set_rx_headroom,
 };
 
 #define VETH_FEATURES (NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_ALL_TSO |    \
@@ -301,6 +326,7 @@ static void veth_setup(struct net_device *dev)
 	dev->priv_flags &= ~IFF_TX_SKB_SHARING;
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 	dev->priv_flags |= IFF_NO_QUEUE;
+	dev->priv_flags |= IFF_PHONY_HEADROOM;
 
 	dev->netdev_ops = &veth_netdev_ops;
 	dev->ethtool_ops = &veth_ethtool_ops;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom
  2016-02-23 12:53 ` [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom Paolo Abeni
@ 2016-02-23 19:20   ` pravin shelar
  2016-02-24  8:37     ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: pravin shelar @ 2016-02-23 19:20 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> This method allows the controlling device (i.e. the bridge) to specify
> additional headroom to be allocated for skb head on frame reception.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/linux/netdevice.h | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 47671ce0..fb74277 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1093,6 +1093,12 @@ struct tc_to_netdev {
>   *     This function is used to get egress tunnel information for given skb.
>   *     This is useful for retrieving outer tunnel header parameters while
>   *     sampling packet.
...

>  /**
> @@ -1315,6 +1323,8 @@ struct net_device_ops {
>   * @IFF_L3MDEV_SLAVE: device is enslaved to an L3 master device
>   * @IFF_TEAM: device is a team device
>   * @IFF_RXFH_CONFIGURED: device has had Rx Flow indirection table configured
> + * @IFF_PHONY_HEADROOM: the headroom value is controlled by an external
> + *     entity (i.e. the master device for bridged veth)
>   */
>  enum netdev_priv_flags {
>         IFF_802_1Q_VLAN                 = 1<<0,
> @@ -1343,6 +1353,7 @@ enum netdev_priv_flags {
>         IFF_L3MDEV_SLAVE                = 1<<23,
>         IFF_TEAM                        = 1<<24,
>         IFF_RXFH_CONFIGURED             = 1<<25,
> +       IFF_PHONY_HEADROOM              = 1<<26,
>  };
>
>  #define IFF_802_1Q_VLAN                        IFF_802_1Q_VLAN
> @@ -1937,6 +1948,15 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
>                                     struct sk_buff *skb,
>                                     void *accel_priv);
>
> +/* returns the headroom that the master device needs to take in account
> + * when forwarding to this dev
> + */
> +static inline unsigned netdev_get_fwd_headroom(struct net_device *dev)
> +{
> +       return (dev->priv_flags && IFF_PHONY_HEADROOM) ? dev->needed_headroom :
> +               0;
> +}
> +
This looks like typo. It should '&' to check for the bitfield.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes
  2016-02-23 12:53 ` [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes Paolo Abeni
@ 2016-02-23 19:20   ` pravin shelar
  2016-02-24  8:43     ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: pravin shelar @ 2016-02-23 19:20 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> On bridge needed_headroom changes, the enslaved devices are
> notified via the ndo_set_rx_headroom method
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/bridge/br_if.c | 36 ++++++++++++++++++++++++++++++++++--
>  1 file changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
> index c367b3e..f42f1da 100644
> --- a/net/bridge/br_if.c
> +++ b/net/bridge/br_if.c
> @@ -223,6 +223,31 @@ static void destroy_nbp_rcu(struct rcu_head *head)
>         destroy_nbp(p);
>  }
>
> +static unsigned get_max_headroom(struct net_bridge *br)
> +{
> +       unsigned max_headroom = 0;
> +       struct net_bridge_port *p;
> +
> +       list_for_each_entry(p, &br->port_list, list) {
> +               unsigned dev_headroom = netdev_get_fwd_headroom(p->dev);
> +

IFF_PHONY_HEADROOM is only set for veth and ovs-internal-device, so we
can not get headroom for tunnel devices. I guess it worked in your
tests due to the bug in netdev_get_fwd_headroom().

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports
  2016-02-23 12:53 ` [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports Paolo Abeni
@ 2016-02-23 19:20   ` pravin shelar
  2016-02-24  8:59     ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: pravin shelar @ 2016-02-23 19:20 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> This patch implements bookkeeping support to compute the maximum
> headroom for all the devices in each datapath. When said value
> changes, the underlying devs are notified via the
> ndo_set_rx_headroom method.
>
> This also increases the internal vports xmit performance.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  net/openvswitch/datapath.c           | 48 ++++++++++++++++++++++++++++++++++++
>  net/openvswitch/datapath.h           |  4 +++
>  net/openvswitch/vport-internal_dev.c | 10 +++++++-
>  3 files changed, 61 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index c4e8455..7b37288 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1908,6 +1908,34 @@ static struct vport *lookup_vport(struct net *net,
>                 return ERR_PTR(-EINVAL);
>  }
>
> +/* Called with ovs_mutex */
> +static void update_headroom(struct datapath *dp)
> +{
> +       int i;
> +       struct vport *vport;
> +       struct net_device *dev;
> +       unsigned dev_headroom, max_headroom = 0;
> +
> +       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
> +               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
> +                       dev = vport->dev;
> +                       dev_headroom = netdev_get_fwd_headroom(dev);
> +                       if (dev_headroom > max_headroom)
> +                               max_headroom = dev_headroom;
> +               }
> +       }
> +
> +       dp->max_headroom = max_headroom;
> +       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
> +               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
> +                       dev = vport->dev;
> +                       if (dev->netdev_ops->ndo_set_rx_headroom)
> +                               dev->netdev_ops->ndo_set_rx_headroom(dev,
> +                                                                 max_headroom);
> +               }
> +       }
> +}
> +
Code looks fine. But It could be kept in sync with bridge code from patch 2.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 5/5] veth: implement ndo_set_rx_headroom
  2016-02-23 12:53 ` [PATCH net-next 5/5] veth: " Paolo Abeni
@ 2016-02-23 19:21   ` pravin shelar
  2016-02-24  9:17     ` Paolo Abeni
  0 siblings, 1 reply; 15+ messages in thread
From: pravin shelar @ 2016-02-23 19:21 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> The rx headroom for veth dev is the peer device needed_headroom.
> Avoid ping-pong updates setting the private flag IFF_PHONY_HEADROOM.
>
> This avoids skb head reallocation when forwarding from a veth dev
> towards a device adding some kind of encapsulation.
>
> When forwarding frames below the MTU size towards a vxlan device,
> this gives about 10% performance speed-up when OVS is used to connect
> the veth and the vxlan device and a little more when using a
> plain Linux bridge.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  drivers/net/veth.c | 26 ++++++++++++++++++++++++++
>  1 file changed, 26 insertions(+)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index ba21d07..4f30a6a 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -35,6 +35,7 @@ struct pcpu_vstats {
>  struct veth_priv {
>         struct net_device __rcu *peer;
>         atomic64_t              dropped;
> +       unsigned                requested_headroom;
>  };
>
>  /*
> @@ -271,6 +272,29 @@ static int veth_get_iflink(const struct net_device *dev)
>         return iflink;
>  }
>
> +static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
> +{
> +       struct veth_priv *peer_priv, *priv = netdev_priv(dev);
> +       struct net_device *peer;
> +
> +       if (new_hr < 0)
> +               new_hr = 0;
> +
> +       rcu_read_lock();
> +       peer = rcu_dereference(priv->peer);
> +       if (unlikely(!peer))
> +               goto out;
> +
> +       peer_priv = netdev_priv(peer);
> +       priv->requested_headroom = new_hr;
> +       new_hr = max(priv->requested_headroom, peer_priv->requested_headroom);
> +       dev->needed_headroom = new_hr;
> +       peer->needed_headroom = new_hr;
> +
> +out:
> +       rcu_read_unlock();
> +}
> +
I am not sure why new priv->requested_headroom is introduced. I think
you can just compare new_hr with dev->needed_headroom and set the max
value to both devices.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom
  2016-02-23 19:20   ` pravin shelar
@ 2016-02-24  8:37     ` Paolo Abeni
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-24  8:37 UTC (permalink / raw)
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, 2016-02-23 at 11:20 -0800, pravin shelar wrote:
> On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > This method allows the controlling device (i.e. the bridge) to specify
> > additional headroom to be allocated for skb head on frame reception.
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  include/linux/netdevice.h | 20 ++++++++++++++++++++
> >  1 file changed, 20 insertions(+)
> >
> > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> > index 47671ce0..fb74277 100644
> > --- a/include/linux/netdevice.h
> > +++ b/include/linux/netdevice.h
> > @@ -1093,6 +1093,12 @@ struct tc_to_netdev {
> >   *     This function is used to get egress tunnel information for given skb.
> >   *     This is useful for retrieving outer tunnel header parameters while
> >   *     sampling packet.
> ...
> 
> >  /**
> > @@ -1315,6 +1323,8 @@ struct net_device_ops {
> >   * @IFF_L3MDEV_SLAVE: device is enslaved to an L3 master device
> >   * @IFF_TEAM: device is a team device
> >   * @IFF_RXFH_CONFIGURED: device has had Rx Flow indirection table configured
> > + * @IFF_PHONY_HEADROOM: the headroom value is controlled by an external
> > + *     entity (i.e. the master device for bridged veth)
> >   */
> >  enum netdev_priv_flags {
> >         IFF_802_1Q_VLAN                 = 1<<0,
> > @@ -1343,6 +1353,7 @@ enum netdev_priv_flags {
> >         IFF_L3MDEV_SLAVE                = 1<<23,
> >         IFF_TEAM                        = 1<<24,
> >         IFF_RXFH_CONFIGURED             = 1<<25,
> > +       IFF_PHONY_HEADROOM              = 1<<26,
> >  };
> >
> >  #define IFF_802_1Q_VLAN                        IFF_802_1Q_VLAN
> > @@ -1937,6 +1948,15 @@ struct netdev_queue *netdev_pick_tx(struct net_device *dev,
> >                                     struct sk_buff *skb,
> >                                     void *accel_priv);
> >
> > +/* returns the headroom that the master device needs to take in account
> > + * when forwarding to this dev
> > + */
> > +static inline unsigned netdev_get_fwd_headroom(struct net_device *dev)
> > +{
> > +       return (dev->priv_flags && IFF_PHONY_HEADROOM) ? dev->needed_headroom :
> > +               0;
> > +}
> > +
> This looks like typo. It should '&' to check for the bitfield.

You are right! Thank you for spotting it.

There is also another error here; the code should be:

return (dev->priv_flags & IFF_PHONY_HEADROOM) ? 0 : dev->needed_headroom;

i.e. when computing the rx/forwarding headroom, we want to ignore
devices with phony headroom, since they actually will not requires
additional skb headroom on xmit.

I'll fix this in v2.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes
  2016-02-23 19:20   ` pravin shelar
@ 2016-02-24  8:43     ` Paolo Abeni
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-24  8:43 UTC (permalink / raw)
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, 2016-02-23 at 11:20 -0800, pravin shelar wrote:
> On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > On bridge needed_headroom changes, the enslaved devices are
> > notified via the ndo_set_rx_headroom method
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  net/bridge/br_if.c | 36 ++++++++++++++++++++++++++++++++++--
> >  1 file changed, 34 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
> > index c367b3e..f42f1da 100644
> > --- a/net/bridge/br_if.c
> > +++ b/net/bridge/br_if.c
> > @@ -223,6 +223,31 @@ static void destroy_nbp_rcu(struct rcu_head *head)
> >         destroy_nbp(p);
> >  }
> >
> > +static unsigned get_max_headroom(struct net_bridge *br)
> > +{
> > +       unsigned max_headroom = 0;
> > +       struct net_bridge_port *p;
> > +
> > +       list_for_each_entry(p, &br->port_list, list) {
> > +               unsigned dev_headroom = netdev_get_fwd_headroom(p->dev);
> > +
> 
> IFF_PHONY_HEADROOM is only set for veth and ovs-internal-device, so we
> can not get headroom for tunnel devices. I guess it worked in your
> tests due to the bug in netdev_get_fwd_headroom().

This code should be actually correct; the idea is to ignore the devices
with phony headroom when computing the maximum headroom; all other
devices should be taken in account.

The current, bugged, netdev_get_fwd_headroom() implementation is
actually the culprit for the confusion.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports
  2016-02-23 19:20   ` pravin shelar
@ 2016-02-24  8:59     ` Paolo Abeni
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-24  8:59 UTC (permalink / raw)
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, 2016-02-23 at 11:20 -0800, pravin shelar wrote:
> On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > This patch implements bookkeeping support to compute the maximum
> > headroom for all the devices in each datapath. When said value
> > changes, the underlying devs are notified via the
> > ndo_set_rx_headroom method.
> >
> > This also increases the internal vports xmit performance.
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  net/openvswitch/datapath.c           | 48 ++++++++++++++++++++++++++++++++++++
> >  net/openvswitch/datapath.h           |  4 +++
> >  net/openvswitch/vport-internal_dev.c | 10 +++++++-
> >  3 files changed, 61 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> > index c4e8455..7b37288 100644
> > --- a/net/openvswitch/datapath.c
> > +++ b/net/openvswitch/datapath.c
> > @@ -1908,6 +1908,34 @@ static struct vport *lookup_vport(struct net *net,
> >                 return ERR_PTR(-EINVAL);
> >  }
> >
> > +/* Called with ovs_mutex */
> > +static void update_headroom(struct datapath *dp)
> > +{
> > +       int i;
> > +       struct vport *vport;
> > +       struct net_device *dev;
> > +       unsigned dev_headroom, max_headroom = 0;
> > +
> > +       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
> > +               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
> > +                       dev = vport->dev;
> > +                       dev_headroom = netdev_get_fwd_headroom(dev);
> > +                       if (dev_headroom > max_headroom)
> > +                               max_headroom = dev_headroom;
> > +               }
> > +       }
> > +
> > +       dp->max_headroom = max_headroom;
> > +       for (i = 0; i < DP_VPORT_HASH_BUCKETS; i++) {
> > +               hlist_for_each_entry_rcu(vport, &dp->ports[i], dp_hash_node) {
> > +                       dev = vport->dev;
> > +                       if (dev->netdev_ops->ndo_set_rx_headroom)
> > +                               dev->netdev_ops->ndo_set_rx_headroom(dev,
> > +                                                                 max_headroom);
> > +               }
> > +       }
> > +}
> > +
> Code looks fine. But It could be kept in sync with bridge code from patch 2.

You are right. 

I can had an helper to wrap the conditional ndo_set_rx_headroom()
invocation, so that all the code here and in the bridge will be devices
traversal only.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 5/5] veth: implement ndo_set_rx_headroom
  2016-02-23 19:21   ` pravin shelar
@ 2016-02-24  9:17     ` Paolo Abeni
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-24  9:17 UTC (permalink / raw)
  To: pravin shelar
  Cc: Linux Kernel Network Developers, David S. Miller,
	Stephen Hemminger, Pravin Shelar, Jesse Gross, Flavio Leitner,
	Hannes Frederic Sowa

On Tue, 2016-02-23 at 11:21 -0800, pravin shelar wrote:
> On Tue, Feb 23, 2016 at 4:53 AM, Paolo Abeni <pabeni@redhat.com> wrote:
> > The rx headroom for veth dev is the peer device needed_headroom.
> > Avoid ping-pong updates setting the private flag IFF_PHONY_HEADROOM.
> >
> > This avoids skb head reallocation when forwarding from a veth dev
> > towards a device adding some kind of encapsulation.
> >
> > When forwarding frames below the MTU size towards a vxlan device,
> > this gives about 10% performance speed-up when OVS is used to connect
> > the veth and the vxlan device and a little more when using a
> > plain Linux bridge.
> >
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  drivers/net/veth.c | 26 ++++++++++++++++++++++++++
> >  1 file changed, 26 insertions(+)
> >
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index ba21d07..4f30a6a 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -35,6 +35,7 @@ struct pcpu_vstats {
> >  struct veth_priv {
> >         struct net_device __rcu *peer;
> >         atomic64_t              dropped;
> > +       unsigned                requested_headroom;
> >  };
> >
> >  /*
> > @@ -271,6 +272,29 @@ static int veth_get_iflink(const struct net_device *dev)
> >         return iflink;
> >  }
> >
> > +static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
> > +{
> > +       struct veth_priv *peer_priv, *priv = netdev_priv(dev);
> > +       struct net_device *peer;
> > +
> > +       if (new_hr < 0)
> > +               new_hr = 0;
> > +
> > +       rcu_read_lock();
> > +       peer = rcu_dereference(priv->peer);
> > +       if (unlikely(!peer))
> > +               goto out;
> > +
> > +       peer_priv = netdev_priv(peer);
> > +       priv->requested_headroom = new_hr;
> > +       new_hr = max(priv->requested_headroom, peer_priv->requested_headroom);
> > +       dev->needed_headroom = new_hr;
> > +       peer->needed_headroom = new_hr;
> > +
> > +out:
> > +       rcu_read_unlock();
> > +}
> > +
> I am not sure why new priv->requested_headroom is introduced. I think
> you can just compare new_hr with dev->needed_headroom and set the max
> value to both devices.

The requested_headroom headroom was intended to cope with with this
scenario:

brctl addif br0 veth0
brctl delif br0 veth0
brctl addif br1 veth0

If I set needed_headroom to the max value to both veth peers, after the
last call veth will use max(br0 headroom, br1 headroom), instead of br1
headroom.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom
  2016-02-23 12:53 ` [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom Paolo Abeni
@ 2016-02-25 10:49   ` Paolo Abeni
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Abeni @ 2016-02-25 10:49 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Stephen Hemminger, Pravin Shelar, Jesse Gross,
	Flavio Leitner, Hannes Frederic Sowa

On Tue, 2016-02-23 at 13:53 +0100, Paolo Abeni wrote:
> ndo_set_rx_headroom controls the align value used by tun devices to
> allocate skbs on frame reception.
> When the xmit device adds a large encapsulation, this avoids an skb
> head reallocation on forwarding.
> 
> The measured improvement when forwarding towards a vxlan dev with
> frame size below the egress device MTU is around 6% when tunneling over
> ipv6.
> 
> In case of ipv4 tunnels there is no improvement, since the tun
> device default alignment provides enough headroom to avoid the skb
> head reallocation, at least on hosts with 64 bytes cacheline.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  drivers/net/tun.c | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 88bb8cc..5812693 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -187,6 +187,7 @@ struct tun_struct {
>  #define TUN_USER_FEATURES (NETIF_F_HW_CSUM|NETIF_F_TSO_ECN|NETIF_F_TSO| \
>  			  NETIF_F_TSO6|NETIF_F_UFO)
>  
> +	int			align;

This needs to be initialized to NET_SKB_PAD, to preserved the current
behavior.

I'll fix it in v2.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-02-25 10:49 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-02-23 12:53 [PATCH net-next 0/5] bridge/ovs: avoid skb head copy on frame forwarding Paolo Abeni
2016-02-23 12:53 ` [PATCH net-next 1/5] netdev: introduce ndo_set_rx_headroom Paolo Abeni
2016-02-23 19:20   ` pravin shelar
2016-02-24  8:37     ` Paolo Abeni
2016-02-23 12:53 ` [PATCH net-next 2/5] bridge: notify ensabled devices of headroom changes Paolo Abeni
2016-02-23 19:20   ` pravin shelar
2016-02-24  8:43     ` Paolo Abeni
2016-02-23 12:53 ` [PATCH net-next 3/5] ovs: propagate per dp max headroom to all vports Paolo Abeni
2016-02-23 19:20   ` pravin shelar
2016-02-24  8:59     ` Paolo Abeni
2016-02-23 12:53 ` [PATCH net-next 4/5] net/tun: implement ndo_set_rx_headroom Paolo Abeni
2016-02-25 10:49   ` Paolo Abeni
2016-02-23 12:53 ` [PATCH net-next 5/5] veth: " Paolo Abeni
2016-02-23 19:21   ` pravin shelar
2016-02-24  9:17     ` Paolo Abeni

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).