Netdev List


* BUG:af_packet fails to TX TSO frames
From: Anton Ivanov @ 2017-10-11  8:39 UTC (permalink / raw)
  To: netdev; +Cc: davem

Hi all,

I am having an issue with af_packet.c

It fails to transmit any TSO frame submitted via raw socket + vnet 
headers. An identical frame is considered valid for tap.

The frames are generated out of legit linux skbufs (in UML) and vnet 
headers work for checksumming on raw, so I should have the raw 
initialization right.

The header is supposedly parsed correctly and the newly formed skbuf is
sent to the device transmit routine (or enqueued). I have traced it as
far as the following line in packet_snd() (line 2592 in 4.13):

err = po->xmit(skb);

This returns NET_XMIT_DROP for every TSO-capable device I tested; they
reject the frame. The same frame is accepted by tap. I have gone through
the header parsing and skb allocation code in both af_packet and tap
several times and I do not see any material difference (except the new
zerocopy stuff). So, frankly, I am stuck.

Can someone help me debug this? I do not see an easy way to do it, and
this is not a part of the kernel I am familiar with. Is there a
suitable helper function to try to segment the frame and see exactly
what is wrong with it?

Cc-ing DaveM as this has no specific maintainer so it falls under his 
umbrella remit.
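
For reference, a minimal sketch of the vnet header that would typically
precede a TCPv4 TSO frame on a packet socket using vnet headers. The struct
and constants are local mirrors of the definitions in <linux/virtio_net.h>,
renamed with an ex_ prefix to mark them as illustrative, and
fill_tso_vnet_hdr() is a hypothetical helper, not code from af_packet.c or
from the UML driver. Fields are assumed to be in host byte order (legacy
header layout).

```c
#include <stdint.h>
#include <string.h>

#define EX_VNET_HDR_F_NEEDS_CSUM 1 /* same value as VIRTIO_NET_HDR_F_NEEDS_CSUM */
#define EX_VNET_HDR_GSO_TCPV4    1 /* same value as VIRTIO_NET_HDR_GSO_TCPV4 */

/* Illustrative mirror of struct virtio_net_hdr (10 bytes). */
struct ex_vnet_hdr {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;     /* Ethernet + IP + TCP header bytes */
	uint16_t gso_size;    /* payload bytes per segment (MSS) */
	uint16_t csum_start;  /* where checksumming starts */
	uint16_t csum_offset; /* checksum field offset from csum_start */
};

/* Hypothetical helper: describe one TCPv4 TSO frame. */
static void fill_tso_vnet_hdr(struct ex_vnet_hdr *h, uint16_t l2_len,
			      uint16_t l3_len, uint16_t l4_len, uint16_t mss)
{
	memset(h, 0, sizeof(*h));
	h->flags = EX_VNET_HDR_F_NEEDS_CSUM;
	h->gso_type = EX_VNET_HDR_GSO_TCPV4;
	h->hdr_len = (uint16_t)(l2_len + l3_len + l4_len);
	h->gso_size = mss;
	h->csum_start = (uint16_t)(l2_len + l3_len);
	h->csum_offset = 16; /* offset of the checksum field in the TCP header */
}
```

Comparing these fields between the raw and tap paths for the same frame may
help narrow down where the two diverge.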

-- 
Anton R. Ivanov

Cambridge Greys Limited, England and Wales company No 10273661
http://www.cambridgegreys.com/	


* [PATCH][net-next] net: mpls: make function ipgre_mpls_encap_hlen static
From: Colin King @ 2017-10-11  9:53 UTC (permalink / raw)
  To: David S . Miller, David Ahern, Robert Shearman, Roopa Prabhu,
	netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The function ipgre_mpls_encap_hlen is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'ipgre_mpls_encap_hlen' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 net/mpls/af_mpls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 9745e8f69810..8ca9915befc8 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -41,7 +41,7 @@ static int label_limit = (1 << 20) - 1;
 static int ttl_max = 255;
 
 #if IS_ENABLED(CONFIG_NET_IP_TUNNEL)
-size_t ipgre_mpls_encap_hlen(struct ip_tunnel_encap *e)
+static size_t ipgre_mpls_encap_hlen(struct ip_tunnel_encap *e)
 {
 	return sizeof(struct mpls_shim_hdr);
 }
-- 
2.14.1


* Re: [net-next 3/3] ip_gre: cache the device mtu hard_header_len calc
From: Xin Long @ 2017-10-11  9:54 UTC (permalink / raw)
  To: William Tu; +Cc: network dev, David Laight, davem
In-Reply-To: <1507582067-36718-4-git-send-email-u9012063@gmail.com>

On Tue, Oct 10, 2017 at 4:47 AM, William Tu <u9012063@gmail.com> wrote:
> The patch introduces ip_tunnel->ether_mtu fields to cache the value of
> dev->mtu + dev->hard_header_len.  This avoids the arithmetic operation
> on every packet.
>
> Signed-off-by: William Tu <u9012063@gmail.com>
> Cc: David Laight <David.Laight@aculab.com>
> ---
>  include/net/ip_tunnels.h | 1 +
>  net/ipv4/ip_gre.c        | 8 ++++----
>  net/ipv4/ip_tunnel.c     | 3 +++
>  3 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
> index b41a1e057fce..19565be26e13 100644
> --- a/include/net/ip_tunnels.h
> +++ b/include/net/ip_tunnels.h
> @@ -117,6 +117,7 @@ struct ip_tunnel {
>
>         /* This field used only by ERSPAN */
>         u32             index;          /* ERSPAN type II index */
> +       unsigned int    ether_mtu;      /* The mtu including the ether hdr */
>
>         struct dst_cache dst_cache;
ip_tunnel is a very common structure shared by various tunnels.
Adding ether_mtu to it just to avoid a single addition, and only for
erspan, is not worth it, I think.

>
> diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c
> index 6e6e4c4811cc..994b8ddea0b1 100644
> --- a/net/ipv4/ip_gre.c
> +++ b/net/ipv4/ip_gre.c
> @@ -578,8 +578,8 @@ static void erspan_fb_xmit(struct sk_buff *skb, struct net_device *dev,
>         if (gre_handle_offloads(skb, false))
>                 goto err_free_rt;
>
> -       if (skb->len > dev->mtu + dev->hard_header_len) {
> -               pskb_trim(skb, dev->mtu + dev->hard_header_len);
> +       if (skb->len > tunnel->ether_mtu) {
> +               pskb_trim(skb, tunnel->ether_mtu);
>                 truncate = true;
>         }
>
> @@ -730,8 +730,8 @@ static netdev_tx_t erspan_xmit(struct sk_buff *skb,
>         if (skb_cow_head(skb, dev->needed_headroom))
>                 goto free_skb;
>
> -       if (skb->len > dev->mtu + dev->hard_header_len) {
> -               pskb_trim(skb, dev->mtu + dev->hard_header_len);
> +       if (skb->len > tunnel->ether_mtu) {
> +               pskb_trim(skb, tunnel->ether_mtu);
>                 truncate = true;
>         }
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index fe6fee728ce4..859af5b86802 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -348,6 +348,7 @@ static int ip_tunnel_bind_dev(struct net_device *dev)
>
>         dev->needed_headroom = t_hlen + hlen;
>         mtu -= (dev->hard_header_len + t_hlen);
> +       tunnel->ether_mtu = mtu + dev->hard_header_len;
>
>         if (mtu < 68)
>                 mtu = 68;
> @@ -952,6 +953,7 @@ int __ip_tunnel_change_mtu(struct net_device *dev, int new_mtu, bool strict)
>         }
>
>         dev->mtu = new_mtu;
> +       tunnel->ether_mtu = new_mtu + dev->hard_header_len;
>         return 0;
>  }
>  EXPORT_SYMBOL_GPL(__ip_tunnel_change_mtu);
> @@ -1183,6 +1185,7 @@ int ip_tunnel_init(struct net_device *dev)
>
>         tunnel->dev = dev;
>         tunnel->net = dev_net(dev);
> +       tunnel->ether_mtu = dev->mtu + dev->hard_header_len;
>         strcpy(tunnel->parms.name, dev->name);
>         iph->version            = 4;
>         iph->ihl                = 5;
> --
> 2.7.4
>


* RE: [PATCH v2] xdp: Sample xdp program implementing ip forward
From: David Laight @ 2017-10-11 10:08 UTC (permalink / raw)
  To: 'Jesper Dangaard Brouer', Christina Jacob
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, Sunil.Goutham@cavium.com,
	daniel@iogearbox.net, dsahern@gmail.com, Christina Jacob
In-Reply-To: <20171010210607.2af452e2@redhat.com>

From: Jesper Dangaard Brouer
> Sent: 10 October 2017 20:06
...
> > +		int src_ip = 0, dest_ip = 0;
...
> > +			key4.b8[4] = dest_ip % 0x100;
> > +			key4.b8[5] = (dest_ip >> 8) % 0x100;
> > +			key4.b8[6] = (dest_ip >> 16) % 0x100;
> > +			key4.b8[7] = (dest_ip >> 24) % 0x100;

Do you really want signed remainders done here?

	David
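
To illustrate the concern: with a signed int, `%` keeps the sign of the
dividend, so the intermediate value can be negative and the compiler
generally cannot reduce the operation to a plain mask. With an unsigned
type the same extraction is a shift and an AND. A minimal sketch; the
helper names are hypothetical and not taken from the sample program:

```c
#include <stdint.h>

/* Signed form, as in the quoted code: the remainder keeps the sign of v. */
static int low_byte_signed(int v)
{
	return v % 0x100;
}

/* Unsigned form: the result is always in 0..255 and is a plain mask. */
static uint8_t low_byte_unsigned(uint32_t v)
{
	return (uint8_t)(v & 0xff);
}

/* Split a 32-bit address into bytes, least significant byte first. */
static void split_ip(uint32_t ip, uint8_t out[4])
{
	for (int i = 0; i < 4; i++)
		out[i] = (uint8_t)((ip >> (8 * i)) & 0xff);
}
```

Once the signed remainder is narrowed to a byte the stored value matches
the masked one, so the difference is mainly in the generated code and in
relying on right shifts of negative values, which are implementation-defined
in C.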


* Re: [PATCH 11/13] timer: Remove expires argument from __TIMER_INITIALIZER()
From: Petr Mladek @ 2017-10-11 10:15 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Andrew Morton, Arnd Bergmann,
	Benjamin Herrenschmidt, Chris Metcalf, Geert Uytterhoeven,
	Greg Kroah-Hartman, Guenter Roeck, Harish Patil, Heiko Carstens,
	James E.J. Bottomley, John Stultz, Julian Wiedmann, Kalle Valo,
	Lai Jiangshan, Len Brown, Manish Chopra, Mark Gross,
	"Martin K. Petersen"
In-Reply-To: <1507159627-127660-12-git-send-email-keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

On Wed 2017-10-04 16:27:05, Kees Cook wrote:
> The expires field is normally initialized during the first mod_timer()
> call. It was unused by all callers, so remove it from the macro.
> 
> Signed-off-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>
> ---
>  include/linux/kthread.h   | 2 +-
>  include/linux/timer.h     | 5 ++---
>  include/linux/workqueue.h | 2 +-
>  3 files changed, 4 insertions(+), 5 deletions(-)

I was primarily interested in the change in kthread.h, but the entire
patch is simple and looks fine:

Reviewed-by: Petr Mladek <pmladek-AlSwsSmVLrQ@public.gmane.org>

Best Regards,
Petr


* [PATCH net-next 2/3] net/mlx4_en: Obsolete call to generic write_desc in XDP xmit flow
From: Tariq Toukan @ 2017-10-11 10:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, kernel-team, Tariq Toukan
In-Reply-To: <1507717047-29255-1-git-send-email-tariqt@mellanox.com>

The function mlx4_en_tx_write_desc() is not optimized for the XDP xmit
flow. Use the relevant parts inline instead.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index f16774c9c347..ac7254e3f909 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -1090,7 +1090,9 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 			       struct mlx4_en_priv *priv, unsigned int length,
 			       int tx_ind, bool *doorbell_pending)
 {
-	union mlx4_wqe_qpn_vlan	qpn_vlan = {};
+	union mlx4_wqe_qpn_vlan qpn_vlan = {
+		.fence_size = MLX4_EN_XDP_TX_REAL_SZ,
+	};
 	struct mlx4_en_tx_desc *tx_desc;
 	struct mlx4_en_tx_info *tx_info;
 	struct mlx4_wqe_data_seg *data;
@@ -1140,7 +1142,6 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 	data->byte_count = cpu_to_be32(length);
 
 	/* tx completion can avoid cache line miss for common cases */
-	tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
 
 	op_own = cpu_to_be32(MLX4_OPCODE_SEND) |
 		((ring->prod & ring->size) ?
@@ -1151,10 +1152,16 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 
 	ring->prod += MLX4_EN_XDP_TX_NRTXBB;
 
-	qpn_vlan.fence_size = MLX4_EN_XDP_TX_REAL_SZ;
+	tx_desc->ctrl.qpn_vlan = qpn_vlan;
+	tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
+
+	/* Ensure new descriptor hits memory
+	 * before setting ownership of this descriptor to HW
+	 */
+	dma_wmb();
+	tx_desc->ctrl.owner_opcode = op_own;
+	ring->xmit_more++;
 
-	mlx4_en_tx_write_desc(ring, tx_desc, qpn_vlan, TXBB_SIZE, 0,
-			      op_own, false, false);
 	*doorbell_pending = true;
 
 	return NETDEV_TX_OK;
-- 
1.8.3.1
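
The ordering that the inlined code relies on (fill the descriptor, issue a
write barrier, then hand ownership to the hardware) can be sketched in
userspace with C11 atomics. This is only an analogy: dma_wmb() is a kernel
DMA barrier, whereas the release store below orders CPU-visible memory, and
the ex_ types are hypothetical rather than the mlx4 descriptor layout.

```c
#include <stdatomic.h>
#include <stdint.h>

#define EX_OWNER_SW 0u
#define EX_OWNER_HW 1u

/* Hypothetical descriptor: payload fields plus an ownership word. */
struct ex_desc {
	uint64_t addr;
	uint32_t byte_count;
	_Atomic uint32_t owner;
};

/* Producer: write the payload first, then publish ownership. */
static void ex_post_desc(struct ex_desc *d, uint64_t addr, uint32_t len)
{
	d->addr = addr;
	d->byte_count = len;
	/* Release store: the writes above become visible before owner flips. */
	atomic_store_explicit(&d->owner, EX_OWNER_HW, memory_order_release);
}

/* Consumer: read the payload only after observing ownership. */
static int ex_consume_desc(struct ex_desc *d, uint64_t *addr, uint32_t *len)
{
	if (atomic_load_explicit(&d->owner, memory_order_acquire) != EX_OWNER_HW)
		return 0;
	*addr = d->addr;
	*len = d->byte_count;
	return 1;
}
```

Writing the ownership word last is the point of the sketch: a consumer that
sees the new owner value is then guaranteed to see the completed payload.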


* [PATCH net-next 3/3] net/mlx4_en: XDP_TX, assign constant values of TX descs on ring creation
From: Tariq Toukan @ 2017-10-11 10:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, kernel-team, Tariq Toukan
In-Reply-To: <1507717047-29255-1-git-send-email-tariqt@mellanox.com>

In XDP_TX, some fields in tx_info and tx_desc are constants across
all entries of the different XDP_TX rings.
Assign values to these fields at ring creation time, rather than in
the data path.

Patchset performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.

XDP_TX packet rate:
------------------------------
Before    | After     | Gain |
13.7 Mpps | 14.0 Mpps | 2.2% |
------------------------------

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  1 +
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 38 ++++++++++++++++----------
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  2 ++
 3 files changed, 27 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index e4c7a80ef5a8..d611df2f274d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -1752,6 +1752,7 @@ int mlx4_en_start_port(struct net_device *dev)
 				mlx4_en_arm_cq(priv, cq);
 
 			} else {
+				mlx4_en_init_tx_xdp_ring_descs(priv, tx_ring);
 				mlx4_en_init_recycle_ring(priv, i);
 				/* XDP TX CQ should never be armed */
 			}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index ac7254e3f909..596445a4a241 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -1085,14 +1085,35 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 #define MLX4_EN_XDP_TX_REAL_SZ (((CTRL_SIZE + MLX4_EN_XDP_TX_NRTXBB * DS_SIZE) \
 				 / 16) & 0x3f)
 
+void mlx4_en_init_tx_xdp_ring_descs(struct mlx4_en_priv *priv,
+				    struct mlx4_en_tx_ring *ring)
+{
+	int i;
+
+	for (i = 0; i < ring->size; i++) {
+		struct mlx4_en_tx_info *tx_info = &ring->tx_info[i];
+		struct mlx4_en_tx_desc *tx_desc = ring->buf +
+			(i << LOG_TXBB_SIZE);
+
+		tx_info->map0_byte_count = PAGE_SIZE;
+		tx_info->nr_txbb = MLX4_EN_XDP_TX_NRTXBB;
+		tx_info->data_offset = offsetof(struct mlx4_en_tx_desc, data);
+		tx_info->ts_requested = 0;
+		tx_info->nr_maps = 1;
+		tx_info->linear = 1;
+		tx_info->inl = 0;
+
+		tx_desc->data.lkey = ring->mr_key;
+		tx_desc->ctrl.qpn_vlan.fence_size = MLX4_EN_XDP_TX_REAL_SZ;
+		tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
+	}
+}
+
 netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 			       struct mlx4_en_rx_alloc *frame,
 			       struct mlx4_en_priv *priv, unsigned int length,
 			       int tx_ind, bool *doorbell_pending)
 {
-	union mlx4_wqe_qpn_vlan qpn_vlan = {
-		.fence_size = MLX4_EN_XDP_TX_REAL_SZ,
-	};
 	struct mlx4_en_tx_desc *tx_desc;
 	struct mlx4_en_tx_info *tx_info;
 	struct mlx4_wqe_data_seg *data;
@@ -1124,20 +1145,12 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 	tx_info->page = frame->page;
 	frame->page = NULL;
 	tx_info->map0_dma = dma;
-	tx_info->map0_byte_count = PAGE_SIZE;
-	tx_info->nr_txbb = MLX4_EN_XDP_TX_NRTXBB;
 	tx_info->nr_bytes = max_t(unsigned int, length, ETH_ZLEN);
-	tx_info->data_offset = offsetof(struct mlx4_en_tx_desc, data);
-	tx_info->ts_requested = 0;
-	tx_info->nr_maps = 1;
-	tx_info->linear = 1;
-	tx_info->inl = 0;
 
 	dma_sync_single_range_for_device(priv->ddev, dma, frame->page_offset,
 					 length, PCI_DMA_TODEVICE);
 
 	data->addr = cpu_to_be64(dma + frame->page_offset);
-	data->lkey = ring->mr_key;
 	dma_wmb();
 	data->byte_count = cpu_to_be32(length);
 
@@ -1152,9 +1165,6 @@ netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 
 	ring->prod += MLX4_EN_XDP_TX_NRTXBB;
 
-	tx_desc->ctrl.qpn_vlan = qpn_vlan;
-	tx_desc->ctrl.srcrb_flags = priv->ctrl_flags;
-
 	/* Ensure new descriptor hits memory
 	 * before setting ownership of this descriptor to HW
 	 */
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 8cad9b4f1936..1856e279a7e0 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -705,6 +705,8 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
 			   int node, int queue_index);
 void mlx4_en_destroy_tx_ring(struct mlx4_en_priv *priv,
 			     struct mlx4_en_tx_ring **pring);
+void mlx4_en_init_tx_xdp_ring_descs(struct mlx4_en_priv *priv,
+				    struct mlx4_en_tx_ring *ring);
 int mlx4_en_activate_tx_ring(struct mlx4_en_priv *priv,
 			     struct mlx4_en_tx_ring *ring,
 			     int cq, int user_prio);
-- 
1.8.3.1
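
The idea of the patch, writing per-ring constants once at creation and only
per-packet fields in the data path, can be sketched as follows. The ex_
types and helpers are hypothetical and much simpler than the mlx4
structures. The sketch assumes a power-of-two ring size and that nothing
overwrites the constant fields between packets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TX slot: two constant fields, two per-packet fields. */
struct ex_slot {
	uint32_t lkey;       /* constant for the lifetime of the ring */
	uint32_t flags;      /* constant for the lifetime of the ring */
	uint64_t addr;       /* changes per packet */
	uint32_t byte_count; /* changes per packet */
};

/* Control path: write the constant fields once for every slot. */
static void ex_ring_init(struct ex_slot *ring, size_t size,
			 uint32_t lkey, uint32_t flags)
{
	for (size_t i = 0; i < size; i++) {
		ring[i].lkey = lkey;
		ring[i].flags = flags;
	}
}

/* Data path: touch only the fields that differ per packet. */
static void ex_ring_xmit(struct ex_slot *ring, size_t size, size_t *prod,
			 uint64_t addr, uint32_t len)
{
	struct ex_slot *s = &ring[*prod & (size - 1)];

	s->addr = addr;
	s->byte_count = len;
	(*prod)++;
}
```

The per-packet function then has fewer stores, which is the kind of saving
the packet rate table in the commit message is measuring.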


* [PATCH net-next 0/3] mlx4_en XDP TX improvements
From: Tariq Toukan @ 2017-10-11 10:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, kernel-team, Tariq Toukan

Hi Dave,

This patchset contains performance improvements
to the XDP_TX use case in the mlx4 Eth driver.

Patch 1 is a simple change in a function parameter type.
Patch 2 replaces a call to a generic function with the
  relevant parts inlined.
Patch 3 moves the write of descriptors' constant values
  from data path to control path.

Series generated against net-next commit:
833e0e2f24fd net: dst: move cpu inside ifdef to avoid compilation warning

Thanks,
Tariq.

Tariq Toukan (3):
  net/mlx4_en: Replace netdev parameter with priv in XDP xmit function
  net/mlx4_en: Obsolete call to generic write_desc in XDP xmit flow
  net/mlx4_en: XDP_TX, assign constant values of TX descs on ring
    creation

 drivers/net/ethernet/mellanox/mlx4/en_netdev.c |  1 +
 drivers/net/ethernet/mellanox/mlx4/en_rx.c     |  2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c     | 46 +++++++++++++++++---------
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h   |  4 ++-
 4 files changed, 36 insertions(+), 17 deletions(-)

-- 
1.8.3.1


* [PATCH net-next 1/3] net/mlx4_en: Replace netdev parameter with priv in XDP xmit function
From: Tariq Toukan @ 2017-10-11 10:17 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Eran Ben Elisha, kernel-team, Tariq Toukan
In-Reply-To: <1507717047-29255-1-git-send-email-tariqt@mellanox.com>

The struct net_device parameter was passed only to extract
struct mlx4_en_priv out of it.
Here we pass the priv parameter directly.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_rx.c   | 2 +-
 drivers/net/ethernet/mellanox/mlx4/en_tx.c   | 3 +--
 drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 2 +-
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index a7866954d106..92aec17f4b4d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -778,7 +778,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
 			case XDP_PASS:
 				break;
 			case XDP_TX:
-				if (likely(!mlx4_en_xmit_frame(ring, frags, dev,
+				if (likely(!mlx4_en_xmit_frame(ring, frags, priv,
 							length, cq_ring,
 							&doorbell_pending))) {
 					frags[0].page = NULL;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 2cc82dc07397..f16774c9c347 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -1087,10 +1087,9 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 
 netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 			       struct mlx4_en_rx_alloc *frame,
-			       struct net_device *dev, unsigned int length,
+			       struct mlx4_en_priv *priv, unsigned int length,
 			       int tx_ind, bool *doorbell_pending)
 {
-	struct mlx4_en_priv *priv = netdev_priv(dev);
 	union mlx4_wqe_qpn_vlan	qpn_vlan = {};
 	struct mlx4_en_tx_desc *tx_desc;
 	struct mlx4_en_tx_info *tx_info;
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 245e9ea09ab2..8cad9b4f1936 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -693,7 +693,7 @@ u16 mlx4_en_select_queue(struct net_device *dev, struct sk_buff *skb,
 netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev);
 netdev_tx_t mlx4_en_xmit_frame(struct mlx4_en_rx_ring *rx_ring,
 			       struct mlx4_en_rx_alloc *frame,
-			       struct net_device *dev, unsigned int length,
+			       struct mlx4_en_priv *priv, unsigned int length,
 			       int tx_ind, bool *doorbell_pending);
 void mlx4_en_xmit_doorbell(struct mlx4_en_tx_ring *ring);
 bool mlx4_en_rx_recycle(struct mlx4_en_rx_ring *ring,
-- 
1.8.3.1


* [PATCH][next] sctp: make array sctp_sched_ops static
From: Colin King @ 2017-10-11 10:17 UTC (permalink / raw)
  To: Vlad Yasevich, Neil Horman, David S . Miller, linux-sctp, netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The array sctp_sched_ops  is local to the source and
does not need to be in global scope, so make it static.

Cleans up sparse warning:
symbol 'sctp_sched_ops' was not declared. Should it be static?

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 net/sctp/stream_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
index 03513a9fa110..0b83ec51e43b 100644
--- a/net/sctp/stream_sched.c
+++ b/net/sctp/stream_sched.c
@@ -124,7 +124,7 @@ static struct sctp_sched_ops sctp_sched_fcfs = {
 extern struct sctp_sched_ops sctp_sched_prio;
 extern struct sctp_sched_ops sctp_sched_rr;

-struct sctp_sched_ops *sctp_sched_ops[] = {
+static struct sctp_sched_ops *sctp_sched_ops[] = {
 	&sctp_sched_fcfs,
 	&sctp_sched_prio,
 	&sctp_sched_rr,
-- 
2.14.1


* Re: [PATCH 12/13] kthread: Convert callback to use from_timer()
From: Petr Mladek @ 2017-10-11 10:20 UTC (permalink / raw)
  To: Kees Cook
  Cc: Thomas Gleixner, Andrew Morton, Tejun Heo, Oleg Nesterov,
	Arnd Bergmann, Benjamin Herrenschmidt, Chris Metcalf,
	Geert Uytterhoeven, Greg Kroah-Hartman, Guenter Roeck,
	Harish Patil, Heiko Carstens, James E.J. Bottomley, John Stultz,
	Julian Wiedmann, Kalle Valo, Lai Jiangshan, Len Brown,
	Manish Chopra
In-Reply-To: <1507159627-127660-13-git-send-email-keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

On Wed 2017-10-04 16:27:06, Kees Cook wrote:
> In preparation for unconditionally passing the struct timer_list pointer
> to all timer callbacks, switch kthread to use from_timer() and pass the
> timer pointer explicitly.
> 
> Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
> Cc: Petr Mladek <pmladek-IBi9RG/b67k@public.gmane.org>
> Cc: Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
> Cc: Thomas Gleixner <tglx-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
> Cc: Oleg Nesterov <oleg-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> Signed-off-by: Kees Cook <keescook-F7+t8E8rja9g9hUCZPvPmw@public.gmane.org>

Reviewed-by: Petr Mladek <pmladek-IBi9RG/b67k@public.gmane.org>

Best Regards,
Petr


* Re: [PATCH] rtl8xxxu: mark expected switch fall-throughs
From: Joe Perches @ 2017-10-11 10:20 UTC (permalink / raw)
  To: Gustavo A. R. Silva, Jes Sorensen, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel, Arnaldo Carvalho de Melo
In-Reply-To: <20171010193027.GA23108@embeddedor.com>

On Tue, 2017-10-10 at 14:30 -0500, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.

perhaps use Arnaldo's idea:

https://lkml.org/lkml/2017/2/9/845
https://lkml.org/lkml/2017/2/10/485


* [PATCH] [net-next]NFC: Convert timers to use timer_setup()
From: Allen Pais @ 2017-10-11 10:33 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA, Allen Pais

Switch to using the new timer_setup() and from_timer()
for net/nfc/*

Signed-off-by: Allen Pais <allen.pais-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
---
---
 net/nfc/core.c          |  8 +++-----
 net/nfc/hci/core.c      |  7 +++----
 net/nfc/hci/llc_shdlc.c | 23 +++++++++--------------
 net/nfc/llcp_core.c     | 14 ++++++--------
 4 files changed, 21 insertions(+), 31 deletions(-)

diff --git a/net/nfc/core.c b/net/nfc/core.c
index e5e23c2..56e5467 100644
--- a/net/nfc/core.c
+++ b/net/nfc/core.c
@@ -1015,9 +1015,9 @@ static void nfc_check_pres_work(struct work_struct *work)
 	device_unlock(&dev->dev);
 }
 
-static void nfc_check_pres_timeout(unsigned long data)
+static void nfc_check_pres_timeout(struct timer_list *t)
 {
-	struct nfc_dev *dev = (struct nfc_dev *)data;
+	struct nfc_dev *dev = from_timer(dev, t, check_pres_timer);
 
 	schedule_work(&dev->check_pres_work);
 }
@@ -1094,9 +1094,7 @@ struct nfc_dev *nfc_allocate_device(struct nfc_ops *ops,
 	dev->targets_generation = 1;
 
 	if (ops->check_presence) {
-		setup_timer(&dev->check_pres_timer, nfc_check_pres_timeout,
-			    (unsigned long)dev);
-
+		timer_setup(&dev->check_pres_timer, nfc_check_pres_timeout, 0);
 		INIT_WORK(&dev->check_pres_work, nfc_check_pres_work);
 	}
 
diff --git a/net/nfc/hci/core.c b/net/nfc/hci/core.c
index a8a6e78..ac8030c4 100644
--- a/net/nfc/hci/core.c
+++ b/net/nfc/hci/core.c
@@ -428,9 +428,9 @@ void nfc_hci_event_received(struct nfc_hci_dev *hdev, u8 pipe, u8 event,
 		nfc_hci_driver_failure(hdev, r);
 }
 
-static void nfc_hci_cmd_timeout(unsigned long data)
+static void nfc_hci_cmd_timeout(struct timer_list *t)
 {
-	struct nfc_hci_dev *hdev = (struct nfc_hci_dev *)data;
+	struct nfc_hci_dev *hdev = from_timer(hdev, t, cmd_timer);
 
 	schedule_work(&hdev->msg_tx_work);
 }
@@ -1004,8 +1004,7 @@ int nfc_hci_register_device(struct nfc_hci_dev *hdev)
 
 	INIT_WORK(&hdev->msg_tx_work, nfc_hci_msg_tx_work);
 
-	setup_timer(&hdev->cmd_timer, nfc_hci_cmd_timeout,
-		    (unsigned long)hdev);
+	timer_setup(&hdev->cmd_timer, nfc_hci_cmd_timeout, 0);
 
 	skb_queue_head_init(&hdev->rx_hcp_frags);
 
diff --git a/net/nfc/hci/llc_shdlc.c b/net/nfc/hci/llc_shdlc.c
index 58df37e..fe98893 100644
--- a/net/nfc/hci/llc_shdlc.c
+++ b/net/nfc/hci/llc_shdlc.c
@@ -580,27 +580,27 @@ static void llc_shdlc_handle_send_queue(struct llc_shdlc *shdlc)
 	}
 }
 
-static void llc_shdlc_connect_timeout(unsigned long data)
+static void llc_shdlc_connect_timeout(struct timer_list *t)
 {
-	struct llc_shdlc *shdlc = (struct llc_shdlc *)data;
+	struct llc_shdlc *shdlc = from_timer(shdlc, t, connect_timer);
 
 	pr_debug("\n");
 
 	schedule_work(&shdlc->sm_work);
 }
 
-static void llc_shdlc_t1_timeout(unsigned long data)
+static void llc_shdlc_t1_timeout(struct timer_list *t)
 {
-	struct llc_shdlc *shdlc = (struct llc_shdlc *)data;
+	struct llc_shdlc *shdlc = from_timer(shdlc, t, t1_timer);
 
 	pr_debug("SoftIRQ: need to send ack\n");
 
 	schedule_work(&shdlc->sm_work);
 }
 
-static void llc_shdlc_t2_timeout(unsigned long data)
+static void llc_shdlc_t2_timeout(struct timer_list *t)
 {
-	struct llc_shdlc *shdlc = (struct llc_shdlc *)data;
+	struct llc_shdlc *shdlc = from_timer(shdlc, t, t2_timer);
 
 	pr_debug("SoftIRQ: need to retransmit\n");
 
@@ -763,14 +763,9 @@ static void *llc_shdlc_init(struct nfc_hci_dev *hdev, xmit_to_drv_t xmit_to_drv,
 	mutex_init(&shdlc->state_mutex);
 	shdlc->state = SHDLC_DISCONNECTED;
 
-	setup_timer(&shdlc->connect_timer, llc_shdlc_connect_timeout,
-		    (unsigned long)shdlc);
-
-	setup_timer(&shdlc->t1_timer, llc_shdlc_t1_timeout,
-		    (unsigned long)shdlc);
-
-	setup_timer(&shdlc->t2_timer, llc_shdlc_t2_timeout,
-		    (unsigned long)shdlc);
+	timer_setup(&shdlc->connect_timer, llc_shdlc_connect_timeout, 0);
+	timer_setup(&shdlc->t1_timer, llc_shdlc_t1_timeout, 0);
+	timer_setup(&shdlc->t2_timer, llc_shdlc_t2_timeout, 0);
 
 	shdlc->w = SHDLC_MAX_WINDOW;
 	shdlc->srej_support = SHDLC_SREJ_SUPPORT;
diff --git a/net/nfc/llcp_core.c b/net/nfc/llcp_core.c
index 7988185..ef4026a 100644
--- a/net/nfc/llcp_core.c
+++ b/net/nfc/llcp_core.c
@@ -242,9 +242,9 @@ static void nfc_llcp_timeout_work(struct work_struct *work)
 	nfc_dep_link_down(local->dev);
 }
 
-static void nfc_llcp_symm_timer(unsigned long data)
+static void nfc_llcp_symm_timer(struct timer_list *t)
 {
-	struct nfc_llcp_local *local = (struct nfc_llcp_local *) data;
+	struct nfc_llcp_local *local = from_timer(local, t, link_timer);
 
 	pr_err("SYMM timeout\n");
 
@@ -285,9 +285,9 @@ static void nfc_llcp_sdreq_timeout_work(struct work_struct *work)
 		nfc_genl_llc_send_sdres(local->dev, &nl_sdres_list);
 }
 
-static void nfc_llcp_sdreq_timer(unsigned long data)
+static void nfc_llcp_sdreq_timer(struct timer_list *t)
 {
-	struct nfc_llcp_local *local = (struct nfc_llcp_local *) data;
+	struct nfc_llcp_local *local = from_timer(local, t, sdreq_timer);
 
 	schedule_work(&local->sdreq_timeout_work);
 }
@@ -1573,8 +1573,7 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
 	INIT_LIST_HEAD(&local->list);
 	kref_init(&local->ref);
 	mutex_init(&local->sdp_lock);
-	setup_timer(&local->link_timer, nfc_llcp_symm_timer,
-		    (unsigned long)local);
+	timer_setup(&local->link_timer, nfc_llcp_symm_timer, 0);
 
 	skb_queue_head_init(&local->tx_queue);
 	INIT_WORK(&local->tx_work, nfc_llcp_tx_work);
@@ -1600,8 +1599,7 @@ int nfc_llcp_register_device(struct nfc_dev *ndev)
 
 	mutex_init(&local->sdreq_lock);
 	INIT_HLIST_HEAD(&local->pending_sdreqs);
-	setup_timer(&local->sdreq_timer, nfc_llcp_sdreq_timer,
-		    (unsigned long)local);
+	timer_setup(&local->sdreq_timer, nfc_llcp_sdreq_timer, 0);
 	INIT_WORK(&local->sdreq_timeout_work, nfc_llcp_sdreq_timeout_work);
 
 	list_add(&local->list, &llcp_devices);
-- 
2.7.4
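
For readers unfamiliar with the conversion pattern: the callback now
receives the timer pointer and recovers the enclosing object from it, which
in the kernel is done by from_timer(), a wrapper around container_of(). A
minimal userspace sketch of the same idea; ex_timer, ex_dev and
ex_from_timer() are hypothetical stand-ins, not the kernel API:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct timer_list. */
struct ex_timer {
	void (*function)(struct ex_timer *t);
};

/* Hypothetical device embedding a timer, like nfc_dev embeds one. */
struct ex_dev {
	int id;
	int fired;
	struct ex_timer check_timer;
};

/* Recover the enclosing object from a pointer to its embedded member. */
#define ex_from_timer(type, ptr, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* New-style callback: takes the timer, derives the device from it. */
static void ex_timeout(struct ex_timer *t)
{
	struct ex_dev *dev = ex_from_timer(struct ex_dev, t, check_timer);

	dev->fired++;
}

/* New-style setup: no separate data cookie is stored. */
static void ex_timer_setup(struct ex_timer *t,
			   void (*fn)(struct ex_timer *))
{
	t->function = fn;
}
```

Because the object is derived from the timer's address, the
`(unsigned long)` cast of the context pointer seen in the removed lines is
no longer needed.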


* Re: [PATCHv4 iproute2 2/2] lib/libnetlink: update rtnl_talk to support malloc buff at run time
From: Hangbin Liu @ 2017-10-11 10:40 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Michal Kubecek, Phil Sutter, netdev, Hangbin Liu
In-Reply-To: <20171010094743.6ae2baa8@shemminger-XPS-13-9360>

Hi Stephen,
On Tue, Oct 10, 2017 at 09:47:43AM -0700, Stephen Hemminger wrote:
> > Agreed. Current code is based on the assumption that we can estimate the
> > maximum reply length in advance and the reason for this series is that
> > this assumption turned out to be wrong. I'm afraid that if we replace
> > it by an assumption that we can estimate the maximum reply length for
> > most requests with only few exceptions, it's only matter of time for us
> > to be proven wrong again.
> > 
> > Michal Kubecek
> > 
> 
> For query responses, yes the response may be large. But for the common cases of
> add address or add route, the response should just be ack or error.

I tried listing 10 NIC links with the ip command.

With unpatched ip cmd:
# time for i in `seq 100000`; do ip link show &> /dev/null; done

real    5m14.591s
user    0m58.134s
sys     4m21.104s


With patched ip cmd:
# time for i in `seq 100000`; do ./ip link show &> /dev/null; done

real    4m48.579s
user    0m8.570s
sys     4m43.460s


Then I tested adding 9,900 addresses (99 VLANs x 100 addresses each) via a script:
# cat add_addr.sh
#!/bin/bash
dev=$1
for vid in $(seq 99); do
        ip link add link $dev name ${dev}.$vid type vlan id $vid
        ip link set ${dev}.$vid up
        for n in $(seq 100); do
                ip addr add 20$vid::$n dev ${dev}.$vid
        done
done

With unpatched ip cmd:
# time ./add_addr.sh p7p1

real    0m13.456s
user    0m2.551s
sys     0m11.106s


With patched ip cmd:
# time ./add_addr.sh p7p1

real    0m13.700s
user    0m2.827s
sys     0m11.148s


The results don't differ much and look good. I also wonder whether adding
thousands of addresses is a common case.

Thanks
Hangbin
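
For context on sizing the reply buffer at run time instead of estimating it
in advance: on Linux, one way to do this is to peek at the pending datagram
with MSG_PEEK | MSG_TRUNC, which returns its real length, and then allocate
exactly that much. The sketch below shows the technique on a generic
datagram socket. recv_whole_datagram() is a hypothetical helper and is not
claimed to be how rtnl_talk() is implemented in this series.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Receive the next datagram into a freshly allocated buffer.
 * Returns the buffer (caller frees) and stores its length in *len,
 * or returns NULL on error.
 */
static char *recv_whole_datagram(int fd, ssize_t *len)
{
	char probe;
	ssize_t need, got;
	char *buf;

	/* Peek without consuming; MSG_TRUNC makes recv() report the full size. */
	need = recv(fd, &probe, sizeof(probe), MSG_PEEK | MSG_TRUNC);
	if (need < 0)
		return NULL;

	buf = malloc(need > 0 ? (size_t)need : 1);
	if (!buf)
		return NULL;

	/* Second call consumes the datagram into the right-sized buffer. */
	got = recv(fd, buf, (size_t)need, 0);
	if (got < 0) {
		free(buf);
		return NULL;
	}
	*len = got;
	return buf;
}
```

The cost is one extra system call per message, which is the kind of
overhead the timings above are meant to check.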


* Re: [PATCH][next] sctp: make array sctp_sched_ops static
From: Joe Perches @ 2017-10-11 10:44 UTC (permalink / raw)
  To: Colin King, Vlad Yasevich, Neil Horman, David S . Miller,
	linux-sctp, netdev
  Cc: kernel-janitors, linux-kernel
In-Reply-To: <20171011101757.18825-1-colin.king@canonical.com>

On Wed, 2017-10-11 at 11:17 +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> The array sctp_sched_ops  is local to the source and
> does not need to be in global scope, so make it static.
> 
> Cleans up sparse warning:
> symbol 'sctp_sched_ops' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  net/sctp/stream_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
> index 03513a9fa110..0b83ec51e43b 100644
> --- a/net/sctp/stream_sched.c
> +++ b/net/sctp/stream_sched.c
> @@ -124,7 +124,7 @@ static struct sctp_sched_ops sctp_sched_fcfs = {
>  extern struct sctp_sched_ops sctp_sched_prio;
>  extern struct sctp_sched_ops sctp_sched_rr;
>  
> -struct sctp_sched_ops *sctp_sched_ops[] = {
> +static struct sctp_sched_ops *sctp_sched_ops[] = {
>  	&sctp_sched_fcfs,
>  	&sctp_sched_prio,
>  	&sctp_sched_rr,

Perhaps these should also be const to move more data to text
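
As a minimal illustration of the suggestion: when both the ops objects and
the table of pointers to them are const-qualified, a toolchain can
typically place them in a read-only section. The ex_ names below are
hypothetical and unrelated to the real sctp_sched_ops layout.

```c
#include <stddef.h>

/* Hypothetical ops structure with one callback. */
struct ex_sched_ops {
	const char *name;
	int (*weight)(int qlen);
};

static int ex_fcfs_weight(int qlen) { return qlen; }
static int ex_prio_weight(int qlen) { return 2 * qlen; }

/* The ops objects themselves are const. */
static const struct ex_sched_ops ex_sched_fcfs = { "fcfs", ex_fcfs_weight };
static const struct ex_sched_ops ex_sched_prio = { "prio", ex_prio_weight };

/* Const pointers to const ops: neither the table nor its targets change. */
static const struct ex_sched_ops *const ex_sched_table[] = {
	&ex_sched_fcfs,
	&ex_sched_prio,
};

/* Callers receive a pointer to const, as in the changed prototypes. */
static const struct ex_sched_ops *ex_sched_lookup(size_t idx)
{
	if (idx >= sizeof(ex_sched_table) / sizeof(ex_sched_table[0]))
		return NULL;
	return ex_sched_table[idx];
}
```

Every user of the table then has to hold a `const struct ... *`, which is
why such a change touches the headers and callers as well.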
---
 include/net/sctp/stream_sched.h |  3 ++-
 include/net/sctp/structs.h      |  2 +-
 net/sctp/stream.c               |  6 +++---
 net/sctp/stream_sched.c         | 17 +++++++++--------
 4 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/include/net/sctp/stream_sched.h b/include/net/sctp/stream_sched.h
index c676550a4c7d..431235c7587a 100644
--- a/include/net/sctp/stream_sched.h
+++ b/include/net/sctp/stream_sched.h
@@ -67,6 +67,7 @@ void sctp_sched_dequeue_done(struct sctp_outq *q, struct sctp_chunk *ch);
 
 void sctp_sched_dequeue_common(struct sctp_outq *q, struct sctp_chunk *ch);
 int sctp_sched_init_sid(struct sctp_stream *stream, __u16 sid, gfp_t gfp);
-struct sctp_sched_ops *sctp_sched_ops_from_stream(struct sctp_stream *stream);
+const struct sctp_sched_ops *
+sctp_sched_ops_from_stream(struct sctp_stream *stream);
 
 #endif /* __sctp_stream_sched_h__ */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 6168e3449131..032ec5618e8a 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -1028,7 +1028,7 @@ struct sctp_outq {
 	struct list_head out_chunk_list;
 
 	/* Stream scheduler being used */
-	struct sctp_sched_ops *sched;
+	const struct sctp_sched_ops *sched;
 
 	unsigned int out_qlen;	/* Total length of queued data chunks. */
 
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 5ea33a2c453b..62678c2229e4 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -140,7 +140,7 @@ static int sctp_stream_alloc_in(struct sctp_stream *stream, __u16 incnt,
 int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 		     gfp_t gfp)
 {
-	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+	const struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
 	int i, ret = 0;
 
 	gfp |= __GFP_NOWARN;
@@ -201,7 +201,7 @@ int sctp_stream_init_ext(struct sctp_stream *stream, __u16 sid)
 
 void sctp_stream_free(struct sctp_stream *stream)
 {
-	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+	const struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
 	int i;
 
 	sched->free(stream);
@@ -224,7 +224,7 @@ void sctp_stream_clear(struct sctp_stream *stream)
 
 void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new)
 {
-	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+	const struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
 
 	sched->unsched_all(stream);
 	sctp_stream_outq_migrate(stream, new, new->outcnt);
diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
index 03513a9fa110..642542be57ec 100644
--- a/net/sctp/stream_sched.c
+++ b/net/sctp/stream_sched.c
@@ -106,7 +106,7 @@ static void sctp_sched_fcfs_unsched_all(struct sctp_stream *stream)
 {
 }
 
-static struct sctp_sched_ops sctp_sched_fcfs = {
+static const struct sctp_sched_ops sctp_sched_fcfs = {
 	.set = sctp_sched_fcfs_set,
 	.get = sctp_sched_fcfs_get,
 	.init = sctp_sched_fcfs_init,
@@ -121,10 +121,10 @@ static struct sctp_sched_ops sctp_sched_fcfs = {
 
 /* API to other parts of the stack */
 
-extern struct sctp_sched_ops sctp_sched_prio;
-extern struct sctp_sched_ops sctp_sched_rr;
+extern const struct sctp_sched_ops sctp_sched_prio;
+extern const struct sctp_sched_ops sctp_sched_rr;
 
-struct sctp_sched_ops *sctp_sched_ops[] = {
+static const struct sctp_sched_ops *sctp_sched_ops[] = {
 	&sctp_sched_fcfs,
 	&sctp_sched_prio,
 	&sctp_sched_rr,
@@ -133,8 +133,8 @@ struct sctp_sched_ops *sctp_sched_ops[] = {
 int sctp_sched_set_sched(struct sctp_association *asoc,
 			 enum sctp_sched_type sched)
 {
-	struct sctp_sched_ops *n = sctp_sched_ops[sched];
-	struct sctp_sched_ops *old = asoc->outqueue.sched;
+	const struct sctp_sched_ops *n = sctp_sched_ops[sched];
+	const struct sctp_sched_ops *old = asoc->outqueue.sched;
 	struct sctp_datamsg *msg = NULL;
 	struct sctp_chunk *ch;
 	int i, ret = 0;
@@ -259,13 +259,14 @@ void sctp_sched_dequeue_common(struct sctp_outq *q, struct sctp_chunk *ch)
 
 int sctp_sched_init_sid(struct sctp_stream *stream, __u16 sid, gfp_t gfp)
 {
-	struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
+	const struct sctp_sched_ops *sched = sctp_sched_ops_from_stream(stream);
 
 	INIT_LIST_HEAD(&stream->out[sid].ext->outq);
 	return sched->init_sid(stream, sid, gfp);
 }
 
-struct sctp_sched_ops *sctp_sched_ops_from_stream(struct sctp_stream *stream)
+const struct sctp_sched_ops *
+sctp_sched_ops_from_stream(struct sctp_stream *stream)
 {
 	struct sctp_association *asoc;
 

^ permalink raw reply related

* Re: [net-next 2/3] ip_gre: fix erspan tunnel mtu calculation
From: Xin Long @ 2017-10-11 10:46 UTC (permalink / raw)
  To: William Tu; +Cc: davem, Linux Kernel Network Developers
In-Reply-To: <CALDO+SZhdSXQB-n_o006_fOGnmz1j99uJBAOVixD0YCFgKnVFw@mail.gmail.com>

On Tue, Oct 10, 2017 at 8:59 PM, William Tu <u9012063@gmail.com> wrote:
>>> @@ -1242,14 +1241,14 @@ static int erspan_tunnel_init(struct net_device *dev)
>>>         struct ip_tunnel *tunnel = netdev_priv(dev);
>>>         int t_hlen;
>>>
>>> -       tunnel->tun_hlen = 8;
>>> +       tunnel->tun_hlen = ERSPAN_GREHDR_LEN;
>>>         tunnel->parms.iph.protocol = IPPROTO_GRE;
>>>         tunnel->hlen = tunnel->tun_hlen + tunnel->encap_hlen +
>>>                        sizeof(struct erspanhdr);
>>>         t_hlen = tunnel->hlen + sizeof(struct iphdr);
>>>
>>> -       dev->needed_headroom = LL_MAX_HEADER + t_hlen + 4;
>>> -       dev->mtu = ETH_DATA_LEN - t_hlen - 4;
>>> +       dev->needed_headroom = LL_MAX_HEADER + t_hlen;
>>> +       dev->mtu = ETH_DATA_LEN - t_hlen;
>> 1. I guess '+4-4' stuff was copied from __gre_tunnel_init(), I'm thinking
>> it may be there for some reason.
>>
> I traced back to
> 4565e9919cda ("gre: Setup and TX path for gre/UDP foo-over-udp encapsulation")
> and I think '+4-4' is there for GRE base header length.
>
> Since now we do
>     dev->mtu = ETH_DATA_LEN - t_hlen;
> and t_hlen already counts the gre base header + optional header
> len, I think it's not needed.

okay. thanks.

>
>> 2. 'dev->needed_headroom =' and 'dev->mtu =' are really needed ?
>> As I've seen both will be updated in .newlink:
>> ipgre_newlink() -> ip_tunnel_newlink() -> ip_tunnel_bind_dev()
>>
> right, I also find both values gets overwritten by
> ip_tunnel_bind_dev() using my test cases. Maybe we can remove them?

It's there just in case no lower dev is found, but
ip_tunnel_newlink/create always updates dev->mtu even if
no lower dev is found.

Let's leave it as it is for now; ipgre may just not be sure that ip_tunnel_xxx
would do it when no lower dev is found.
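
To make the header arithmetic discussed above concrete, here is a small
sketch. The value 8 for ERSPAN_GREHDR_LEN comes from the quoted diff; the
ERSPAN header size, the IPv4 header size without options, and the helper names
erspan_t_hlen() and erspan_mtu() are assumptions for illustration only.

```c
/* Sizes in bytes. */
#define ETH_DATA_LEN      1500 /* standard Ethernet payload size */
#define IPV4_HDR_LEN        20 /* assumed: struct iphdr without options */
#define ERSPAN_GREHDR_LEN    8 /* replaces the literal 8 in the quoted diff */
#define ERSPAN_HDR_LEN       8 /* assumed sizeof(struct erspanhdr) */

/* Total tunnel header length: GRE part + optional encap + ERSPAN + outer IPv4. */
static int erspan_t_hlen(int encap_hlen)
{
	int hlen = ERSPAN_GREHDR_LEN + encap_hlen + ERSPAN_HDR_LEN;

	return hlen + IPV4_HDR_LEN;
}

/* MTU left for the inner packet once every tunnel header is accounted for. */
static int erspan_mtu(int encap_hlen)
{
	return ETH_DATA_LEN - erspan_t_hlen(encap_hlen);
}
```

Under these assumptions and with no extra encapsulation, t_hlen is 36 and the
MTU is 1464; the old code with the additional "- 4" would have given 1460.
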

^ permalink raw reply

* [PATCH][bpf-next] bpf: remove redundant variable old_flags
From: Colin King @ 2017-10-11 10:56 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Variable old_flags is being assigned but is never read; it is redundant
and can be removed.

Cleans up clang warning: Value stored to 'old_flags' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 kernel/bpf/cgroup.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index e88abc0865d5..3db5a17fcfe8 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -192,7 +192,6 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
 	struct cgroup_subsys_state *css;
 	struct bpf_prog_list *pl;
 	bool pl_was_allocated;
-	u32 old_flags;
 	int err;
 
 	if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
@@ -239,7 +238,6 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
 		pl->prog = prog;
 	}
 
-	old_flags = cgrp->bpf.flags[type];
 	cgrp->bpf.flags[type] = flags;
 
 	/* allocate and recompute effective prog arrays */
-- 
2.14.1

^ permalink raw reply related

* Re: [PATCHv4 iproute2 2/2] lib/libnetlink: update rtnl_talk to support malloc buff at run time
From: Phil Sutter @ 2017-10-11 11:10 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Michal Kubecek, Hangbin Liu, netdev, Hangbin Liu
In-Reply-To: <20171010094743.6ae2baa8@shemminger-XPS-13-9360>

On Tue, Oct 10, 2017 at 09:47:43AM -0700, Stephen Hemminger wrote:
> On Tue, 10 Oct 2017 08:41:17 +0200
> Michal Kubecek <mkubecek@suse.cz> wrote:
> 
> > On Mon, Oct 09, 2017 at 10:25:25PM +0200, Phil Sutter wrote:
> > > Hi Stephen,
> > > 
> > > On Mon, Oct 02, 2017 at 10:37:08AM -0700, Stephen Hemminger wrote:  
> > > > On Thu, 28 Sep 2017 21:33:46 +0800
> > > > Hangbin Liu <haliu@redhat.com> wrote:
> > > >   
> > > > > From: Hangbin Liu <liuhangbin@gmail.com>
> > > > > 
> > > > > This is an update for 460c03f3f3cc ("iplink: double the buffer size also in
> > > > > iplink_get()"). After update, we will not need to double the buffer size
> > > > > every time when VFs number increased.
> > > > > 
> > > > > With call like rtnl_talk(&rth, &req.n, NULL, 0), we can simply remove the
> > > > > length parameter.
> > > > > 
> > > > > With call like rtnl_talk(&rth, nlh, nlh, sizeof(req)), I add a new variable
> > > > > answer to avoid overwriting data in nlh, because it may have more info after
> > > > > nlh. This also avoids the nlh buffer being too small.
> > > > > 
> > > > > We need to free answer after using.
> > > > > 
> > > > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> > > > > Signed-off-by: Phil Sutter <phil@nwl.cc>
> > > > > ---  
> > > > 
> > > > Most of the uses of rtnl_talk() don't need to this peek and dynamic sizing.
> > > > Can only those places that need that be targeted?  
> > > 
> > > We could probably do that, by having a buffer on stack in __rtnl_talk()
> > > which will be used instead of the allocated one if 'answer' is NULL. Or
> > > maybe even introduce a dedicated API call for the dynamically allocated
> > > receive buffer. But I really doubt that's feasible: AFAICT, that stack
> > > buffer still needs to be reasonably sized since the reply might be
> > > larger than the request (reusing the request buffer would be the most
> > > simple way to tackle this), also there is support for extack which may
> > > bloat the response to arbitrary size. Hangbin has shown in his benchmark
> > > that the overhead of the second syscall is negligible, so why care about
> > > that and increase code complexity even further?
> > > 
> > > Not saying it's not possible, but I just doubt it's worth the effort.  
> > 
> > Agreed. Current code is based on the assumption that we can estimate the
> > maximum reply length in advance and the reason for this series is that
> > this assumption turned out to be wrong. I'm afraid that if we replace
> > it by an assumption that we can estimate the maximum reply length for
> > most requests with only few exceptions, it's only matter of time for us
> > to be proven wrong again.
> > 
> > Michal Kubecek
> > 
> 
> For query responses, yes the response may be large. But for the common cases of
> add address or add route, the response should just be ack or error.

And with extack, error is comprised of the original request plus an
arbitrarily sized error message, so we can't just reuse the request
buffer and are back to "guessing" the right length again.

To get an idea of what we're talking about, I wrote a simple benchmark
which adds 256 * 254 (= 65024) addresses to an interface, then removes
them again one by one and measured the time that takes for binaries with
and without Hangbin's patches:

OP	Vanilla		Hangbin		Delta
--------------------------------------------------------
add	real 2m16.244s	real 2m27.964s	+11.72s	(108.6%)
	user 0m15.241s	user 0m17.295s	+2.054s	(113.5%)
	sys  1m40.229s	sys  1m48.239s	+8.01s	(108.0%)

remove	real 1m44.950s	real 1m47.044s	+2.094s	(102.0%)
	user 0m13.899s	user 0m14.723s	+0.824s (105.9%)
	sys  1m30.798s	sys  1m31.938s	+1.140s (101.3%)

So the overhead of the second syscall and dynamic memory allocation is
less than 10% overall. Given the short time a single call to 'ip'
typically takes, I don't think the difference is noticeable even in
highly performance critical applications.
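
For readers unfamiliar with the "peek first, then allocate" pattern being
measured here, the following is a minimal sketch, assuming Linux semantics
where MSG_PEEK | MSG_TRUNC on a datagram-style socket reports the full message
length. The helper name recv_alloc() is hypothetical, and this is not the
libnetlink code from the patch.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Receive one whole datagram into a freshly allocated buffer.
 * Returns the datagram length and stores the buffer in *answer
 * (the caller frees it), or returns -1 on error.
 */
static ssize_t recv_alloc(int fd, char **answer)
{
	char probe;
	ssize_t len, got;
	char *buf;

	/* First syscall: peek without consuming; MSG_TRUNC reports the real length. */
	len = recv(fd, &probe, 1, MSG_PEEK | MSG_TRUNC);
	if (len < 0)
		return -1;

	/* Allocate exactly what is needed (at least one byte). */
	buf = malloc(len ? (size_t)len : 1);
	if (!buf)
		return -1;

	/* Second syscall: consume the datagram into the right-sized buffer. */
	got = recv(fd, buf, (size_t)len, 0);
	if (got < 0) {
		free(buf);
		return -1;
	}

	*answer = buf;
	return got;
}
```

The extra recv() and the malloc()/free() pair per message are the overhead the
numbers above are trying to quantify.
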

Cheers, Phil

^ permalink raw reply

* Re: [PATCH][next] sctp: make array sctp_sched_ops static
From: Neil Horman @ 2017-10-11 11:39 UTC (permalink / raw)
  To: Colin King
  Cc: Vlad Yasevich, David S . Miller, linux-sctp, netdev,
	kernel-janitors, linux-kernel
In-Reply-To: <20171011101757.18825-1-colin.king@canonical.com>

On Wed, Oct 11, 2017 at 11:17:57AM +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> The array sctp_sched_ops  is local to the source and
> does not need to be in global scope, so make it static.
> 
> Cleans up sparse warning:
> symbol 'sctp_sched_ops' was not declared. Should it be static?
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  net/sctp/stream_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/sctp/stream_sched.c b/net/sctp/stream_sched.c
> index 03513a9fa110..0b83ec51e43b 100644
> --- a/net/sctp/stream_sched.c
> +++ b/net/sctp/stream_sched.c
> @@ -124,7 +124,7 @@ static struct sctp_sched_ops sctp_sched_fcfs = {
>  extern struct sctp_sched_ops sctp_sched_prio;
>  extern struct sctp_sched_ops sctp_sched_rr;
>  
> -struct sctp_sched_ops *sctp_sched_ops[] = {
> +static struct sctp_sched_ops *sctp_sched_ops[] = {
>  	&sctp_sched_fcfs,
>  	&sctp_sched_prio,
>  	&sctp_sched_rr,
> -- 
> 2.14.1
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* [PATCH v2 net 0/2] net/smc: ib_query_gid() patches
From: Ursula Braun @ 2017-10-11 11:47 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
	linux-s390-u79uwXL29TY76Z2rM5mHXA,
	jwi-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	schwidefsky-tA70FqPdS9bQT0dZR+AlfA,
	heiko.carstens-tA70FqPdS9bQT0dZR+AlfA,
	raspl-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	ubraun-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	parav-VPRAkNaXOzVWk0Htik3J/w

Dave,

Triggered by Parav Pandit, here are two cleanup patches for the use of
ib_query_gid() in the SMC code.

Thanks, Ursula

v2 changes advised by Parav Pandit:
   extra check is_vlan_dev() in patch 2/2
   "RoCE" spelling
   added "Reported-by"
   added "Reviewed-by"
   added "Fixes"

Ursula Braun (2):
  net/smc: replace function pointer get_netdev()
  net/smc: dev_put for netdev after usage of ib_query_gid()

 net/smc/smc_core.c | 12 ++++++++----
 net/smc/smc_ib.c   | 26 +++++++++-----------------
 2 files changed, 17 insertions(+), 21 deletions(-)

-- 
2.13.5

^ permalink raw reply

* [PATCH v2 net 1/2] net/smc: replace function pointer get_netdev()
From: Ursula Braun @ 2017-10-11 11:47 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
	raspl, ubraun, parav
In-Reply-To: <20171011114723.30733-1-ubraun@linux.vnet.ibm.com>

SMC should not open code the function pointer get_netdev of the
IB device. Replacing ib_query_gid(..., NULL) with
ib_query_gid(..., gid_attr) allows access to the netdev.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Suggested-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
---
 net/smc/smc_ib.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/net/smc/smc_ib.c b/net/smc/smc_ib.c
index 0b5852299158..468e1d725d97 100644
--- a/net/smc/smc_ib.c
+++ b/net/smc/smc_ib.c
@@ -369,26 +369,17 @@ void smc_ib_buf_unmap_sg(struct smc_ib_device *smcibdev,
 
 static int smc_ib_fill_gid_and_mac(struct smc_ib_device *smcibdev, u8 ibport)
 {
-	struct net_device *ndev;
+	struct ib_gid_attr gattr;
 	int rc;
 
 	rc = ib_query_gid(smcibdev->ibdev, ibport, 0,
-			  &smcibdev->gid[ibport - 1], NULL);
-	/* the SMC protocol requires specification of the roce MAC address;
-	 * if net_device cannot be determined, it can be derived from gid 0
-	 */
-	ndev = smcibdev->ibdev->get_netdev(smcibdev->ibdev, ibport);
-	if (ndev) {
-		memcpy(&smcibdev->mac, ndev->dev_addr, ETH_ALEN);
-		dev_put(ndev);
-	} else if (!rc) {
-		memcpy(&smcibdev->mac[ibport - 1][0],
-		       &smcibdev->gid[ibport - 1].raw[8], 3);
-		memcpy(&smcibdev->mac[ibport - 1][3],
-		       &smcibdev->gid[ibport - 1].raw[13], 3);
-		smcibdev->mac[ibport - 1][0] &= ~0x02;
-	}
-	return rc;
+			  &smcibdev->gid[ibport - 1], &gattr);
+	if (rc || !gattr.ndev)
+		return -ENODEV;
+
+	memcpy(smcibdev->mac[ibport - 1], gattr.ndev->dev_addr, ETH_ALEN);
+	dev_put(gattr.ndev);
+	return 0;
 }
 
 /* Create an identifier unique for this instance of SMC-R.
@@ -419,6 +410,7 @@ int smc_ib_remember_port_attr(struct smc_ib_device *smcibdev, u8 ibport)
 			   &smcibdev->pattr[ibport - 1]);
 	if (rc)
 		goto out;
+	/* the SMC protocol requires specification of the RoCE MAC address */
 	rc = smc_ib_fill_gid_and_mac(smcibdev, ibport);
 	if (rc)
 		goto out;
-- 
2.13.5

^ permalink raw reply related

* [PATCH v2 net 2/2] net/smc: dev_put for netdev after usage of ib_query_gid()
From: Ursula Braun @ 2017-10-11 11:47 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-rdma, linux-s390, jwi, schwidefsky, heiko.carstens,
	raspl, ubraun, parav
In-Reply-To: <20171011114723.30733-1-ubraun@linux.vnet.ibm.com>

For RoCE devices, ib_query_gid() takes a reference on the net_device.
The caller must drop this reference.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Reported-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Fixes: 0cfdd8f92cac ("smc: connection and link group creation")
---
 net/smc/smc_core.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 20b66e79c5d6..5f6a20084157 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -380,10 +380,14 @@ static int smc_link_determine_gid(struct smc_link_group *lgr)
 		if (ib_query_gid(lnk->smcibdev->ibdev, lnk->ibport, i, &gid,
 				 &gattr))
 			continue;
-		if (gattr.ndev &&
-		    (vlan_dev_vlan_id(gattr.ndev) == lgr->vlan_id)) {
-			lnk->gid = gid;
-			return 0;
+		if (gattr.ndev) {
+			if (is_vlan_dev(gattr.ndev) &&
+			    vlan_dev_vlan_id(gattr.ndev) == lgr->vlan_id) {
+				lnk->gid = gid;
+				dev_put(gattr.ndev);
+				return 0;
+			}
+			dev_put(gattr.ndev);
 		}
 	}
 	return -ENODEV;
-- 
2.13.5
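
The rule this patch enforces (every reference handed out by a query must be
dropped on every path, match or no match) can be sketched with a toy userspace
model. All names here (struct toy_dev, toy_query(), toy_put(),
toy_find_vlan()) are hypothetical and only mimic the shape of the kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted net_device. */
struct toy_dev {
	int refcnt;
	int vlan_id;
};

/* Like a query that hands back a device with an extra reference held. */
static struct toy_dev *toy_query(struct toy_dev *devs, size_t i)
{
	devs[i].refcnt++;
	return &devs[i];
}

/* Counterpart of dev_put(): drop the reference taken by toy_query(). */
static void toy_put(struct toy_dev *dev)
{
	dev->refcnt--;
}

/* Return the index of the first device on vlan_id, or -1 if none matches. */
static int toy_find_vlan(struct toy_dev *devs, size_t n, int vlan_id)
{
	for (size_t i = 0; i < n; i++) {
		struct toy_dev *dev = toy_query(devs, i);
		int match = dev->vlan_id == vlan_id;

		/* Drop the reference on both the match and the no-match path. */
		toy_put(dev);
		if (match)
			return (int)i;
	}
	return -1;
}
```

After any number of lookups, every refcnt in the array is back at its starting
value, which is the invariant the missing dev_put() calls were violating.
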

^ permalink raw reply related

* Re: ipsec: Fix dst leak in xfrm_bundle_create().
From: Steffen Klassert @ 2017-10-11 11:54 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20171010.205938.2305003344238226004.davem@davemloft.net>

On Tue, Oct 10, 2017 at 08:59:38PM -0700, David Miller wrote:
> 
> If we cannot find a suitable inner_mode value, we will leak
> the currently allocated 'xdst'.
> 
> The fix is to make sure it is linked into the chain before
> erroring out.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> 
> Steffen, I found this via visual inspection.  Please double check my
> work before applying this :-)

Looks good. Patch applied, thanks David!

^ permalink raw reply

* Re: [PATCH net-next 1/2] net sched act_vlan: Change stats update to use per-core stats
From: Jamal Hadi Salim @ 2017-10-11 12:27 UTC (permalink / raw)
  To: Manish Kurup, xiyou.wangcong, jiri, davem, netdev
  Cc: aring, mrv, manish.kurup
In-Reply-To: <1507689147-22941-1-git-send-email-manish.kurup@verizon.com>

minus lk

On 17-10-10 10:32 PM, Manish Kurup wrote:
> The VLAN action maintains one set of stats across all cores, and uses a
> spinlock to synchronize updates to it from the same. Changed this to use a
> per-CPU stats context instead.
> This change will result in better performance.
> 
> Signed-off-by: Manish Kurup <manish.kurup@verizon.com>
> ---
>   net/sched/act_vlan.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/net/sched/act_vlan.c b/net/sched/act_vlan.c
> index 16eb067..14c262c 100644
> --- a/net/sched/act_vlan.c
> +++ b/net/sched/act_vlan.c
> @@ -30,9 +30,10 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
>   	int err;
>   	u16 tci;
>   
> -	spin_lock(&v->tcf_lock);
>   	tcf_lastuse_update(&v->tcf_tm);
> -	bstats_update(&v->tcf_bstats, skb);
> +	bstats_cpu_update(this_cpu_ptr(v->common.cpu_bstats), skb);
> +
> +	spin_lock(&v->tcf_lock);
>   	action = v->tcf_action;
>   
>   	/* Ensure 'data' points at mac_header prior calling vlan manipulating
> @@ -85,7 +86,8 @@ static int tcf_vlan(struct sk_buff *skb, const struct tc_action *a,
>   
>   drop:
>   	action = TC_ACT_SHOT;
> -	v->tcf_qstats.drops++;
> +	qstats_drop_inc(this_cpu_ptr(v->common.cpu_qstats));
> +
>   unlock:
>   	if (skb_at_tc_ingress(skb))
>   		skb_pull_rcsum(skb, skb->mac_len);
> @@ -172,7 +174,7 @@ static int tcf_vlan_init(struct net *net, struct nlattr *nla,
>   
>   	if (!exists) {
>   		ret = tcf_idr_create(tn, parm->index, est, a,
> -				     &act_vlan_ops, bind, false);
> +						&act_vlan_ops, bind, true);
>

Indentation mismatch here?

Otherwise looks good to me.

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply

* Re: [PATCH net-next 2/2] net sched act_vlan: VLAN action rewrite to use RCU lock/unlock and update
From: Jamal Hadi Salim @ 2017-10-11 12:28 UTC (permalink / raw)
  To: Manish Kurup, xiyou.wangcong, jiri, davem, netdev, linux-kernel
  Cc: aring, mrv, manish.kurup
In-Reply-To: <1507689219-22993-1-git-send-email-manish.kurup@verizon.com>

On 17-10-10 10:33 PM, Manish Kurup wrote:
> Using a spinlock in the VLAN action causes performance issues when the VLAN
> action is used on multiple cores. Rewrote the VLAN action to use RCU read
> locking for reads and updates instead.
> 
> Signed-off-by: Manish Kurup <manish.kurup@verizon.com>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

^ permalink raw reply
