Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] mac80211: fix TX status reporting for ieee80211s
From: Johannes Berg @ 2018-09-06  9:47 UTC (permalink / raw)
  To: Yuan-Chi Pang; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel
In-Reply-To: <1536227104.68915.5.camel@gmail.com>

On Thu, 2018-09-06 at 17:45 +0800, Yuan-Chi Pang wrote:
> 
> I received the comment of my previous submission that I should fix
> indentation. But if I use the same indentation as before, I got "line
> over 80 characters" warning by checkpatch. What should I do?
> 
> > diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
> > index daf9db3..4286673 100644
> > --- a/net/mac80211/mesh_hwmp.c
> > +++ b/net/mac80211/mesh_hwmp.c
> > @@ -295,15 +295,11 @@ int mesh_path_error_tx(struct
> > ieee80211_sub_if_data *sdata,
> >  }
> >  
> >  void ieee80211s_update_metric(struct ieee80211_local *local,
> > -		struct sta_info *sta, struct sk_buff *skb)
> > +		struct sta_info *sta, struct ieee80211_tx_status
> > *st)

Here? Seems easy - add a linebreak between sta and status?

johannes

^ permalink raw reply

* Re: [PATCH] mac80211: fix TX status reporting for ieee80211s
From: Yuan-Chi Pang @ 2018-09-06  9:45 UTC (permalink / raw)
  To: Johannes Berg; +Cc: David S. Miller, linux-wireless, netdev, linux-kernel
In-Reply-To: <1536224269-69350-1-git-send-email-fu3mo6goo@gmail.com>


On Thu, 2018-09-06 at 16:57 +0800, Yuan-Chi Pang wrote:
> TX status reporting to ieee80211s is through
> ieee80211s_update_metric.
> There are two problems about ieee80211s_update_metric:
> 
> 1. The purpose is to estimate the fail probability
> to a specific link. No need to restrict to data frame.
> 
> 2. Current implementation does not work if wireless driver does not
> pass tx_status with skb.
> 
> Fix this by removing ieee80211_is_data condition, passing
> ieee80211_tx_status directly to ieee80211s_update_metric, and
> putting it in both __ieee80211_tx_status and ieee80211_tx_status_ext.
> 
> Signed-off-by: Yuan-Chi Pang <fu3mo6goo@gmail.com>
> ---
>  net/mac80211/mesh.h      | 2 +-
>  net/mac80211/mesh_hwmp.c | 8 ++------
>  net/mac80211/status.c    | 4 +++-
>  3 files changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/net/mac80211/mesh.h b/net/mac80211/mesh.h
> index ee56f18..7ffe56a 100644
> --- a/net/mac80211/mesh.h
> +++ b/net/mac80211/mesh.h
> @@ -217,7 +217,7 @@ void mesh_rmc_free(struct ieee80211_sub_if_data
> *sdata);
>  int mesh_rmc_init(struct ieee80211_sub_if_data *sdata);
>  void ieee80211s_init(void);
>  void ieee80211s_update_metric(struct ieee80211_local *local,
> -			      struct sta_info *sta, struct sk_buff
> *skb);
> +			 struct sta_info *sta, struct
> ieee80211_tx_status *st);
>  void ieee80211_mesh_init_sdata(struct ieee80211_sub_if_data *sdata);
>  void ieee80211_mesh_teardown_sdata(struct ieee80211_sub_if_data
> *sdata);
>  int ieee80211_start_mesh(struct ieee80211_sub_if_data *sdata);

I received the comment of my previous submission that I should fix
indentation. But if I use the same indentation as before, I got "line
over 80 characters" warning by checkpatch. What should I do?

> diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
> index daf9db3..4286673 100644
> --- a/net/mac80211/mesh_hwmp.c
> +++ b/net/mac80211/mesh_hwmp.c
> @@ -295,15 +295,11 @@ int mesh_path_error_tx(struct
> ieee80211_sub_if_data *sdata,
>  }
>  
>  void ieee80211s_update_metric(struct ieee80211_local *local,
> -		struct sta_info *sta, struct sk_buff *skb)
> +		struct sta_info *sta, struct ieee80211_tx_status
> *st)
>  {
> -	struct ieee80211_tx_info *txinfo = IEEE80211_SKB_CB(skb);
> -	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb-
> >data;
> +	struct ieee80211_tx_info *txinfo = st->info;
>  	int failed;
>  
> -	if (!ieee80211_is_data(hdr->frame_control))
> -		return;
> -
>  	failed = !(txinfo->flags & IEEE80211_TX_STAT_ACK);
>  
>  	/* moving average, scaled to 100.
> diff --git a/net/mac80211/status.c b/net/mac80211/status.c
> index 9a6d720..dfcd419 100644
> --- a/net/mac80211/status.c
> +++ b/net/mac80211/status.c
> @@ -811,7 +811,7 @@ static void __ieee80211_tx_status(struct
> ieee80211_hw *hw,
>  
>  		rate_control_tx_status(local, sband, status);
>  		if (ieee80211_vif_is_mesh(&sta->sdata->vif))
> -			ieee80211s_update_metric(local, sta, skb);
> +			ieee80211s_update_metric(local, sta,
> status);
>  
>  		if (!(info->flags & IEEE80211_TX_CTL_INJECTED) &&
> acked)
>  			ieee80211_frame_acked(sta, skb);
> @@ -972,6 +972,8 @@ void ieee80211_tx_status_ext(struct ieee80211_hw
> *hw,
>  		}
>  
>  		rate_control_tx_status(local, sband, status);
> +		if (ieee80211_vif_is_mesh(&sta->sdata->vif))
> +			ieee80211s_update_metric(local, sta,
> status);
>  	}
>  
>  	if (acked || noack_success) {

^ permalink raw reply

* Re: [PATCH v2 net-next] packet: add sockopt to ignore outgoing packets
From: David Miller @ 2018-09-06  5:10 UTC (permalink / raw)
  To: vincent.whitchurch; +Cc: netdev, willemb, rabinv
In-Reply-To: <20180903142336.32122-1-vincent.whitchurch@axis.com>

From: Vincent Whitchurch <vincent.whitchurch@axis.com>
Date: Mon,  3 Sep 2018 16:23:36 +0200

> Currently, the only way to ignore outgoing packets on a packet socket is
> via the BPF filter.  With MSG_ZEROCOPY, packets that are looped into
> AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even
> if the filter run from packet_rcv() would reject them.  So the presence
> of a packet socket on the interface takes away the benefits of
> MSG_ZEROCOPY, even if the packet socket is not interested in outgoing
> packets.  (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily
> cloned, but the cost for that is much lower.)
> 
> Add a socket option to allow AF_PACKET sockets to ignore outgoing
> packets to solve this.  Note that the *BSDs already have something
> similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT.
> 
> The first intended user is lldpd.
> 
> Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
> ---
> v2: Stricter value validation.
>     Moved ignore check out of skb_loop_sk().

Applied, thank you.

^ permalink raw reply

* Re: [PATCH] ath9k: add reset for airtime station debugfs
From: Toke Høiland-Jørgensen @ 2018-09-06  9:27 UTC (permalink / raw)
  To: Louie Lu, kvalo-sgV2jX0FEOL9JmXXK+q4OQ
  Cc: ath9k-devel-A+ZNKFmMK5xy9aJCnZT0Uw, davem-fT/PcQaiUtIeIZ0/mPfg9Q,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <20180906035323.GA11337@nems_nctu_lu>

Louie Lu <git-sgRGTR89XAc@public.gmane.org> writes:

> Let user can reset station airtime status by debugfs, it will
> reset all airtime deficit to ATH_AIRTIME_QUANTUM and reset rx/tx
> airtime accumulate to 0.

No objections to the patch, but I'm curious which issues you were
debugging that led you to needing it? :)

-Toke

^ permalink raw reply

* [net-next 8/9] net/mlx5e: Set ECN for received packets using CQE indication
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Natali Shechtman, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Natali Shechtman <natali@mellanox.com>

In multi-host (MH) NIC scheme, a single HW port serves multiple hosts
or sockets on the same host.
The HW uses a mechanism in the PCIe buffer which monitors
the amount of consumed PCIe buffers per host.
On a certain configuration, under congestion,
the HW emulates a switch doing ECN marking on packets using ECN
indication on the completion descriptor (CQE).

The driver needs to set the ECN bits on the packet SKB,
such that the network stack can react on that, this commit does that.

Signed-off-by: Natali Shechtman <natali@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 35 ++++++++++++++++---
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  3 ++
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  2 ++
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 6a959e8b1f9d..2175d6972dc3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -37,6 +37,7 @@
 #include <net/busy_poll.h>
 #include <net/ip6_checksum.h>
 #include <net/page_pool.h>
+#include <net/inet_ecn.h>
 #include "en.h"
 #include "en_tc.h"
 #include "eswitch.h"
@@ -690,12 +691,29 @@ static inline void mlx5e_skb_set_hash(struct mlx5_cqe64 *cqe,
 	skb_set_hash(skb, be32_to_cpu(cqe->rss_hash_result), ht);
 }
 
-static inline bool is_last_ethertype_ip(struct sk_buff *skb, int *network_depth)
+static inline bool is_last_ethertype_ip(struct sk_buff *skb, int *network_depth,
+					__be16 *proto)
 {
-	__be16 ethertype = ((struct ethhdr *)skb->data)->h_proto;
+	*proto = ((struct ethhdr *)skb->data)->h_proto;
+	*proto = __vlan_get_protocol(skb, *proto, network_depth);
+	return (*proto == htons(ETH_P_IP) || *proto == htons(ETH_P_IPV6));
+}
+
+static inline void mlx5e_enable_ecn(struct mlx5e_rq *rq, struct sk_buff *skb)
+{
+	int network_depth = 0;
+	__be16 proto;
+	void *ip;
+	int rc;
 
-	ethertype = __vlan_get_protocol(skb, ethertype, network_depth);
-	return (ethertype == htons(ETH_P_IP) || ethertype == htons(ETH_P_IPV6));
+	if (unlikely(!is_last_ethertype_ip(skb, &network_depth, &proto)))
+		return;
+
+	ip = skb->data + network_depth;
+	rc = ((proto == htons(ETH_P_IP)) ? IP_ECN_set_ce((struct iphdr *)ip) :
+					 IP6_ECN_set_ce(skb, (struct ipv6hdr *)ip));
+
+	rq->stats->ecn_mark += !!rc;
 }
 
 static __be32 mlx5e_get_fcs(struct sk_buff *skb)
@@ -745,6 +763,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 {
 	struct mlx5e_rq_stats *stats = rq->stats;
 	int network_depth = 0;
+	__be16 proto;
 
 	if (unlikely(!(netdev->features & NETIF_F_RXCSUM)))
 		goto csum_none;
@@ -755,7 +774,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 		return;
 	}
 
-	if (likely(is_last_ethertype_ip(skb, &network_depth))) {
+	if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
 		skb->ip_summed = CHECKSUM_COMPLETE;
 		skb->csum = csum_unfold((__force __sum16)cqe->check_sum);
 		if (network_depth > ETH_HLEN)
@@ -790,6 +809,8 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 	stats->csum_none++;
 }
 
+#define MLX5E_CE_BIT_MASK 0x80
+
 static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 				      u32 cqe_bcnt,
 				      struct mlx5e_rq *rq,
@@ -834,6 +855,10 @@ static inline void mlx5e_build_rx_skb(struct mlx5_cqe64 *cqe,
 	skb->mark = be32_to_cpu(cqe->sop_drop_qpn) & MLX5E_TC_FLOW_ID_MASK;
 
 	mlx5e_handle_csum(netdev, cqe, rq, skb, !!lro_num_seg);
+	/* checking CE bit in cqe - MSB in ml_path field */
+	if (unlikely(cqe->ml_path & MLX5E_CE_BIT_MASK))
+		mlx5e_enable_ecn(rq, skb);
+
 	skb->protocol = eth_type_trans(skb, netdev);
 }
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
index 6839481f7697..90c7607b1f44 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -53,6 +53,7 @@ static const struct counter_desc sw_stats_desc[] = {
 
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_packets) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_lro_bytes) },
+	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_ecn_mark) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_removed_vlan_packets) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_unnecessary) },
 	{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_csum_none) },
@@ -144,6 +145,7 @@ void mlx5e_grp_sw_update_stats(struct mlx5e_priv *priv)
 		s->rx_bytes	+= rq_stats->bytes;
 		s->rx_lro_packets += rq_stats->lro_packets;
 		s->rx_lro_bytes	+= rq_stats->lro_bytes;
+		s->rx_ecn_mark	+= rq_stats->ecn_mark;
 		s->rx_removed_vlan_packets += rq_stats->removed_vlan_packets;
 		s->rx_csum_none	+= rq_stats->csum_none;
 		s->rx_csum_complete += rq_stats->csum_complete;
@@ -1144,6 +1146,7 @@ static const struct counter_desc rq_stats_desc[] = {
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, xdp_redirect) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, lro_bytes) },
+	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, ecn_mark) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, removed_vlan_packets) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, wqe_err) },
 	{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, mpwqe_filler_cqes) },
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
index a4c035aedd46..a5fb3dc27f50 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -66,6 +66,7 @@ struct mlx5e_sw_stats {
 	u64 tx_nop;
 	u64 rx_lro_packets;
 	u64 rx_lro_bytes;
+	u64 rx_ecn_mark;
 	u64 rx_removed_vlan_packets;
 	u64 rx_csum_unnecessary;
 	u64 rx_csum_none;
@@ -184,6 +185,7 @@ struct mlx5e_rq_stats {
 	u64 csum_none;
 	u64 lro_packets;
 	u64 lro_bytes;
+	u64 ecn_mark;
 	u64 removed_vlan_packets;
 	u64 xdp_drop;
 	u64 xdp_redirect;
-- 
2.17.1

^ permalink raw reply related

* [net-next 9/9] net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Alaa Hleihel, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Alaa Hleihel <alaa@mellanox.com>

CHECKSUM_COMPLETE is not applicable to SCTP protocol.
Setting it for SCTP packets leads to CRC32c validation failure.

Fixes: bbceefce9adf ("net/mlx5e: Support RX CHECKSUM_COMPLETE")
Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 2175d6972dc3..424bc89184c6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -755,6 +755,14 @@ static __be32 mlx5e_get_fcs(struct sk_buff *skb)
 	return fcs_bytes;
 }
 
+static u8 get_ip_proto(struct sk_buff *skb, __be16 proto)
+{
+	void *ip_p = skb->data + sizeof(struct ethhdr);
+
+	return (proto == htons(ETH_P_IP)) ? ((struct iphdr *)ip_p)->protocol :
+					    ((struct ipv6hdr *)ip_p)->nexthdr;
+}
+
 static inline void mlx5e_handle_csum(struct net_device *netdev,
 				     struct mlx5_cqe64 *cqe,
 				     struct mlx5e_rq *rq,
@@ -775,6 +783,9 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 	}
 
 	if (likely(is_last_ethertype_ip(skb, &network_depth, &proto))) {
+		if (unlikely(get_ip_proto(skb, proto) == IPPROTO_SCTP))
+			goto csum_unnecessary;
+
 		skb->ip_summed = CHECKSUM_COMPLETE;
 		skb->csum = csum_unfold((__force __sum16)cqe->check_sum);
 		if (network_depth > ETH_HLEN)
@@ -792,6 +803,7 @@ static inline void mlx5e_handle_csum(struct net_device *netdev,
 		return;
 	}
 
+csum_unnecessary:
 	if (likely((cqe->hds_ip_ext & CQE_L3_OK) &&
 		   (cqe->hds_ip_ext & CQE_L4_OK))) {
 		skb->ip_summed = CHECKSUM_UNNECESSARY;
-- 
2.17.1

^ permalink raw reply related

* [net-next 7/9] net/mlx5e: Replace PTP clock lock from RW lock to seq lock
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Shay Agroskin, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Shay Agroskin <shayag@mellanox.com>

Changed "priv.clock.lock" lock from 'rw_lock' to 'seq_lock'
in order to improve packet rate performance.

Tested on Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz.
Sent 64b packets between two peers connected by ConnectX-5,
and measured packet rate for the receiver in three modes:
	no time-stamping (base rate)
	time-stamping using rw_lock (old lock) for critical region
	time-stamping using seq_lock (new lock) for critical region
Only the receiver time stamped its packets.

The measured packet rate improvements are:

	Single flow (multiple TX rings to single RX ring):
		without timestamping:	  4.26 (M packets)/sec
		with rw-lock (old lock):  4.1  (M packets)/sec
		with seq-lock (new lock): 4.16 (M packets)/sec
		1.46% improvement

	Multiple flows (multiple TX rings to six RX rings):
		without timestamping: 	  22   (M packets)/sec
		with rw-lock (old lock):  11.7 (M packets)/sec
		with seq-lock (new lock): 21.3 (M packets)/sec
		82.05% improvement

The packet rate improvement is due to the lack of atomic operations
for the 'readers' by the seq-lock.
Since there are much more 'readers' than 'writers' contention
on this lock, almost all atomic operations are saved.
this results in a dramatic decrease in overall
cache misses.

Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/lib/clock.c   | 34 +++++++++----------
 .../ethernet/mellanox/mlx5/core/lib/clock.h   |  8 +++--
 include/linux/mlx5/driver.h                   |  2 +-
 3 files changed, 23 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index 3f767cde4c1d..0d90b1b4a3d3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -111,10 +111,10 @@ static void mlx5_pps_out(struct work_struct *work)
 	for (i = 0; i < clock->ptp_info.n_pins; i++) {
 		u64 tstart;
 
-		write_lock_irqsave(&clock->lock, flags);
+		write_seqlock_irqsave(&clock->lock, flags);
 		tstart = clock->pps_info.start[i];
 		clock->pps_info.start[i] = 0;
-		write_unlock_irqrestore(&clock->lock, flags);
+		write_sequnlock_irqrestore(&clock->lock, flags);
 		if (!tstart)
 			continue;
 
@@ -132,10 +132,10 @@ static void mlx5_timestamp_overflow(struct work_struct *work)
 						overflow_work);
 	unsigned long flags;
 
-	write_lock_irqsave(&clock->lock, flags);
+	write_seqlock_irqsave(&clock->lock, flags);
 	timecounter_read(&clock->tc);
 	mlx5_update_clock_info_page(clock->mdev);
-	write_unlock_irqrestore(&clock->lock, flags);
+	write_sequnlock_irqrestore(&clock->lock, flags);
 	schedule_delayed_work(&clock->overflow_work, clock->overflow_period);
 }
 
@@ -147,10 +147,10 @@ static int mlx5_ptp_settime(struct ptp_clock_info *ptp,
 	u64 ns = timespec64_to_ns(ts);
 	unsigned long flags;
 
-	write_lock_irqsave(&clock->lock, flags);
+	write_seqlock_irqsave(&clock->lock, flags);
 	timecounter_init(&clock->tc, &clock->cycles, ns);
 	mlx5_update_clock_info_page(clock->mdev);
-	write_unlock_irqrestore(&clock->lock, flags);
+	write_sequnlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
@@ -162,9 +162,9 @@ static int mlx5_ptp_gettime(struct ptp_clock_info *ptp, struct timespec64 *ts)
 	u64 ns;
 	unsigned long flags;
 
-	write_lock_irqsave(&clock->lock, flags);
+	write_seqlock_irqsave(&clock->lock, flags);
 	ns = timecounter_read(&clock->tc);
-	write_unlock_irqrestore(&clock->lock, flags);
+	write_sequnlock_irqrestore(&clock->lock, flags);
 
 	*ts = ns_to_timespec64(ns);
 
@@ -177,10 +177,10 @@ static int mlx5_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta)
 						ptp_info);
 	unsigned long flags;
 
-	write_lock_irqsave(&clock->lock, flags);
+	write_seqlock_irqsave(&clock->lock, flags);
 	timecounter_adjtime(&clock->tc, delta);
 	mlx5_update_clock_info_page(clock->mdev);
-	write_unlock_irqrestore(&clock->lock, flags);
+	write_sequnlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
@@ -203,12 +203,12 @@ static int mlx5_ptp_adjfreq(struct ptp_clock_info *ptp, s32 delta)
 	adj *= delta;
 	diff = div_u64(adj, 1000000000ULL);
 
-	write_lock_irqsave(&clock->lock, flags);
+	write_seqlock_irqsave(&clock->lock, flags);
 	timecounter_read(&clock->tc);
 	clock->cycles.mult = neg_adj ? clock->nominal_c_mult - diff :
 				       clock->nominal_c_mult + diff;
 	mlx5_update_clock_info_page(clock->mdev);
-	write_unlock_irqrestore(&clock->lock, flags);
+	write_sequnlock_irqrestore(&clock->lock, flags);
 
 	return 0;
 }
@@ -307,12 +307,12 @@ static int mlx5_perout_configure(struct ptp_clock_info *ptp,
 		ts.tv_nsec = rq->perout.start.nsec;
 		ns = timespec64_to_ns(&ts);
 		cycles_now = mlx5_read_internal_timer(mdev);
-		write_lock_irqsave(&clock->lock, flags);
+		write_seqlock_irqsave(&clock->lock, flags);
 		nsec_now = timecounter_cyc2time(&clock->tc, cycles_now);
 		nsec_delta = ns - nsec_now;
 		cycles_delta = div64_u64(nsec_delta << clock->cycles.shift,
 					 clock->cycles.mult);
-		write_unlock_irqrestore(&clock->lock, flags);
+		write_sequnlock_irqrestore(&clock->lock, flags);
 		time_stamp = cycles_now + cycles_delta;
 		field_select = MLX5_MTPPS_FS_PIN_MODE |
 			       MLX5_MTPPS_FS_PATTERN |
@@ -471,14 +471,14 @@ void mlx5_pps_event(struct mlx5_core_dev *mdev,
 		ts.tv_sec += 1;
 		ts.tv_nsec = 0;
 		ns = timespec64_to_ns(&ts);
-		write_lock_irqsave(&clock->lock, flags);
+		write_seqlock_irqsave(&clock->lock, flags);
 		nsec_now = timecounter_cyc2time(&clock->tc, cycles_now);
 		nsec_delta = ns - nsec_now;
 		cycles_delta = div64_u64(nsec_delta << clock->cycles.shift,
 					 clock->cycles.mult);
 		clock->pps_info.start[pin] = cycles_now + cycles_delta;
 		schedule_work(&clock->pps_info.out_work);
-		write_unlock_irqrestore(&clock->lock, flags);
+		write_sequnlock_irqrestore(&clock->lock, flags);
 		break;
 	default:
 		mlx5_core_err(mdev, " Unhandled event\n");
@@ -498,7 +498,7 @@ void mlx5_init_clock(struct mlx5_core_dev *mdev)
 		mlx5_core_warn(mdev, "invalid device_frequency_khz, aborting HW clock init\n");
 		return;
 	}
-	rwlock_init(&clock->lock);
+	seqlock_init(&clock->lock);
 	clock->cycles.read = read_internal_timer;
 	clock->cycles.shift = MLX5_CYCLES_SHIFT;
 	clock->cycles.mult = clocksource_khz2mult(dev_freq,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h
index 02e2e4575e4f..263cb6e2aeee 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.h
@@ -46,11 +46,13 @@ static inline int mlx5_clock_get_ptp_index(struct mlx5_core_dev *mdev)
 static inline ktime_t mlx5_timecounter_cyc2time(struct mlx5_clock *clock,
 						u64 timestamp)
 {
+	unsigned int seq;
 	u64 nsec;
 
-	read_lock(&clock->lock);
-	nsec = timecounter_cyc2time(&clock->tc, timestamp);
-	read_unlock(&clock->lock);
+	do {
+		seq = read_seqbegin(&clock->lock);
+		nsec = timecounter_cyc2time(&clock->tc, timestamp);
+	} while (read_seqretry(&clock->lock, seq));
 
 	return ns_to_ktime(nsec);
 }
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 2a0c845f6bdb..b7fce2c9443d 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -805,7 +805,7 @@ struct mlx5_pps {
 };
 
 struct mlx5_clock {
-	rwlock_t                   lock;
+	seqlock_t                  lock;
 	struct cyclecounter        cycles;
 	struct timecounter         tc;
 	struct hwtstamp_config     hwtstamp_config;
-- 
2.17.1

^ permalink raw reply related

* [net-next 6/9] net/mlx5e: Move Q counters allocation and drop RQ to init_rx
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Roi Dayan, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Roi Dayan <roid@mellanox.com>

Not all profiles query the HW Q counters in update_stats() callback.
HW Q couners are limited per device and in case of representors all
their Q counters are allocated on the parent PF device.
Avoid reundant allocation of HW Q counters by moving the allocation
to init_rx profile callback.

Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  6 +++
 .../net/ethernet/mellanox/mlx5/core/en_main.c | 45 +++++++++----------
 .../net/ethernet/mellanox/mlx5/core/en_rep.c  | 12 ++++-
 .../ethernet/mellanox/mlx5/core/ipoib/ipoib.c | 17 ++++++-
 4 files changed, 55 insertions(+), 25 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index de0f7702c86a..01a967e717e7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -898,6 +898,12 @@ void mlx5e_destroy_mdev_resources(struct mlx5_core_dev *mdev);
 int mlx5e_refresh_tirs(struct mlx5e_priv *priv, bool enable_uc_lb);
 
 /* common netdev helpers */
+void mlx5e_create_q_counters(struct mlx5e_priv *priv);
+void mlx5e_destroy_q_counters(struct mlx5e_priv *priv);
+int mlx5e_open_drop_rq(struct mlx5e_priv *priv,
+		       struct mlx5e_rq *drop_rq);
+void mlx5e_close_drop_rq(struct mlx5e_rq *drop_rq);
+
 int mlx5e_create_indirect_rqt(struct mlx5e_priv *priv);
 
 int mlx5e_create_indirect_tirs(struct mlx5e_priv *priv);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 5a7939e70190..d14c4051edd8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -3049,8 +3049,8 @@ static int mlx5e_alloc_drop_cq(struct mlx5_core_dev *mdev,
 	return mlx5e_alloc_cq_common(mdev, param, cq);
 }
 
-static int mlx5e_open_drop_rq(struct mlx5e_priv *priv,
-			      struct mlx5e_rq *drop_rq)
+int mlx5e_open_drop_rq(struct mlx5e_priv *priv,
+		       struct mlx5e_rq *drop_rq)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
 	struct mlx5e_cq_param cq_param = {};
@@ -3094,7 +3094,7 @@ static int mlx5e_open_drop_rq(struct mlx5e_priv *priv,
 	return err;
 }
 
-static void mlx5e_close_drop_rq(struct mlx5e_rq *drop_rq)
+void mlx5e_close_drop_rq(struct mlx5e_rq *drop_rq)
 {
 	mlx5e_destroy_rq(drop_rq);
 	mlx5e_free_rq(drop_rq);
@@ -4726,7 +4726,7 @@ static void mlx5e_build_nic_netdev(struct net_device *netdev)
 	mlx5e_tls_build_netdev(priv);
 }
 
-static void mlx5e_create_q_counters(struct mlx5e_priv *priv)
+void mlx5e_create_q_counters(struct mlx5e_priv *priv)
 {
 	struct mlx5_core_dev *mdev = priv->mdev;
 	int err;
@@ -4744,7 +4744,7 @@ static void mlx5e_create_q_counters(struct mlx5e_priv *priv)
 	}
 }
 
-static void mlx5e_destroy_q_counters(struct mlx5e_priv *priv)
+void mlx5e_destroy_q_counters(struct mlx5e_priv *priv)
 {
 	if (priv->q_counter)
 		mlx5_core_dealloc_q_counter(priv->mdev, priv->q_counter);
@@ -4783,9 +4783,17 @@ static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
 	struct mlx5_core_dev *mdev = priv->mdev;
 	int err;
 
+	mlx5e_create_q_counters(priv);
+
+	err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
+	if (err) {
+		mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
+		goto err_destroy_q_counters;
+	}
+
 	err = mlx5e_create_indirect_rqt(priv);
 	if (err)
-		return err;
+		goto err_close_drop_rq;
 
 	err = mlx5e_create_direct_rqts(priv);
 	if (err)
@@ -4821,6 +4829,10 @@ static int mlx5e_init_nic_rx(struct mlx5e_priv *priv)
 	mlx5e_destroy_direct_rqts(priv);
 err_destroy_indirect_rqts:
 	mlx5e_destroy_rqt(priv, &priv->indir_rqt);
+err_close_drop_rq:
+	mlx5e_close_drop_rq(&priv->drop_rq);
+err_destroy_q_counters:
+	mlx5e_destroy_q_counters(priv);
 	return err;
 }
 
@@ -4832,6 +4844,8 @@ static void mlx5e_cleanup_nic_rx(struct mlx5e_priv *priv)
 	mlx5e_destroy_indirect_tirs(priv);
 	mlx5e_destroy_direct_rqts(priv);
 	mlx5e_destroy_rqt(priv, &priv->indir_rqt);
+	mlx5e_close_drop_rq(&priv->drop_rq);
+	mlx5e_destroy_q_counters(priv);
 }
 
 static int mlx5e_init_nic_tx(struct mlx5e_priv *priv)
@@ -4975,7 +4989,6 @@ struct net_device *mlx5e_create_netdev(struct mlx5_core_dev *mdev,
 
 int mlx5e_attach_netdev(struct mlx5e_priv *priv)
 {
-	struct mlx5_core_dev *mdev = priv->mdev;
 	const struct mlx5e_profile *profile;
 	int err;
 
@@ -4986,28 +4999,16 @@ int mlx5e_attach_netdev(struct mlx5e_priv *priv)
 	if (err)
 		goto out;
 
-	mlx5e_create_q_counters(priv);
-
-	err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
-	if (err) {
-		mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
-		goto err_destroy_q_counters;
-	}
-
 	err = profile->init_rx(priv);
 	if (err)
-		goto err_close_drop_rq;
+		goto err_cleanup_tx;
 
 	if (profile->enable)
 		profile->enable(priv);
 
 	return 0;
 
-err_close_drop_rq:
-	mlx5e_close_drop_rq(&priv->drop_rq);
-
-err_destroy_q_counters:
-	mlx5e_destroy_q_counters(priv);
+err_cleanup_tx:
 	profile->cleanup_tx(priv);
 
 out:
@@ -5025,8 +5026,6 @@ void mlx5e_detach_netdev(struct mlx5e_priv *priv)
 	flush_workqueue(priv->wq);
 
 	profile->cleanup_rx(priv);
-	mlx5e_close_drop_rq(&priv->drop_rq);
-	mlx5e_destroy_q_counters(priv);
 	profile->cleanup_tx(priv);
 	cancel_delayed_work_sync(&priv->update_stats_work);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index c9cc9747d21d..f6eead24931f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -999,14 +999,21 @@ static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	struct mlx5e_rep_priv *rpriv = priv->ppriv;
 	struct mlx5_eswitch_rep *rep = rpriv->rep;
+	struct mlx5_core_dev *mdev = priv->mdev;
 	struct mlx5_flow_handle *flow_rule;
 	int err;
 
 	mlx5e_init_l2_addr(priv);
 
+	err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
+	if (err) {
+		mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
+		return err;
+	}
+
 	err = mlx5e_create_direct_rqts(priv);
 	if (err)
-		return err;
+		goto err_close_drop_rq;
 
 	err = mlx5e_create_direct_tirs(priv);
 	if (err)
@@ -1027,6 +1034,8 @@ static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
 	mlx5e_destroy_direct_tirs(priv);
 err_destroy_direct_rqts:
 	mlx5e_destroy_direct_rqts(priv);
+err_close_drop_rq:
+	mlx5e_close_drop_rq(&priv->drop_rq);
 	return err;
 }
 
@@ -1037,6 +1046,7 @@ static void mlx5e_cleanup_rep_rx(struct mlx5e_priv *priv)
 	mlx5_del_flow_rules(rpriv->vport_rx_rule);
 	mlx5e_destroy_direct_tirs(priv);
 	mlx5e_destroy_direct_rqts(priv);
+	mlx5e_close_drop_rq(&priv->drop_rq);
 }
 
 static int mlx5e_init_rep_tx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 3dd9f885ed8b..a825ed093efd 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -349,11 +349,20 @@ static void mlx5i_destroy_flow_steering(struct mlx5e_priv *priv)
 
 static int mlx5i_init_rx(struct mlx5e_priv *priv)
 {
+	struct mlx5_core_dev *mdev = priv->mdev;
 	int err;
 
+	mlx5e_create_q_counters(priv);
+
+	err = mlx5e_open_drop_rq(priv, &priv->drop_rq);
+	if (err) {
+		mlx5_core_err(mdev, "open drop rq failed, %d\n", err);
+		goto err_destroy_q_counters;
+	}
+
 	err = mlx5e_create_indirect_rqt(priv);
 	if (err)
-		return err;
+		goto err_close_drop_rq;
 
 	err = mlx5e_create_direct_rqts(priv);
 	if (err)
@@ -381,6 +390,10 @@ static int mlx5i_init_rx(struct mlx5e_priv *priv)
 	mlx5e_destroy_direct_rqts(priv);
 err_destroy_indirect_rqts:
 	mlx5e_destroy_rqt(priv, &priv->indir_rqt);
+err_close_drop_rq:
+	mlx5e_close_drop_rq(&priv->drop_rq);
+err_destroy_q_counters:
+	mlx5e_destroy_q_counters(priv);
 	return err;
 }
 
@@ -391,6 +404,8 @@ static void mlx5i_cleanup_rx(struct mlx5e_priv *priv)
 	mlx5e_destroy_indirect_tirs(priv);
 	mlx5e_destroy_direct_rqts(priv);
 	mlx5e_destroy_rqt(priv, &priv->indir_rqt);
+	mlx5e_close_drop_rq(&priv->drop_rq);
+	mlx5e_destroy_q_counters(priv);
 }
 
 static const struct mlx5e_profile mlx5i_nic_profile = {
-- 
2.17.1

^ permalink raw reply related

* [net-next 5/9] net/mlx5e: Move mlx5e_priv_flags into en_ethtool.c
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Kamal Heib, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Kamal Heib <kamalheib1@gmail.com>

Move the definition of mlx5e_priv_flags into en_ethtool.c because it's
only used there.

Fixes: 4e59e2888139 ("net/mlx5e: Introduce net device priv flags infrastructure")
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en.h         | 7 -------
 drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c | 7 +++++++
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index db2cfcd21d43..de0f7702c86a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -204,13 +204,6 @@ struct mlx5e_umr_wqe {
 
 extern const char mlx5e_self_tests[][ETH_GSTRING_LEN];
 
-static const char mlx5e_priv_flags[][ETH_GSTRING_LEN] = {
-	"rx_cqe_moder",
-	"tx_cqe_moder",
-	"rx_cqe_compress",
-	"rx_striding_rq",
-};
-
 enum mlx5e_priv_flag {
 	MLX5E_PFLAG_RX_CQE_BASED_MODER = (1 << 0),
 	MLX5E_PFLAG_TX_CQE_BASED_MODER = (1 << 1),
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 98dd3e0ada72..8cd338ceb237 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -135,6 +135,13 @@ void mlx5e_build_ptys2ethtool_map(void)
 				       ETHTOOL_LINK_MODE_50000baseKR2_Full_BIT);
 }
 
+static const char mlx5e_priv_flags[][ETH_GSTRING_LEN] = {
+	"rx_cqe_moder",
+	"tx_cqe_moder",
+	"rx_cqe_compress",
+	"rx_striding_rq",
+};
+
 int mlx5e_ethtool_get_sset_count(struct mlx5e_priv *priv, int sset)
 {
 	int i, num_stats = 0;
-- 
2.17.1

^ permalink raw reply related

* [net-next 3/9] net/mlx5: Store flow counters in a list
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Vlad Buslov, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>

In order to improve performance of flow counter stats query loop that
traverses all configured flow counters, replace rb_tree with double-linked
list. This change improves performance of traversing flow counters by
removing the tree traversal. (profiling data showed that call to rb_next
was most top CPU consumer)

However, lookup of flow flow counter in list becomes linear, instead of
logarithmic. This problem is fixed by next patch in series, which adds idr
for fast lookup. Idr is to be used because it is not an intrusive data
structure and doesn't require adding any new members to struct mlx5_fc,
which allows its control data part to stay <= 1 cache line in size.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/fs_core.h |  2 +-
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 88 +++++++++----------
 include/linux/mlx5/driver.h                   |  2 +-
 3 files changed, 42 insertions(+), 50 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index 617d6239c5f3..a06f83c0c2b6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -139,7 +139,7 @@ struct mlx5_fc_cache {
 };
 
 struct mlx5_fc {
-	struct rb_node node;
+	struct list_head list;
 	struct llist_node addlist;
 	struct llist_node dellist;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index f1266f215a31..90ebfee37508 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -73,36 +73,38 @@
  *   elapsed, the thread will actually query the hardware.
  */
 
-static void mlx5_fc_stats_insert(struct rb_root *root, struct mlx5_fc *counter)
+static struct list_head *mlx5_fc_counters_lookup_next(struct mlx5_core_dev *dev,
+						      u32 id)
 {
-	struct rb_node **new = &root->rb_node;
-	struct rb_node *parent = NULL;
-
-	while (*new) {
-		struct mlx5_fc *this = rb_entry(*new, struct mlx5_fc, node);
-		int result = counter->id - this->id;
-
-		parent = *new;
-		if (result < 0)
-			new = &((*new)->rb_left);
-		else
-			new = &((*new)->rb_right);
-	}
+	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
+	struct mlx5_fc *counter;
+
+	list_for_each_entry(counter, &fc_stats->counters, list)
+		if (counter->id > id)
+			return &counter->list;
+
+	return &fc_stats->counters;
+}
+
+static void mlx5_fc_stats_insert(struct mlx5_core_dev *dev,
+				 struct mlx5_fc *counter)
+{
+	struct list_head *next = mlx5_fc_counters_lookup_next(dev, counter->id);
 
-	/* Add new node and rebalance tree. */
-	rb_link_node(&counter->node, parent, new);
-	rb_insert_color(&counter->node, root);
+	list_add_tail(&counter->list, next);
 }
 
-/* The function returns the last node that was queried so the caller
+/* The function returns the last counter that was queried so the caller
  * function can continue calling it till all counters are queried.
  */
-static struct rb_node *mlx5_fc_stats_query(struct mlx5_core_dev *dev,
+static struct mlx5_fc *mlx5_fc_stats_query(struct mlx5_core_dev *dev,
 					   struct mlx5_fc *first,
 					   u32 last_id)
 {
+	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
+	struct mlx5_fc *counter = NULL;
 	struct mlx5_cmd_fc_bulk *b;
-	struct rb_node *node = NULL;
+	bool more = false;
 	u32 afirst_id;
 	int num;
 	int err;
@@ -132,14 +134,16 @@ static struct rb_node *mlx5_fc_stats_query(struct mlx5_core_dev *dev,
 		goto out;
 	}
 
-	for (node = &first->node; node; node = rb_next(node)) {
-		struct mlx5_fc *counter = rb_entry(node, struct mlx5_fc, node);
+	counter = first;
+	list_for_each_entry_from(counter, &fc_stats->counters, list) {
 		struct mlx5_fc_cache *c = &counter->cache;
 		u64 packets;
 		u64 bytes;
 
-		if (counter->id > last_id)
+		if (counter->id > last_id) {
+			more = true;
 			break;
+		}
 
 		mlx5_cmd_fc_bulk_get(dev, b,
 				     counter->id, &packets, &bytes);
@@ -155,7 +159,7 @@ static struct rb_node *mlx5_fc_stats_query(struct mlx5_core_dev *dev,
 out:
 	mlx5_cmd_fc_bulk_free(b);
 
-	return node;
+	return more ? counter : NULL;
 }
 
 static void mlx5_free_fc(struct mlx5_core_dev *dev,
@@ -173,33 +177,30 @@ static void mlx5_fc_stats_work(struct work_struct *work)
 	struct llist_node *tmplist = llist_del_all(&fc_stats->addlist);
 	struct mlx5_fc *counter = NULL, *last = NULL, *tmp;
 	unsigned long now = jiffies;
-	struct rb_node *node;
 
-	if (tmplist || !RB_EMPTY_ROOT(&fc_stats->counters))
+	if (tmplist || !list_empty(&fc_stats->counters))
 		queue_delayed_work(fc_stats->wq, &fc_stats->work,
 				   fc_stats->sampling_interval);
 
 	llist_for_each_entry(counter, tmplist, addlist)
-		mlx5_fc_stats_insert(&fc_stats->counters, counter);
+		mlx5_fc_stats_insert(dev, counter);
 
 	tmplist = llist_del_all(&fc_stats->dellist);
 	llist_for_each_entry_safe(counter, tmp, tmplist, dellist) {
-		rb_erase(&counter->node, &fc_stats->counters);
+		list_del(&counter->list);
 
 		mlx5_free_fc(dev, counter);
 	}
 
-	node = rb_last(&fc_stats->counters);
-	if (time_before(now, fc_stats->next_query) || !node)
+	if (time_before(now, fc_stats->next_query) ||
+	    list_empty(&fc_stats->counters))
 		return;
-	last = rb_entry(node, struct mlx5_fc, node);
-
-	node = rb_first(&fc_stats->counters);
-	while (node) {
-		counter = rb_entry(node, struct mlx5_fc, node);
+	last = list_last_entry(&fc_stats->counters, struct mlx5_fc, list);
 
-		node = mlx5_fc_stats_query(dev, counter, last->id);
-	}
+	counter = list_first_entry(&fc_stats->counters, struct mlx5_fc,
+				   list);
+	while (counter)
+		counter = mlx5_fc_stats_query(dev, counter, last->id);
 
 	fc_stats->next_query = now + fc_stats->sampling_interval;
 }
@@ -257,7 +258,7 @@ int mlx5_init_fc_stats(struct mlx5_core_dev *dev)
 {
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
 
-	fc_stats->counters = RB_ROOT;
+	INIT_LIST_HEAD(&fc_stats->counters);
 	init_llist_head(&fc_stats->addlist);
 	init_llist_head(&fc_stats->dellist);
 
@@ -277,7 +278,6 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 	struct llist_node *tmplist;
 	struct mlx5_fc *counter;
 	struct mlx5_fc *tmp;
-	struct rb_node *node;
 
 	cancel_delayed_work_sync(&dev->priv.fc_stats.work);
 	destroy_workqueue(dev->priv.fc_stats.wq);
@@ -287,16 +287,8 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 	llist_for_each_entry_safe(counter, tmp, tmplist, addlist)
 		mlx5_free_fc(dev, counter);
 
-	node = rb_first(&fc_stats->counters);
-	while (node) {
-		counter = rb_entry(node, struct mlx5_fc, node);
-
-		node = rb_next(node);
-
-		rb_erase(&counter->node, &fc_stats->counters);
-
+	list_for_each_entry_safe(counter, tmp, &fc_stats->counters, list)
 		mlx5_free_fc(dev, counter);
-	}
 }
 
 int mlx5_fc_query(struct mlx5_core_dev *dev, struct mlx5_fc *counter,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 4b53ac64004b..61bed33e6675 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -583,7 +583,7 @@ struct mlx5_irq_info {
 };
 
 struct mlx5_fc_stats {
-	struct rb_root counters;
+	struct list_head counters;
 	struct llist_head addlist;
 	struct llist_head dellist;
 
-- 
2.17.1

^ permalink raw reply related

* [net-next 2/9] net/mlx5: Add new list to store deleted flow counters
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Vlad Buslov, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>

In order to prevent flow counters stats work function from traversing whole
flow counters tree while searching for deleted flow counters, new list to
store deleted flow counters is added to struct mlx5_fc_stats. Lockless
NULL-terminated single linked list data type is used due to following
reasons:
 - This use case only needs to add single element to list and
 remove/iterate whole list. Lockless list doesn't require any additional
 synchronization for these operations.
 - First cache line of flow counter data structure only has space to store
 single additional pointer, which precludes usage of double linked list.

Remove flow counter 'deleted' flag that is no longer needed.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/fs_core.h |  2 +-
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 34 +++++++------------
 include/linux/mlx5/driver.h                   |  1 +
 3 files changed, 14 insertions(+), 23 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index f68590291e0c..617d6239c5f3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -141,6 +141,7 @@ struct mlx5_fc_cache {
 struct mlx5_fc {
 	struct rb_node node;
 	struct llist_node addlist;
+	struct llist_node dellist;
 
 	/* last{packets,bytes} members are used when calculating the delta since
 	 * last reading
@@ -149,7 +150,6 @@ struct mlx5_fc {
 	u64 lastbytes;
 
 	u32 id;
-	bool deleted;
 	bool aging;
 
 	struct mlx5_fc_cache cache ____cacheline_aligned_in_smp;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index d996d6cf9e19..f1266f215a31 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -58,7 +58,7 @@
  *   - spawn thread to do the actual destroy
  *
  * - destroy (user context)
- *   - mark a counter as deleted
+ *   - add a counter to lockless dellist
  *   - spawn thread to do the actual del
  *
  * - dump (user context)
@@ -171,9 +171,8 @@ static void mlx5_fc_stats_work(struct work_struct *work)
 						 priv.fc_stats.work.work);
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
 	struct llist_node *tmplist = llist_del_all(&fc_stats->addlist);
+	struct mlx5_fc *counter = NULL, *last = NULL, *tmp;
 	unsigned long now = jiffies;
-	struct mlx5_fc *counter = NULL;
-	struct mlx5_fc *last = NULL;
 	struct rb_node *node;
 
 	if (tmplist || !RB_EMPTY_ROOT(&fc_stats->counters))
@@ -183,26 +182,17 @@ static void mlx5_fc_stats_work(struct work_struct *work)
 	llist_for_each_entry(counter, tmplist, addlist)
 		mlx5_fc_stats_insert(&fc_stats->counters, counter);
 
-	node = rb_first(&fc_stats->counters);
-	while (node) {
-		counter = rb_entry(node, struct mlx5_fc, node);
-
-		node = rb_next(node);
-
-		if (counter->deleted) {
-			rb_erase(&counter->node, &fc_stats->counters);
-
-			mlx5_cmd_fc_free(dev, counter->id);
-
-			kfree(counter);
-			continue;
-		}
+	tmplist = llist_del_all(&fc_stats->dellist);
+	llist_for_each_entry_safe(counter, tmp, tmplist, dellist) {
+		rb_erase(&counter->node, &fc_stats->counters);
 
-		last = counter;
+		mlx5_free_fc(dev, counter);
 	}
 
-	if (time_before(now, fc_stats->next_query) || !last)
+	node = rb_last(&fc_stats->counters);
+	if (time_before(now, fc_stats->next_query) || !node)
 		return;
+	last = rb_entry(node, struct mlx5_fc, node);
 
 	node = rb_first(&fc_stats->counters);
 	while (node) {
@@ -254,13 +244,12 @@ void mlx5_fc_destroy(struct mlx5_core_dev *dev, struct mlx5_fc *counter)
 		return;
 
 	if (counter->aging) {
-		counter->deleted = true;
+		llist_add(&counter->dellist, &fc_stats->dellist);
 		mod_delayed_work(fc_stats->wq, &fc_stats->work, 0);
 		return;
 	}
 
-	mlx5_cmd_fc_free(dev, counter->id);
-	kfree(counter);
+	mlx5_free_fc(dev, counter);
 }
 EXPORT_SYMBOL(mlx5_fc_destroy);
 
@@ -270,6 +259,7 @@ int mlx5_init_fc_stats(struct mlx5_core_dev *dev)
 
 	fc_stats->counters = RB_ROOT;
 	init_llist_head(&fc_stats->addlist);
+	init_llist_head(&fc_stats->dellist);
 
 	fc_stats->wq = create_singlethread_workqueue("mlx5_fc");
 	if (!fc_stats->wq)
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index c00549293982..4b53ac64004b 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -585,6 +585,7 @@ struct mlx5_irq_info {
 struct mlx5_fc_stats {
 	struct rb_root counters;
 	struct llist_head addlist;
+	struct llist_head dellist;
 
 	struct workqueue_struct *wq;
 	struct delayed_work work;
-- 
2.17.1

^ permalink raw reply related

* [net-next 4/9] net/mlx5: Add flow counters idr
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Vlad Buslov, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>

Previous patch in series changed flow counter storage structure from
rb_tree to linked list in order to improve flow counter traversal
performance. The drawback of such solution is that flow counter lookup by
id becomes linear in complexity.

Store pointers to flow counters in idr in order to improve lookup
performance to logarithmic again. Idr is non-intrusive data structure and
doesn't require extending flow counter struct with new elements. This means
that idr can be used for lookup, while linked list from previous patch is
used for traversal, and struct mlx5_fc size is <= 2 cache lines.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 37 +++++++++++++++++--
 include/linux/mlx5/driver.h                   |  2 +
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index 90ebfee37508..09206c4acd9a 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -77,13 +77,18 @@ static struct list_head *mlx5_fc_counters_lookup_next(struct mlx5_core_dev *dev,
 						      u32 id)
 {
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
+	unsigned long next_id = (unsigned long)id + 1;
 	struct mlx5_fc *counter;
 
-	list_for_each_entry(counter, &fc_stats->counters, list)
-		if (counter->id > id)
-			return &counter->list;
+	rcu_read_lock();
+	/* skip counters that are in idr, but not yet in counters list */
+	while ((counter = idr_get_next_ul(&fc_stats->counters_idr,
+					  &next_id)) != NULL &&
+	       list_empty(&counter->list))
+		next_id++;
+	rcu_read_unlock();
 
-	return &fc_stats->counters;
+	return counter ? &counter->list : &fc_stats->counters;
 }
 
 static void mlx5_fc_stats_insert(struct mlx5_core_dev *dev,
@@ -214,15 +219,29 @@ struct mlx5_fc *mlx5_fc_create(struct mlx5_core_dev *dev, bool aging)
 	counter = kzalloc(sizeof(*counter), GFP_KERNEL);
 	if (!counter)
 		return ERR_PTR(-ENOMEM);
+	INIT_LIST_HEAD(&counter->list);
 
 	err = mlx5_cmd_fc_alloc(dev, &counter->id);
 	if (err)
 		goto err_out;
 
 	if (aging) {
+		u32 id = counter->id;
+
 		counter->cache.lastuse = jiffies;
 		counter->aging = true;
 
+		idr_preload(GFP_KERNEL);
+		spin_lock(&fc_stats->counters_idr_lock);
+
+		err = idr_alloc_u32(&fc_stats->counters_idr, counter, &id, id,
+				    GFP_NOWAIT);
+
+		spin_unlock(&fc_stats->counters_idr_lock);
+		idr_preload_end();
+		if (err)
+			goto err_out_alloc;
+
 		llist_add(&counter->addlist, &fc_stats->addlist);
 
 		mod_delayed_work(fc_stats->wq, &fc_stats->work, 0);
@@ -230,6 +249,8 @@ struct mlx5_fc *mlx5_fc_create(struct mlx5_core_dev *dev, bool aging)
 
 	return counter;
 
+err_out_alloc:
+	mlx5_cmd_fc_free(dev, counter->id);
 err_out:
 	kfree(counter);
 
@@ -245,6 +266,10 @@ void mlx5_fc_destroy(struct mlx5_core_dev *dev, struct mlx5_fc *counter)
 		return;
 
 	if (counter->aging) {
+		spin_lock(&fc_stats->counters_idr_lock);
+		WARN_ON(!idr_remove(&fc_stats->counters_idr, counter->id));
+		spin_unlock(&fc_stats->counters_idr_lock);
+
 		llist_add(&counter->dellist, &fc_stats->dellist);
 		mod_delayed_work(fc_stats->wq, &fc_stats->work, 0);
 		return;
@@ -258,6 +283,8 @@ int mlx5_init_fc_stats(struct mlx5_core_dev *dev)
 {
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
 
+	spin_lock_init(&fc_stats->counters_idr_lock);
+	idr_init(&fc_stats->counters_idr);
 	INIT_LIST_HEAD(&fc_stats->counters);
 	init_llist_head(&fc_stats->addlist);
 	init_llist_head(&fc_stats->dellist);
@@ -283,6 +310,8 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 	destroy_workqueue(dev->priv.fc_stats.wq);
 	dev->priv.fc_stats.wq = NULL;
 
+	idr_destroy(&fc_stats->counters_idr);
+
 	tmplist = llist_del_all(&fc_stats->addlist);
 	llist_for_each_entry_safe(counter, tmp, tmplist, addlist)
 		mlx5_free_fc(dev, counter);
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 61bed33e6675..2a0c845f6bdb 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -583,6 +583,8 @@ struct mlx5_irq_info {
 };
 
 struct mlx5_fc_stats {
+	spinlock_t counters_idr_lock; /* protects counters_idr */
+	struct idr counters_idr;
 	struct list_head counters;
 	struct llist_head addlist;
 	struct llist_head dellist;
-- 
2.17.1

^ permalink raw reply related

* [net-next 1/9] net/mlx5: Change flow counters addlist type to single linked list
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Vlad Buslov, Saeed Mahameed
In-Reply-To: <20180906043331.31958-1-saeedm@mellanox.com>

From: Vlad Buslov <vladbu@mellanox.com>

In order to prevent flow counters stats work function from traversing whole
flow counters tree while searching for deleted flow counters, new list to
store deleted flow counters will be added to struct mlx5_fc_stats. However,
the flow counter structure itself has no space left to store any more data
in first cache line. To free space that is needed to store additional list
node, convert current addlist double linked list (two pointers per node) to
atomic single linked list (one pointer per node).

Lockless NULL-terminated single linked list data type doesn't require any
additional external synchronization for operations used by flow counters
module (add single new element, remove all elements from list and traverse
them). Remove addlist_lock that is no longer needed.

Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Amir Vadai <amir@vadai.me>
Reviewed-by: Paul Blakey <paulb@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/fs_core.h |  3 +-
 .../ethernet/mellanox/mlx5/core/fs_counters.c | 45 +++++++++----------
 include/linux/mlx5/driver.h                   |  4 +-
 3 files changed, 23 insertions(+), 29 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
index 32070e5d993d..f68590291e0c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
@@ -36,6 +36,7 @@
 #include <linux/refcount.h>
 #include <linux/mlx5/fs.h>
 #include <linux/rhashtable.h>
+#include <linux/llist.h>
 
 enum fs_node_type {
 	FS_TYPE_NAMESPACE,
@@ -139,7 +140,7 @@ struct mlx5_fc_cache {
 
 struct mlx5_fc {
 	struct rb_node node;
-	struct list_head list;
+	struct llist_node addlist;
 
 	/* last{packets,bytes} members are used when calculating the delta since
 	 * last reading
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
index 58af6be13dfa..d996d6cf9e19 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
@@ -52,7 +52,9 @@
  * access to counter list:
  * - create (user context)
  *   - mlx5_fc_create() only adds to an addlist to be used by
- *     mlx5_fc_stats_query_work(). addlist is protected by a spinlock.
+ *     mlx5_fc_stats_query_work(). addlist is a lockless single linked list
+ *     that doesn't require any additional synchronization when adding single
+ *     node.
  *   - spawn thread to do the actual destroy
  *
  * - destroy (user context)
@@ -156,28 +158,29 @@ static struct rb_node *mlx5_fc_stats_query(struct mlx5_core_dev *dev,
 	return node;
 }
 
+static void mlx5_free_fc(struct mlx5_core_dev *dev,
+			 struct mlx5_fc *counter)
+{
+	mlx5_cmd_fc_free(dev, counter->id);
+	kfree(counter);
+}
+
 static void mlx5_fc_stats_work(struct work_struct *work)
 {
 	struct mlx5_core_dev *dev = container_of(work, struct mlx5_core_dev,
 						 priv.fc_stats.work.work);
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
+	struct llist_node *tmplist = llist_del_all(&fc_stats->addlist);
 	unsigned long now = jiffies;
 	struct mlx5_fc *counter = NULL;
 	struct mlx5_fc *last = NULL;
 	struct rb_node *node;
-	LIST_HEAD(tmplist);
-
-	spin_lock(&fc_stats->addlist_lock);
 
-	list_splice_tail_init(&fc_stats->addlist, &tmplist);
-
-	if (!list_empty(&tmplist) || !RB_EMPTY_ROOT(&fc_stats->counters))
+	if (tmplist || !RB_EMPTY_ROOT(&fc_stats->counters))
 		queue_delayed_work(fc_stats->wq, &fc_stats->work,
 				   fc_stats->sampling_interval);
 
-	spin_unlock(&fc_stats->addlist_lock);
-
-	list_for_each_entry(counter, &tmplist, list)
+	llist_for_each_entry(counter, tmplist, addlist)
 		mlx5_fc_stats_insert(&fc_stats->counters, counter);
 
 	node = rb_first(&fc_stats->counters);
@@ -229,9 +232,7 @@ struct mlx5_fc *mlx5_fc_create(struct mlx5_core_dev *dev, bool aging)
 		counter->cache.lastuse = jiffies;
 		counter->aging = true;
 
-		spin_lock(&fc_stats->addlist_lock);
-		list_add(&counter->list, &fc_stats->addlist);
-		spin_unlock(&fc_stats->addlist_lock);
+		llist_add(&counter->addlist, &fc_stats->addlist);
 
 		mod_delayed_work(fc_stats->wq, &fc_stats->work, 0);
 	}
@@ -268,8 +269,7 @@ int mlx5_init_fc_stats(struct mlx5_core_dev *dev)
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
 
 	fc_stats->counters = RB_ROOT;
-	INIT_LIST_HEAD(&fc_stats->addlist);
-	spin_lock_init(&fc_stats->addlist_lock);
+	init_llist_head(&fc_stats->addlist);
 
 	fc_stats->wq = create_singlethread_workqueue("mlx5_fc");
 	if (!fc_stats->wq)
@@ -284,6 +284,7 @@ int mlx5_init_fc_stats(struct mlx5_core_dev *dev)
 void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 {
 	struct mlx5_fc_stats *fc_stats = &dev->priv.fc_stats;
+	struct llist_node *tmplist;
 	struct mlx5_fc *counter;
 	struct mlx5_fc *tmp;
 	struct rb_node *node;
@@ -292,13 +293,9 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 	destroy_workqueue(dev->priv.fc_stats.wq);
 	dev->priv.fc_stats.wq = NULL;
 
-	list_for_each_entry_safe(counter, tmp, &fc_stats->addlist, list) {
-		list_del(&counter->list);
-
-		mlx5_cmd_fc_free(dev, counter->id);
-
-		kfree(counter);
-	}
+	tmplist = llist_del_all(&fc_stats->addlist);
+	llist_for_each_entry_safe(counter, tmp, tmplist, addlist)
+		mlx5_free_fc(dev, counter);
 
 	node = rb_first(&fc_stats->counters);
 	while (node) {
@@ -308,9 +305,7 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev *dev)
 
 		rb_erase(&counter->node, &fc_stats->counters);
 
-		mlx5_cmd_fc_free(dev, counter->id);
-
-		kfree(counter);
+		mlx5_free_fc(dev, counter);
 	}
 }
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 7a452716de4b..c00549293982 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -584,9 +584,7 @@ struct mlx5_irq_info {
 
 struct mlx5_fc_stats {
 	struct rb_root counters;
-	struct list_head addlist;
-	/* protect addlist add/splice operations */
-	spinlock_t addlist_lock;
+	struct llist_head addlist;
 
 	struct workqueue_struct *wq;
 	struct delayed_work work;
-- 
2.17.1

^ permalink raw reply related

* [pull request][net-next 0/9] Mellanox, mlx5 ethernet updates 2018-09-05
From: Saeed Mahameed @ 2018-09-06  4:33 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed

Hi Dave,

This pull request provides some updates to mlx5 ethernet driver.

For more information please see tag log below.

Please pull and let me know if there's any problem.

Thanks,
Saeed.

---

The following changes since commit 05dcc71298643256948a2e17db7dbecc748719d2:

  net: lan743x_ptp: make function lan743x_ptp_set_sync_ts_insert() static (2018-09-05 08:07:05 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git tags/mlx5e-updates-2018-09-05

for you to fetch changes up to fe1dc069990c1f290ef6b99adb46332c03258f38:

  net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets (2018-09-05 21:14:57 -0700)

----------------------------------------------------------------
mlx5e-updates-2018-09-05

This series provides updates to mlx5 ethernet driver.

1) Starting with a four patches series to optimize flow counters updates,
>From Vlad Buslov:
==============================================

By default mlx5 driver updates cached counters each second. Update function
consumes noticeable amount of CPU resources. The goal of this patch series
is to optimize update function.

Investigation revealed following bottlenecks in fs counters
implementation:
 1) Update code(scheduled each second) iterates over all counters twice.
 (first for finding and deleting counters that are marked for deletion,
 second iteration is for actually updating the counters)
 2) Counters are stored in rb tree. Linear iteration over all rb tree
 elements(rb_next in profiling data) consumed ~65% of time spent in
 update function.

Following optimizations were implemented:
 1) Instead of just marking counters for deletion, store them in
 standalone list. This removes first iteration over whole counters tree.
 2) Store counters in sorted list to optimize traversing them and remove
 calls to rb_next.

First implementation of these changes caused degradation of performance,
instead of improving it. Investigation revealed that there first cache
line of struct mlx5_fc is full and adding anything to it causes amount
of cache misses to double. To mitigate that, following refactorings were
implemented:
 - Change 'addlist' list type from double linked to single linked. This
 allowes to get free space for one additional pointer that is used to
 store deletion list(optimization 1)
 - Substitute rb tree with idr. Idr is non-intrusive data structure and
 doesn't require adding any new members to struct mlx5_fc. Use free
 space that became available for double linked sorted list that is used
 for traversing all counters. (optimization 2)

Described changes reduced CPU time spent in mlx5_fc_stats_work from 70%
to 44%. (global perf profile mode)
============================================

The rest of the series are misc updates:

2) From Kamal, Move mlx5e_priv_flags into en_ethtool.c, to avoid a
compilation warning.

3) From Roi Dayan, Move Q counters allocation and drop RQ to init_rx profile
function to avoid allocating Q counters when not required.

4) From Shay Agroskin, Replace PTP clock lock from RW lock to seq lock.
Almost double the packet rate when timestamping is active on multiple TX
queues.

5) From: Natali Shechtman, set ECN for received packets using CQE indication.

6) From: Alaa Hleihel, don't set CHECKSUM_COMPLETE on SCTP packets.
CHECKSUM_COMPLETE is not applicable to SCTP protocol.

----------------------------------------------------------------
Alaa Hleihel (1):
      net/mlx5e: don't set CHECKSUM_COMPLETE on SCTP packets

Kamal Heib (1):
      net/mlx5e: Move mlx5e_priv_flags into en_ethtool.c

Natali Shechtman (1):
      net/mlx5e: Set ECN for received packets using CQE indication

Roi Dayan (1):
      net/mlx5e: Move Q counters allocation and drop RQ to init_rx

Shay Agroskin (1):
      net/mlx5e: Replace PTP clock lock from RW lock to seq lock

Vlad Buslov (4):
      net/mlx5: Change flow counters addlist type to single linked list
      net/mlx5: Add new list to store deleted flow counters
      net/mlx5: Store flow counters in a list
      net/mlx5: Add flow counters idr

 drivers/net/ethernet/mellanox/mlx5/core/en.h       |  13 +-
 .../net/ethernet/mellanox/mlx5/core/en_ethtool.c   |   7 +
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c  |  45 +++--
 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  12 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_rx.c    |  47 +++++-
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.c |   3 +
 drivers/net/ethernet/mellanox/mlx5/core/en_stats.h |   2 +
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.h  |   5 +-
 .../net/ethernet/mellanox/mlx5/core/fs_counters.c  | 184 +++++++++++----------
 .../net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c  |  17 +-
 .../net/ethernet/mellanox/mlx5/core/lib/clock.c    |  34 ++--
 .../net/ethernet/mellanox/mlx5/core/lib/clock.h    |   8 +-
 include/linux/mlx5/driver.h                        |  11 +-
 13 files changed, 235 insertions(+), 153 deletions(-)

^ permalink raw reply

* [PATCH] mac80211: fix TX status reporting for ieee80211s
From: Yuan-Chi Pang @ 2018-09-06  8:57 UTC (permalink / raw)
  Cc: fu3mo6goo, Johannes Berg, David S. Miller, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1535963610.3437.10.camel@sipsolutions.net>

TX status reporting to ieee80211s is through ieee80211s_update_metric.
There are two problems about ieee80211s_update_metric:

1. The purpose is to estimate the fail probability
to a specific link. No need to restrict to data frame.

2. Current implementation does not work if wireless driver does not
pass tx_status with skb.

Fix this by removing ieee80211_is_data condition, passing
ieee80211_tx_status directly to ieee80211s_update_metric, and
putting it in both __ieee80211_tx_status and ieee80211_tx_status_ext.

Signed-off-by: Yuan-Chi Pang <fu3mo6goo@gmail.com>
---
 net/mac80211/mesh.h      | 2 +-
 net/mac80211/mesh_hwmp.c | 8 ++------
 net/mac80211/status.c    | 4 +++-
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/net/mac80211/mesh.h b/net/mac80211/mesh.h
index ee56f18..7ffe56a 100644
--- a/net/mac80211/mesh.h
+++ b/net/mac80211/mesh.h
@@ -217,7 +217,7 @@ void mesh_rmc_free(struct ieee80211_sub_if_data *sdata);
 int mesh_rmc_init(struct ieee80211_sub_if_data *sdata);
 void ieee80211s_init(void);
 void ieee80211s_update_metric(struct ieee80211_local *local,
-			      struct sta_info *sta, struct sk_buff *skb);
+			 struct sta_info *sta, struct ieee80211_tx_status *st);
 void ieee80211_mesh_init_sdata(struct ieee80211_sub_if_data *sdata);
 void ieee80211_mesh_teardown_sdata(struct ieee80211_sub_if_data *sdata);
 int ieee80211_start_mesh(struct ieee80211_sub_if_data *sdata);
diff --git a/net/mac80211/mesh_hwmp.c b/net/mac80211/mesh_hwmp.c
index daf9db3..4286673 100644
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -295,15 +295,11 @@ int mesh_path_error_tx(struct ieee80211_sub_if_data *sdata,
 }
 
 void ieee80211s_update_metric(struct ieee80211_local *local,
-		struct sta_info *sta, struct sk_buff *skb)
+		struct sta_info *sta, struct ieee80211_tx_status *st)
 {
-	struct ieee80211_tx_info *txinfo = IEEE80211_SKB_CB(skb);
-	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *) skb->data;
+	struct ieee80211_tx_info *txinfo = st->info;
 	int failed;
 
-	if (!ieee80211_is_data(hdr->frame_control))
-		return;
-
 	failed = !(txinfo->flags & IEEE80211_TX_STAT_ACK);
 
 	/* moving average, scaled to 100.
diff --git a/net/mac80211/status.c b/net/mac80211/status.c
index 9a6d720..dfcd419 100644
--- a/net/mac80211/status.c
+++ b/net/mac80211/status.c
@@ -811,7 +811,7 @@ static void __ieee80211_tx_status(struct ieee80211_hw *hw,
 
 		rate_control_tx_status(local, sband, status);
 		if (ieee80211_vif_is_mesh(&sta->sdata->vif))
-			ieee80211s_update_metric(local, sta, skb);
+			ieee80211s_update_metric(local, sta, status);
 
 		if (!(info->flags & IEEE80211_TX_CTL_INJECTED) && acked)
 			ieee80211_frame_acked(sta, skb);
@@ -972,6 +972,8 @@ void ieee80211_tx_status_ext(struct ieee80211_hw *hw,
 		}
 
 		rate_control_tx_status(local, sband, status);
+		if (ieee80211_vif_is_mesh(&sta->sdata->vif))
+			ieee80211s_update_metric(local, sta, status);
 	}
 
 	if (acked || noack_success) {
-- 
2.7.4

^ permalink raw reply related

* general protection fault in tcp_cleanup_ulp
From: syzbot @ 2018-09-06  8:50 UTC (permalink / raw)
  To: davem, edumazet, kuznet, linux-kernel, netdev, syzkaller-bugs,
	yoshfuji

Hello,

syzbot found the following crash on:

HEAD commit:    b36fdc6853a3 Merge tag 'gpio-v4.19-2' of git://git.kernel...
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1589eb7a400000
kernel config:  https://syzkaller.appspot.com/x/.config?x=4c7e83258d6e0156
dashboard link: https://syzkaller.appspot.com/bug?extid=0b3ccd4f62dac2cf3a7d
compiler:       gcc (GCC) 8.0.1 20180413 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+0b3ccd4f62dac2cf3a7d@syzkaller.appspotmail.com

ISOFS: Unable to identify CD-ROM format.
ISOFS: Unable to identify CD-ROM format.
ISOFS: Unable to identify CD-ROM format.
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
CPU: 1 PID: 27135 Comm: syz-executor0 Not tainted 4.19.0-rc2+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
RIP: 0010:tcp_cleanup_ulp+0xbe/0x140 net/ipv4/tcp_ulp.c:131
Code: 02 00 0f 85 8a 00 00 00 49 8b 9c 24 88 06 00 00 e8 b7 f8 cd fb 48 8d  
7b 38 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75  
51 48 8b 7b 38 e8 93 5c c2 fb 4c 89 ea 48 b8 00 00
RSP: 0018:ffff8801c754fb80 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90001ed8000
RDX: 0000000000000007 RSI: ffffffff85aed689 RDI: 0000000000000038
RBP: ffff8801c754fba0 R08: 1ffff10038ea9f3c R09: ffffed003b6246de
R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801d7db0440
R13: ffff8801d7db0ac8 R14: ffffffff819162e0 R15: dffffc0000000000
FS:  00007fb062d18700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe020cfbf0 CR3: 00000001ad873000 CR4: 00000000001426e0
Call Trace:
  smap_release_sock+0x2e5/0x380 kernel/bpf/sockmap.c:1474
  sock_map_delete_elem+0x3e9/0x5a0 kernel/bpf/sockmap.c:1813
  map_delete_elem+0x32e/0x4e0 kernel/bpf/syscall.c:882
  __do_sys_bpf kernel/bpf/syscall.c:2366 [inline]
  __se_sys_bpf kernel/bpf/syscall.c:2334 [inline]
  __x64_sys_bpf+0x342/0x510 kernel/bpf/syscall.c:2334
  do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x457099
Code: fd b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fb062d17c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007fb062d186d4 RCX: 0000000000457099
RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000003
RBP: 00000000009300a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 00000000004cb908 R14: 00000000004c32ca R15: 0000000000000000
Modules linked in:
Dumping ftrace buffer:
---------------------------------
CPU:1 [LOST 1631 EVENTS]
syz-exec-9417    1...2 282913244us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913249us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913254us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913260us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913265us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913270us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913275us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913280us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913285us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913290us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913295us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913313us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913318us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913323us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913327us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913332us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913337us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913342us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913347us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913363us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913368us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913387us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913392us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913409us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913414us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913419us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913424us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913428us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913434us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913439us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913444us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913449us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913454us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913459us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913464us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913469us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913474us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913479us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913484us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913488us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913493us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913498us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913503us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913508us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913513us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913518us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913523us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913528us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913533us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913538us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913543us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913548us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913553us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913558us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913563us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913568us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913573us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913578us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913583us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913588us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913593us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913598us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913603us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913608us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913613us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913617us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913622us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913628us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913633us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913637us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913642us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913647us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913659us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913665us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913670us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913675us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913680us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913685us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913690us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913695us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913700us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913705us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913710us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913715us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913720us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913725us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913730us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913735us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913740us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913745us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913750us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913755us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913760us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913764us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913769us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913774us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913779us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913784us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913789us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913794us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913799us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913804us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913811us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913816us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913821us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913826us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913831us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913836us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913841us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913846us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913851us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913856us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913861us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913866us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913871us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913876us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913881us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913885us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913890us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913895us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913905us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913910us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913915us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913920us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913931us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282913936us : 0: u00000000a32ec4ab	
syz-exec-9417    1.N.2 282914013us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914059us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914065us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914069us : 0: u00000000a32ec4ab	
syz-exec-9417    1.N.2 282914133us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914255us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914261us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914266us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914271us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914276us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914281us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914286us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914291us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914295us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914300us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914304us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914308us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914313us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914317us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914321us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914326us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914331us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914336us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914341us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914346us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914351us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914356us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914361us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914366us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914370us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914375us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914380us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914385us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914390us : 0: u00000000a32ec4ab	
syz-exec-9417    1...2 282914395us : 0: u00000000a32ec4ab	
---------------------------------
---[ end trace 9de9b5e392d96e98 ]---
RIP: 0010:tcp_cleanup_ulp+0xbe/0x140 net/ipv4/tcp_ulp.c:131
Code: 02 00 0f 85 8a 00 00 00 49 8b 9c 24 88 06 00 00 e8 b7 f8 cd fb 48 8d  
7b 38 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75  
51 48 8b 7b 38 e8 93 5c c2 fb 4c 89 ea 48 b8 00 00
RSP: 0018:ffff8801c754fb80 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffffc90001ed8000
RDX: 0000000000000007 RSI: ffffffff85aed689 RDI: 0000000000000038
RBP: ffff8801c754fba0 R08: 1ffff10038ea9f3c R09: ffffed003b6246de
R10: 0000000000000003 R11: 0000000000000001 R12: ffff8801d7db0440
R13: ffff8801d7db0ac8 R14: ffffffff819162e0 R15: dffffc0000000000
FS:  00007fb062d18700(0000) GS:ffff8801db100000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe020cfbf0 CR3: 00000001ad873000 CR4: 00000000001426e0


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply

* [net 09/10] net/mlx5e: Ethtool steering, fix udp source port value
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

Copy and paste bug was introduced in the offending patch.
We need to write udp source port value into the headers value and not
headers criteria "mask".

Fixes: 142644f8a1f8 ("net/mlx5e: Ethtool steering flow parsing refactoring")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
index 75bb981e00b7..41cde926cdab 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_fs_ethtool.c
@@ -191,7 +191,7 @@ set_udp(void *headers_c, void *headers_v, __be16 psrc_m, __be16 psrc_v,
 {
 	if (psrc_m) {
 		MLX5E_FTE_SET(headers_c, udp_sport, 0xffff);
-		MLX5E_FTE_SET(headers_c, udp_sport, ntohs(psrc_v));
+		MLX5E_FTE_SET(headers_v, udp_sport, ntohs(psrc_v));
 	}
 
 	if (pdst_m) {
-- 
2.17.1

^ permalink raw reply related

* [net 10/10] net/mlx5: Fix possible deadlock from lockdep when adding fte to fg
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Roi Dayan, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Roi Dayan <roid@mellanox.com>

This is a false positive report due to incorrect nested lock
annotations as we lock multiple fgs with the same subclass.
Instead of locking all fgs only lock the one being used as was
done before.

Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 .../net/ethernet/mellanox/mlx5/core/fs_core.c | 74 +++++++++----------
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index 384b560f2a93..37d114c668b7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1578,6 +1578,33 @@ static u64 matched_fgs_get_version(struct list_head *match_head)
 	return version;
 }
 
+static struct fs_fte *
+lookup_fte_locked(struct mlx5_flow_group *g,
+		  u32 *match_value,
+		  bool take_write)
+{
+	struct fs_fte *fte_tmp;
+
+	if (take_write)
+		nested_down_write_ref_node(&g->node, FS_LOCK_PARENT);
+	else
+		nested_down_read_ref_node(&g->node, FS_LOCK_PARENT);
+	fte_tmp = rhashtable_lookup_fast(&g->ftes_hash, match_value,
+					 rhash_fte);
+	if (!fte_tmp || !tree_get_node(&fte_tmp->node)) {
+		fte_tmp = NULL;
+		goto out;
+	}
+
+	nested_down_write_ref_node(&fte_tmp->node, FS_LOCK_CHILD);
+out:
+	if (take_write)
+		up_write_ref_node(&g->node);
+	else
+		up_read_ref_node(&g->node);
+	return fte_tmp;
+}
+
 static struct mlx5_flow_handle *
 try_add_to_existing_fg(struct mlx5_flow_table *ft,
 		       struct list_head *match_head,
@@ -1600,10 +1627,6 @@ try_add_to_existing_fg(struct mlx5_flow_table *ft,
 	if (IS_ERR(fte))
 		return  ERR_PTR(-ENOMEM);
 
-	list_for_each_entry(iter, match_head, list) {
-		nested_down_read_ref_node(&iter->g->node, FS_LOCK_PARENT);
-	}
-
 search_again_locked:
 	version = matched_fgs_get_version(match_head);
 	/* Try to find a fg that already contains a matching fte */
@@ -1611,20 +1634,9 @@ try_add_to_existing_fg(struct mlx5_flow_table *ft,
 		struct fs_fte *fte_tmp;
 
 		g = iter->g;
-		fte_tmp = rhashtable_lookup_fast(&g->ftes_hash, spec->match_value,
-						 rhash_fte);
-		if (!fte_tmp || !tree_get_node(&fte_tmp->node))
+		fte_tmp = lookup_fte_locked(g, spec->match_value, take_write);
+		if (!fte_tmp)
 			continue;
-
-		nested_down_write_ref_node(&fte_tmp->node, FS_LOCK_CHILD);
-		if (!take_write) {
-			list_for_each_entry(iter, match_head, list)
-				up_read_ref_node(&iter->g->node);
-		} else {
-			list_for_each_entry(iter, match_head, list)
-				up_write_ref_node(&iter->g->node);
-		}
-
 		rule = add_rule_fg(g, spec->match_value,
 				   flow_act, dest, dest_num, fte_tmp);
 		up_write_ref_node(&fte_tmp->node);
@@ -1633,19 +1645,6 @@ try_add_to_existing_fg(struct mlx5_flow_table *ft,
 		return rule;
 	}
 
-	/* No group with matching fte found. Try to add a new fte to any
-	 * matching fg.
-	 */
-
-	if (!take_write) {
-		list_for_each_entry(iter, match_head, list)
-			up_read_ref_node(&iter->g->node);
-		list_for_each_entry(iter, match_head, list)
-			nested_down_write_ref_node(&iter->g->node,
-						   FS_LOCK_PARENT);
-		take_write = true;
-	}
-
 	/* Check the ft version, for case that new flow group
 	 * was added while the fgs weren't locked
 	 */
@@ -1657,27 +1656,30 @@ try_add_to_existing_fg(struct mlx5_flow_table *ft,
 	/* Check the fgs version, for case the new FTE with the
 	 * same values was added while the fgs weren't locked
 	 */
-	if (version != matched_fgs_get_version(match_head))
+	if (version != matched_fgs_get_version(match_head)) {
+		take_write = true;
 		goto search_again_locked;
+	}
 
 	list_for_each_entry(iter, match_head, list) {
 		g = iter->g;
 
 		if (!g->node.active)
 			continue;
+
+		nested_down_write_ref_node(&g->node, FS_LOCK_PARENT);
+
 		err = insert_fte(g, fte);
 		if (err) {
+			up_write_ref_node(&g->node);
 			if (err == -ENOSPC)
 				continue;
-			list_for_each_entry(iter, match_head, list)
-				up_write_ref_node(&iter->g->node);
 			kmem_cache_free(steering->ftes_cache, fte);
 			return ERR_PTR(err);
 		}
 
 		nested_down_write_ref_node(&fte->node, FS_LOCK_CHILD);
-		list_for_each_entry(iter, match_head, list)
-			up_write_ref_node(&iter->g->node);
+		up_write_ref_node(&g->node);
 		rule = add_rule_fg(g, spec->match_value,
 				   flow_act, dest, dest_num, fte);
 		up_write_ref_node(&fte->node);
@@ -1686,8 +1688,6 @@ try_add_to_existing_fg(struct mlx5_flow_table *ft,
 	}
 	rule = ERR_PTR(-ENOENT);
 out:
-	list_for_each_entry(iter, match_head, list)
-		up_write_ref_node(&iter->g->node);
 	kmem_cache_free(steering->ftes_cache, fte);
 	return rule;
 }
-- 
2.17.1

^ permalink raw reply related

* [net 08/10] net/mlx5: Check for error in mlx5_attach_interface
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Huy Nguyen, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Huy Nguyen <huyn@mellanox.com>

Currently, mlx5_attach_interface does not check for error
after calling intf->attach or intf->add. When these two calls
fails, the client is not initialized and will cause issues such as
kernel panic on invalid address in the teardown path (mlx5_detach_interface)

Fixes: 737a234bb638 ("net/mlx5: Introduce attach/detach to interface API")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index ada723bd91b6..37ba7c78859d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -132,11 +132,11 @@ void mlx5_add_device(struct mlx5_interface *intf, struct mlx5_priv *priv)
 	delayed_event_start(priv);
 
 	dev_ctx->context = intf->add(dev);
-	set_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state);
-	if (intf->attach)
-		set_bit(MLX5_INTERFACE_ATTACHED, &dev_ctx->state);
-
 	if (dev_ctx->context) {
+		set_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state);
+		if (intf->attach)
+			set_bit(MLX5_INTERFACE_ATTACHED, &dev_ctx->state);
+
 		spin_lock_irq(&priv->ctx_lock);
 		list_add_tail(&dev_ctx->list, &priv->ctx_list);
 
@@ -211,12 +211,17 @@ static void mlx5_attach_interface(struct mlx5_interface *intf, struct mlx5_priv
 	if (intf->attach) {
 		if (test_bit(MLX5_INTERFACE_ATTACHED, &dev_ctx->state))
 			goto out;
-		intf->attach(dev, dev_ctx->context);
+		if (intf->attach(dev, dev_ctx->context))
+			goto out;
+
 		set_bit(MLX5_INTERFACE_ATTACHED, &dev_ctx->state);
 	} else {
 		if (test_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state))
 			goto out;
 		dev_ctx->context = intf->add(dev);
+		if (!dev_ctx->context)
+			goto out;
+
 		set_bit(MLX5_INTERFACE_ADDED, &dev_ctx->state);
 	}
 
-- 
2.17.1

^ permalink raw reply related

* [net 05/10] net/mlx5: E-Switch, Fix memory leak when creating switchdev mode FDB tables
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Raed Salem, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Raed Salem <raeds@mellanox.com>

The memory allocated for the slow path table flow group input structure
was not freed upon successful return, fix that.

Fixes: 1967ce6ea5c8 ("net/mlx5: E-Switch, Refactor fast path FDB table creation in switchdev mode")
Signed-off-by: Raed Salem <raeds@mellanox.com>
Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index f72b5c9dcfe9..3028e8d90920 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -663,6 +663,7 @@ static int esw_create_offloads_fdb_tables(struct mlx5_eswitch *esw, int nvports)
 	if (err)
 		goto miss_rule_err;
 
+	kvfree(flow_group_in);
 	return 0;
 
 miss_rule_err:
-- 
2.17.1

^ permalink raw reply related

* [net 07/10] net/mlx5: Consider PCI domain in search for next dev
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Daniel Jurgens, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Daniel Jurgens <danielj@mellanox.com>

The PCI BDF is not unique. PCI domain must also be considered when
searching for the next physical device during lag setup. Example below:

mlx5_core 0000:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0000:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0001:01:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)
mlx5_core 0001:01:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(128) RxCqeCmprss(0)

Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
Reviewed-by: Aviv Heller <avivh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index b994b80d5714..ada723bd91b6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -391,16 +391,17 @@ void mlx5_remove_dev_by_protocol(struct mlx5_core_dev *dev, int protocol)
 		}
 }
 
-static u16 mlx5_gen_pci_id(struct mlx5_core_dev *dev)
+static u32 mlx5_gen_pci_id(struct mlx5_core_dev *dev)
 {
-	return (u16)((dev->pdev->bus->number << 8) |
+	return (u32)((pci_domain_nr(dev->pdev->bus) << 16) |
+		     (dev->pdev->bus->number << 8) |
 		     PCI_SLOT(dev->pdev->devfn));
 }
 
 /* Must be called with intf_mutex held */
 struct mlx5_core_dev *mlx5_get_next_phys_dev(struct mlx5_core_dev *dev)
 {
-	u16 pci_id = mlx5_gen_pci_id(dev);
+	u32 pci_id = mlx5_gen_pci_id(dev);
 	struct mlx5_core_dev *res = NULL;
 	struct mlx5_core_dev *tmp_dev;
 	struct mlx5_priv *priv;
-- 
2.17.1

^ permalink raw reply related

* [net 06/10] net/mlx5: Fix not releasing read lock when adding flow rules
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Roi Dayan, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Roi Dayan <roid@mellanox.com>

If building match list fg fails and we never jumped to
search_again_locked label then the function returned without
unlocking the read lock.

Fixes: bd71b08ec2ee ("net/mlx5: Support multiple updates of steering rules in parallel")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
index f418541af7cf..384b560f2a93 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
@@ -1726,6 +1726,8 @@ _mlx5_add_flow_rules(struct mlx5_flow_table *ft,
 	if (err) {
 		if (take_write)
 			up_write_ref_node(&ft->node);
+		else
+			up_read_ref_node(&ft->node);
 		return ERR_PTR(err);
 	}
 
-- 
2.17.1

^ permalink raw reply related

* [net 03/10] net/mlx5: Use u16 for Work Queue buffer fragment size
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>

Minimal stride size is 16.
Hence, the number of strides in a fragment (of PAGE_SIZE)
is <= PAGE_SIZE / 16 <= 4K.

u16 is sufficient to represent this.

Fixes: 388ca8be0037 ("IB/mlx5: Implement fragmented completion queue (CQ)")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/wq.c | 4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/wq.h | 2 +-
 include/linux/mlx5/driver.h                  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
index c8c315eb5128..d838af9539b1 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -39,9 +39,9 @@ u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq)
 	return (u32)wq->fbc.sz_m1 + 1;
 }
 
-u32 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq)
+u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq)
 {
-	return (u32)wq->fbc.frag_sz_m1 + 1;
+	return wq->fbc.frag_sz_m1 + 1;
 }
 
 u32 mlx5_cqwq_get_size(struct mlx5_cqwq *wq)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.h b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
index 2bd4c3184eba..3a1a170bb2d7 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.h
@@ -80,7 +80,7 @@ int mlx5_wq_cyc_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 		       void *wqc, struct mlx5_wq_cyc *wq,
 		       struct mlx5_wq_ctrl *wq_ctrl);
 u32 mlx5_wq_cyc_get_size(struct mlx5_wq_cyc *wq);
-u32 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq);
+u16 mlx5_wq_cyc_get_frag_size(struct mlx5_wq_cyc *wq);
 
 int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 		      void *qpc, struct mlx5_wq_qp *wq,
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index aa65f58c6610..3a1258fd8ac3 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -362,7 +362,7 @@ struct mlx5_frag_buf {
 struct mlx5_frag_buf_ctrl {
 	struct mlx5_frag_buf	frag_buf;
 	u32			sz_m1;
-	u32			frag_sz_m1;
+	u16			frag_sz_m1;
 	u32			strides_offset;
 	u8			log_sz;
 	u8			log_stride;
-- 
2.17.1

^ permalink raw reply related

* [net 02/10] net/mlx5: Fix debugfs cleanup in the device init/remove flow
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Jack Morgenstein, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Jack Morgenstein <jackm@dev.mellanox.co.il>

When initializing the device (procedure init_one), the driver
calls mlx5_pci_init to perform pci initialization. As part of this
initialization, mlx5_pci_init creates a debugfs directory.
If this creation fails, init_one aborts, returning failure to
the caller (which is the probe method caller).

The main reason for such a failure to occur is if the debugfs
directory already exists. This can happen if the last time
mlx5_pci_close was called, debugfs_remove (silently) failed due
to the debugfs directory not being empty.

Guarantee that such a debugfs_remove failure will not occur by
instead calling debugfs_remove_recursive in procedure mlx5_pci_close.

Fixes: 59211bd3b632 ("net/mlx5: Split the load/unload flow into hardware and software flows")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 739aad0a0b35..b5e9f664fc66 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -878,8 +878,10 @@ static int mlx5_pci_init(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
 	priv->numa_node = dev_to_node(&dev->pdev->dev);
 
 	priv->dbg_root = debugfs_create_dir(dev_name(&pdev->dev), mlx5_debugfs_root);
-	if (!priv->dbg_root)
+	if (!priv->dbg_root) {
+		dev_err(&pdev->dev, "Cannot create debugfs dir, aborting\n");
 		return -ENOMEM;
+	}
 
 	err = mlx5_pci_enable_device(dev);
 	if (err) {
@@ -928,7 +930,7 @@ static void mlx5_pci_close(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
 	pci_clear_master(dev->pdev);
 	release_bar(dev->pdev);
 	mlx5_pci_disable_device(dev);
-	debugfs_remove(priv->dbg_root);
+	debugfs_remove_recursive(priv->dbg_root);
 }
 
 static int mlx5_init_once(struct mlx5_core_dev *dev, struct mlx5_priv *priv)
-- 
2.17.1

^ permalink raw reply related

* [net 04/10] net/mlx5: Use u16 for Work Queue buffer strides offset
From: Saeed Mahameed @ 2018-09-06  4:09 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, Tariq Toukan, Saeed Mahameed
In-Reply-To: <20180906040952.29684-1-saeedm@mellanox.com>

From: Tariq Toukan <tariqt@mellanox.com>

Minimal stride size is 16.
Hence, the number of strides in a fragment (of PAGE_SIZE)
is <= PAGE_SIZE / 16 <= 4K.

u16 is sufficient to represent this.

Fixes: d7037ad73daa ("net/mlx5: Fix QP fragmented buffer allocation")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/wq.c | 2 +-
 include/linux/mlx5/driver.h                  | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/wq.c b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
index d838af9539b1..68e7f8df2a6d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/wq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/wq.c
@@ -138,7 +138,7 @@ int mlx5_wq_qp_create(struct mlx5_core_dev *mdev, struct mlx5_wq_param *param,
 		      void *qpc, struct mlx5_wq_qp *wq,
 		      struct mlx5_wq_ctrl *wq_ctrl)
 {
-	u32 sq_strides_offset;
+	u16 sq_strides_offset;
 	u32 rq_pg_remainder;
 	int err;
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 3a1258fd8ac3..66d94b4557cf 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -363,7 +363,7 @@ struct mlx5_frag_buf_ctrl {
 	struct mlx5_frag_buf	frag_buf;
 	u32			sz_m1;
 	u16			frag_sz_m1;
-	u32			strides_offset;
+	u16			strides_offset;
 	u8			log_sz;
 	u8			log_stride;
 	u8			log_frag_strides;
@@ -995,7 +995,7 @@ static inline u32 mlx5_base_mkey(const u32 key)
 }
 
 static inline void mlx5_fill_fbc_offset(u8 log_stride, u8 log_sz,
-					u32 strides_offset,
+					u16 strides_offset,
 					struct mlx5_frag_buf_ctrl *fbc)
 {
 	fbc->log_stride = log_stride;
-- 
2.17.1

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox