* [PATCH net-next V3 00/11] Mellanox 100G mlx5 driver receive path optimizations
From: Saeed Mahameed @ 2016-04-20 19:02 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Tal Alon, Tariq Toukan, Eran Ben Elisha,
Eric Dumazet, Jesper Dangaard Brouer, Saeed Mahameed
Hello Dave,
Changes from V2:
- Rebased to 46e7b8d8d53b ("net: dsa: kill circular reference with slave priv")
- Updated: ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
* Per Eric Dumazet comment we changed the driver memory handling scheme to
work with order-0 pages rather than order-5 via split_page().
* This means that an mlx5e rx skb can now hold one skb frag (or more, in case of HW LRO),
each pointing to a 4K order-0 page, rather than one frag pointing to an order-5 page.
- Updated: ("net/mlx5e: Add fragmented memory support for RX multi packet WQE")
* Code refactoring and code reuse due to the split_page() mechanism,
now the MPWQE and fragmented MPWQE handling almost look the same,
and share most of the code.
- In some cases we see 2%-3% packet rate degradation in comparison to the order-5 pages approach,
due to split_page() cpu consumption, but still we do see 3%-10% improvement in comparison to the
current linear SKB approach.
- We do believe that now the driver memory scheme is significantly less vulnerable
to the memory DOS attack Eric pointed at.
Changes from V1:
- Rebased to efde611b0afa ("Merge branch 'nfp-next'")
- Dropped: ("net/mlx5: Refactor mlx5_core_mr to mkey")
Already merged into 4.6 from rdma tree.
- Dropped: ("net/mlx5_core: Add ConnectX-5 to list of supported devices")
Will be pushed to net as we want it in 4.6 release.
- Dropped: ("net/mlx5e: Change RX moderation period to be based on CQE")
Will be pushed in a later series with full software based adaptive moderation.
- Added: ("net/mlx5e: Delay skb->data access")
Small trivial optimization.
- Updated: ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Changed Striding RQ defaults to:
> NUM WQEs = 16
> Strides Per WQE = 1024
> Stride Size = 128
- Updated: ("net/mlx5e: Use napi_alloc_skb for RX SKB allocations")
Consider the IP packet alignment already done in napi_alloc_skb.
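As a rough illustration of how the Striding RQ defaults listed above relate to the order-5 vs. order-0 discussion in the V2 notes, here is a small user-space sketch of the arithmetic. The 4K page size is taken from the V2 notes; the helper names and the "one RQ uses exactly NUM WQEs such WQEs" reading are assumptions made for this sketch, not driver code.

```c
#include <assert.h>

/* Defaults quoted in the V1 change log above. */
#define NUM_WQES          16u
#define STRIDES_PER_WQE   1024u
#define STRIDE_SIZE       128u   /* bytes */

/* Assumption for this sketch: 4K pages, as mentioned in the V2 notes. */
#define SKETCH_PAGE_SIZE  4096u

/* Bytes covered by one multi-packet WQE. */
static unsigned int wqe_bytes(void)
{
	return STRIDES_PER_WQE * STRIDE_SIZE;
}

/* Number of order-0 pages needed to back one WQE. */
static unsigned int wqe_pages(void)
{
	return wqe_bytes() / SKETCH_PAGE_SIZE;
}

/* Smallest page order whose size covers `pages` contiguous pages. */
static unsigned int order_for_pages(unsigned int pages)
{
	unsigned int order = 0;

	assert(pages > 0);
	while ((1u << order) < pages)
		order++;
	return order;
}

/* Total receive buffer bytes for one RQ, assuming it uses NUM_WQES WQEs. */
static unsigned int rq_bytes(void)
{
	return NUM_WQES * wqe_bytes();
}
```

With these defaults one WQE spans 128KB, which is 32 order-0 pages, or a single order-5 allocation if kept contiguous.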
Changes from V0:
- Fixed a typo in commit message reported by Sergei
- Align SKB fragments truesize to stride size
- Use skb_add_rx_frag and remove the use of SKB_TRUESIZE
- Fix: # MTTs alignment on Power PC
- Fix: Free original (unaligned) pointer of MTT array
- Use dev_alloc_pages and dev_alloc_page
- Extend the stats.buff_alloc_err counter
- Reform the copying of packet header into skb linear data
- Add compiler hints for conditional statements
- Prefetch skb->data prior to copying packet header into it
- Rework: mlx5e_complete_rx_fragmented_mpwqe
- Handle SKB fragments before linear data
- Dropped ("net/mlx5e: Prefetch next RX CQE") for now
- Added a small patch that adds ConnectX-5 devices to the list of supported devices
- Rebased to 1cdba5505555 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next")
This series includes some RX modifications and optimizations for
the mlx5 Ethernet driver.
From Rana, we have one patch that adds support for ConnectX-4
queue counters.
From Tariq, several patches centered on improving RX path message
rate and CPU and memory utilization. Each patch's commit message
lists the performance improvement numbers related to that specific
patch.
In the 2nd patch we use a queue counter to report the "out of buffer"
dropped packet count ("Dropped packets due to lack of software resources").
The 3rd patch changes the driver's default RSS table to spread only across
the cores of the close NUMA node, for a better out-of-the-box experience.
In the 4th and 5th patches we use RX multi-packet WQE (Striding RQ) for
better memory utilization, especially when hardware LRO is enabled, and
for better message rate with small packets.
In the 6th and 7th patches we added a fallback mechanism to use fragmented
memory when allocating large WQE strides fails, using UMR
(User Memory Registration) and ICO (Internal Control Operations) SQs.
In the 8th to 11th patches we made some small modifications which show some
small extra improvements.
Thanks,
Saeed
Rana Shahout (1):
net/mlx5e: Allocate set of queue counters per netdev
Saeed Mahameed (1):
net/mlx5e: Delay skb->data access
Tariq Toukan (9):
net/mlx5: Introduce device queue counters
net/mlx5e: Use only close NUMA node for default RSS
net/mlx5e: Use function pointers for RX data path handling
net/mlx5e: Support RX multi-packet WQE (Striding RQ)
net/mlx5e: Added ICO SQs
net/mlx5e: Add fragmented memory support for RX multi packet WQE
net/mlx5e: Use napi_alloc_skb for RX SKB allocations
net/mlx5e: Remove redundant barrier
net/mlx5e: Add ethtool counter for RX buffer allocation failures
drivers/net/ethernet/mellanox/mlx5/core/en.h | 202 +++++++-
.../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 28 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 361 +++++++++++--
drivers/net/ethernet/mellanox/mlx5/core/en_rx.c | 566 ++++++++++++++++++--
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 6 +-
drivers/net/ethernet/mellanox/mlx5/core/en_txrx.c | 59 ++-
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 68 +++
include/linux/mlx5/device.h | 39 ++-
include/linux/mlx5/qp.h | 6 +
9 files changed, 1202 insertions(+), 133 deletions(-)
* [PATCH net-next V3 02/11] net/mlx5e: Allocate set of queue counters per netdev
From: Saeed Mahameed @ 2016-04-20 19:02 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Tal Alon, Tariq Toukan, Eran Ben Elisha,
Eric Dumazet, Jesper Dangaard Brouer, Rana Shahout,
Saeed Mahameed
In-Reply-To: <1461178939-20687-1-git-send-email-saeedm@mellanox.com>
From: Rana Shahout <ranas@mellanox.com>
Connect all netdev RQs to this set of queue counters.
Also, add an "rx_out_of_buffer" counter to ethtool,
which indicates RX packet drops due to lack of receive
buffers.
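The way the new counter is exposed only when a counter set was actually allocated can be sketched in user space as follows. This is a simplified stand-in, not the driver code: the `sketch_` names and `SKETCH_GSTRING_LEN` are hypothetical, and the only behavior taken from the patch is that a counter set id of 0 is treated as "not allocated", so the string count and the value count both drop to zero together.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_GSTRING_LEN 32   /* hypothetical stand-in for ETH_GSTRING_LEN */
#define NUM_Q_COUNTERS 1

static const char qcounter_names[NUM_Q_COUNTERS][SKETCH_GSTRING_LEN] = {
	"rx_out_of_buffer",
};

struct sketch_priv {
	uint16_t q_counter;          /* 0 means "no counter set allocated" */
	uint32_t rx_out_of_buffer;   /* last value read from the device */
};

/* Same idea as MLX5E_NUM_Q_CNTRS(): expose counters only if allocated. */
static int num_q_cntrs(const struct sketch_priv *priv)
{
	return NUM_Q_COUNTERS * !!priv->q_counter;
}

/* Copy the counter names; returns the number of strings written. */
static int fill_strings(const struct sketch_priv *priv, char *data)
{
	int i, idx = 0;

	assert(data != NULL);
	for (i = 0; i < num_q_cntrs(priv); i++)
		strcpy(data + (idx++) * SKETCH_GSTRING_LEN, qcounter_names[i]);
	return idx;
}

/* Copy the counter values; returns the number of values written. */
static int fill_stats(const struct sketch_priv *priv, uint64_t *data)
{
	int i, idx = 0;

	assert(data != NULL);
	for (i = 0; i < num_q_cntrs(priv); i++)
		data[idx++] = priv->rx_out_of_buffer;   /* u32 widened to u64 */
	return idx;
}
```

Deriving both loops from the same count keeps the names and values reported to ethtool aligned whether or not the allocation succeeded.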
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 11 +++++
.../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 11 +++++
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 42 +++++++++++++++++++-
3 files changed, 62 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 879e627..c4ddbe8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -236,6 +236,15 @@ struct mlx5e_pport_stats {
__be64 RFC_2819_counters[NUM_RFC_2819_COUNTERS];
};
+static const char qcounter_stats_strings[][ETH_GSTRING_LEN] = {
+ "rx_out_of_buffer",
+};
+
+struct mlx5e_qcounter_stats {
+ u32 rx_out_of_buffer;
+#define NUM_Q_COUNTERS 1
+};
+
static const char rq_stats_strings[][ETH_GSTRING_LEN] = {
"packets",
"bytes",
@@ -293,6 +302,7 @@ struct mlx5e_sq_stats {
struct mlx5e_stats {
struct mlx5e_vport_stats vport;
struct mlx5e_pport_stats pport;
+ struct mlx5e_qcounter_stats qcnt;
};
struct mlx5e_params {
@@ -575,6 +585,7 @@ struct mlx5e_priv {
struct net_device *netdev;
struct mlx5e_stats stats;
struct mlx5e_tstamp tstamp;
+ u16 q_counter;
};
#define MLX5E_NET_IP_ALIGN 2
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 68834b7..39c1902 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -165,6 +165,8 @@ static const struct {
},
};
+#define MLX5E_NUM_Q_CNTRS(priv) (NUM_Q_COUNTERS * (!!priv->q_counter))
+
static int mlx5e_get_sset_count(struct net_device *dev, int sset)
{
struct mlx5e_priv *priv = netdev_priv(dev);
@@ -172,6 +174,7 @@ static int mlx5e_get_sset_count(struct net_device *dev, int sset)
switch (sset) {
case ETH_SS_STATS:
return NUM_VPORT_COUNTERS + NUM_PPORT_COUNTERS +
+ MLX5E_NUM_Q_CNTRS(priv) +
priv->params.num_channels * NUM_RQ_STATS +
priv->params.num_channels * priv->params.num_tc *
NUM_SQ_STATS;
@@ -200,6 +203,11 @@ static void mlx5e_get_strings(struct net_device *dev,
strcpy(data + (idx++) * ETH_GSTRING_LEN,
vport_strings[i]);
+ /* Q counters */
+ for (i = 0; i < MLX5E_NUM_Q_CNTRS(priv); i++)
+ strcpy(data + (idx++) * ETH_GSTRING_LEN,
+ qcounter_stats_strings[i]);
+
/* PPORT counters */
for (i = 0; i < NUM_PPORT_COUNTERS; i++)
strcpy(data + (idx++) * ETH_GSTRING_LEN,
@@ -240,6 +248,9 @@ static void mlx5e_get_ethtool_stats(struct net_device *dev,
for (i = 0; i < NUM_VPORT_COUNTERS; i++)
data[idx++] = ((u64 *)&priv->stats.vport)[i];
+ for (i = 0; i < MLX5E_NUM_Q_CNTRS(priv); i++)
+ data[idx++] = ((u32 *)&priv->stats.qcnt)[i];
+
for (i = 0; i < NUM_PPORT_COUNTERS; i++)
data[idx++] = be64_to_cpu(((__be64 *)&priv->stats.pport)[i]);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index e0adb60..7fbe1ba 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -129,6 +129,17 @@ free_out:
kvfree(out);
}
+static void mlx5e_update_q_counter(struct mlx5e_priv *priv)
+{
+ struct mlx5e_qcounter_stats *qcnt = &priv->stats.qcnt;
+
+ if (!priv->q_counter)
+ return;
+
+ mlx5_core_query_out_of_buffer(priv->mdev, priv->q_counter,
+ &qcnt->rx_out_of_buffer);
+}
+
void mlx5e_update_stats(struct mlx5e_priv *priv)
{
struct mlx5_core_dev *mdev = priv->mdev;
@@ -250,6 +261,8 @@ void mlx5e_update_stats(struct mlx5e_priv *priv)
s->rx_csum_sw;
mlx5e_update_pport_counters(priv);
+ mlx5e_update_q_counter(priv);
+
free_out:
kvfree(out);
}
@@ -1055,6 +1068,7 @@ static void mlx5e_build_rq_param(struct mlx5e_priv *priv,
MLX5_SET(wq, wq, log_wq_stride, ilog2(sizeof(struct mlx5e_rx_wqe)));
MLX5_SET(wq, wq, log_wq_sz, priv->params.log_rq_size);
MLX5_SET(wq, wq, pd, priv->pdn);
+ MLX5_SET(rqc, rqc, counter_set_id, priv->q_counter);
param->wq.buf_numa_node = dev_to_node(&priv->mdev->pdev->dev);
param->wq.linear = 1;
@@ -2442,6 +2456,26 @@ static int mlx5e_create_mkey(struct mlx5e_priv *priv, u32 pdn,
return err;
}
+static void mlx5e_create_q_counter(struct mlx5e_priv *priv)
+{
+ struct mlx5_core_dev *mdev = priv->mdev;
+ int err;
+
+ err = mlx5_core_alloc_q_counter(mdev, &priv->q_counter);
+ if (err) {
+ mlx5_core_warn(mdev, "alloc queue counter failed, %d\n", err);
+ priv->q_counter = 0;
+ }
+}
+
+static void mlx5e_destroy_q_counter(struct mlx5e_priv *priv)
+{
+ if (!priv->q_counter)
+ return;
+
+ mlx5_core_dealloc_q_counter(priv->mdev, priv->q_counter);
+}
+
static void *mlx5e_create_netdev(struct mlx5_core_dev *mdev)
{
struct net_device *netdev;
@@ -2527,13 +2561,15 @@ static void *mlx5e_create_netdev(struct mlx5_core_dev *mdev)
goto err_destroy_tirs;
}
+ mlx5e_create_q_counter(priv);
+
mlx5e_init_eth_addr(priv);
mlx5e_vxlan_init(priv);
err = mlx5e_tc_init(priv);
if (err)
- goto err_destroy_flow_tables;
+ goto err_dealloc_q_counters;
#ifdef CONFIG_MLX5_CORE_EN_DCB
mlx5e_dcbnl_ieee_setets_core(priv, &priv->params.ets);
@@ -2556,7 +2592,8 @@ static void *mlx5e_create_netdev(struct mlx5_core_dev *mdev)
err_tc_cleanup:
mlx5e_tc_cleanup(priv);
-err_destroy_flow_tables:
+err_dealloc_q_counters:
+ mlx5e_destroy_q_counter(priv);
mlx5e_destroy_flow_tables(priv);
err_destroy_tirs:
@@ -2605,6 +2642,7 @@ static void mlx5e_destroy_netdev(struct mlx5_core_dev *mdev, void *vpriv)
unregister_netdev(netdev);
mlx5e_tc_cleanup(priv);
mlx5e_vxlan_cleanup(priv);
+ mlx5e_destroy_q_counter(priv);
mlx5e_destroy_flow_tables(priv);
mlx5e_destroy_tirs(priv);
mlx5e_destroy_rqt(priv, MLX5E_SINGLE_RQ_RQT);
--
1.7.1
* [PATCH net-next V3 01/11] net/mlx5: Introduce device queue counters
From: Saeed Mahameed @ 2016-04-20 19:02 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Tal Alon, Tariq Toukan, Eran Ben Elisha,
Eric Dumazet, Jesper Dangaard Brouer, Rana Shahout,
Saeed Mahameed
In-Reply-To: <1461178939-20687-1-git-send-email-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
A queue counter can collect several statistics for one or more
hardware queues (QPs, RQs, etc ..) that the counter is attached to.
For Ethernet it will provide an "out of buffer" counter which
collects the number of all packets that are dropped due to lack
of software buffers.
Here we add device commands to alloc/query/dealloc queue counters.
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Rana Shahout <ranas@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/qp.c | 68 ++++++++++++++++++++++++++
include/linux/mlx5/qp.h | 6 ++
2 files changed, 74 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/qp.c b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
index def2893..b720a27 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -538,3 +538,71 @@ void mlx5_core_destroy_sq_tracked(struct mlx5_core_dev *dev,
mlx5_core_destroy_sq(dev, sq->qpn);
}
EXPORT_SYMBOL(mlx5_core_destroy_sq_tracked);
+
+int mlx5_core_alloc_q_counter(struct mlx5_core_dev *dev, u16 *counter_id)
+{
+ u32 in[MLX5_ST_SZ_DW(alloc_q_counter_in)];
+ u32 out[MLX5_ST_SZ_DW(alloc_q_counter_out)];
+ int err;
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(alloc_q_counter_in, in, opcode, MLX5_CMD_OP_ALLOC_Q_COUNTER);
+ err = mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, sizeof(out));
+ if (!err)
+ *counter_id = MLX5_GET(alloc_q_counter_out, out,
+ counter_set_id);
+ return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_core_alloc_q_counter);
+
+int mlx5_core_dealloc_q_counter(struct mlx5_core_dev *dev, u16 counter_id)
+{
+ u32 in[MLX5_ST_SZ_DW(dealloc_q_counter_in)];
+ u32 out[MLX5_ST_SZ_DW(dealloc_q_counter_out)];
+
+ memset(in, 0, sizeof(in));
+ memset(out, 0, sizeof(out));
+
+ MLX5_SET(dealloc_q_counter_in, in, opcode,
+ MLX5_CMD_OP_DEALLOC_Q_COUNTER);
+ MLX5_SET(dealloc_q_counter_in, in, counter_set_id, counter_id);
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out,
+ sizeof(out));
+}
+EXPORT_SYMBOL_GPL(mlx5_core_dealloc_q_counter);
+
+int mlx5_core_query_q_counter(struct mlx5_core_dev *dev, u16 counter_id,
+ int reset, void *out, int out_size)
+{
+ u32 in[MLX5_ST_SZ_DW(query_q_counter_in)];
+
+ memset(in, 0, sizeof(in));
+
+ MLX5_SET(query_q_counter_in, in, opcode, MLX5_CMD_OP_QUERY_Q_COUNTER);
+ MLX5_SET(query_q_counter_in, in, clear, reset);
+ MLX5_SET(query_q_counter_in, in, counter_set_id, counter_id);
+ return mlx5_cmd_exec_check_status(dev, in, sizeof(in), out, out_size);
+}
+EXPORT_SYMBOL_GPL(mlx5_core_query_q_counter);
+
+int mlx5_core_query_out_of_buffer(struct mlx5_core_dev *dev, u16 counter_id,
+ u32 *out_of_buffer)
+{
+ int outlen = MLX5_ST_SZ_BYTES(query_q_counter_out);
+ void *out;
+ int err;
+
+ out = mlx5_vzalloc(outlen);
+ if (!out)
+ return -ENOMEM;
+
+ err = mlx5_core_query_q_counter(dev, counter_id, 0, out, outlen);
+ if (!err)
+ *out_of_buffer = MLX5_GET(query_q_counter_out, out,
+ out_of_buffer);
+
+ kvfree(out);
+ return err;
+}
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index cf031a3..6422102 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -668,6 +668,12 @@ int mlx5_core_create_sq_tracked(struct mlx5_core_dev *dev, u32 *in, int inlen,
struct mlx5_core_qp *sq);
void mlx5_core_destroy_sq_tracked(struct mlx5_core_dev *dev,
struct mlx5_core_qp *sq);
+int mlx5_core_alloc_q_counter(struct mlx5_core_dev *dev, u16 *counter_id);
+int mlx5_core_dealloc_q_counter(struct mlx5_core_dev *dev, u16 counter_id);
+int mlx5_core_query_q_counter(struct mlx5_core_dev *dev, u16 counter_id,
+ int reset, void *out, int out_size);
+int mlx5_core_query_out_of_buffer(struct mlx5_core_dev *dev, u16 counter_id,
+ u32 *out_of_buffer);
static inline const char *mlx5_qp_type_str(int type)
{
--
1.7.1
* [PATCH net-next V3 03/11] net/mlx5e: Use only close NUMA node for default RSS
From: Saeed Mahameed @ 2016-04-20 19:02 UTC
To: David S. Miller
Cc: netdev, Or Gerlitz, Tal Alon, Tariq Toukan, Eran Ben Elisha,
Eric Dumazet, Jesper Dangaard Brouer, Saeed Mahameed
In-Reply-To: <1461178939-20687-1-git-send-email-saeedm@mellanox.com>
From: Tariq Toukan <tariqt@mellanox.com>
Distribute default RSS table uniformly over the rings of the
close NUMA node, instead of all available channels.
This way we enforce the preference of close rings over far ones.
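The distribution described above can be sketched as a stand-alone function. This is a simplified user-space version for illustration only: the core count of the close NUMA node is passed in as a plain parameter (an assumption of this sketch; the patch derives it from the device's NUMA node), and a count of 0 is treated as "unknown", in which case no clamping is applied.

```c
#include <assert.h>

/* Fill table[0..len) round-robin over the usable channels. */
static void build_default_indir_rqt(unsigned int *table, int len,
				    int num_channels, int node_num_of_cores)
{
	int i;

	assert(table != 0);
	assert(num_channels > 0);

	/* Clamp to the close node's core count, unless it is unknown (0). */
	if (node_num_of_cores > 0 && node_num_of_cores < num_channels)
		num_channels = node_num_of_cores;

	for (i = 0; i < len; i++)
		table[i] = (unsigned int)(i % num_channels);
}
```

For example, with 6 channels and 4 cores on the close node, the table cycles through rings 0..3 only, so rings 4 and 5 receive no traffic by default.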
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
---
drivers/net/ethernet/mellanox/mlx5/core/en.h | 3 ++-
.../net/ethernet/mellanox/mlx5/core/en_ethtool.c | 2 +-
drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 15 +++++++++++++--
3 files changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index c4ddbe8..7f19644 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -671,7 +671,8 @@ void mlx5e_build_tir_ctx_hash(void *tirc, struct mlx5e_priv *priv);
int mlx5e_open_locked(struct net_device *netdev);
int mlx5e_close_locked(struct net_device *netdev);
-void mlx5e_build_default_indir_rqt(u32 *indirection_rqt, int len,
+void mlx5e_build_default_indir_rqt(struct mlx5_core_dev *mdev,
+ u32 *indirection_rqt, int len,
int num_channels);
static inline void mlx5e_tx_notify_hw(struct mlx5e_sq *sq,
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 39c1902..6f40ba4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -397,7 +397,7 @@ static int mlx5e_set_channels(struct net_device *dev,
mlx5e_close_locked(dev);
priv->params.num_channels = count;
- mlx5e_build_default_indir_rqt(priv->params.indirection_rqt,
+ mlx5e_build_default_indir_rqt(priv->mdev, priv->params.indirection_rqt,
MLX5E_INDIR_RQT_SIZE, count);
if (was_opened)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 7fbe1ba..9b58ef6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2297,11 +2297,22 @@ static void mlx5e_ets_init(struct mlx5e_priv *priv)
}
#endif
-void mlx5e_build_default_indir_rqt(u32 *indirection_rqt, int len,
+void mlx5e_build_default_indir_rqt(struct mlx5_core_dev *mdev,
+ u32 *indirection_rqt, int len,
int num_channels)
{
+ int node = mdev->priv.numa_node;
+ int node_num_of_cores;
int i;
+ if (node == -1)
+ node = first_online_node;
+
+ node_num_of_cores = cpumask_weight(cpumask_of_node(node));
+
+ if (node_num_of_cores)
+ num_channels = min_t(int, num_channels, node_num_of_cores);
+
for (i = 0; i < len; i++)
indirection_rqt[i] = i % num_channels;
}
@@ -2333,7 +2344,7 @@ static void mlx5e_build_netdev_priv(struct mlx5_core_dev *mdev,
netdev_rss_key_fill(priv->params.toeplitz_hash_key,
sizeof(priv->params.toeplitz_hash_key));
- mlx5e_build_default_indir_rqt(priv->params.indirection_rqt,
+ mlx5e_build_default_indir_rqt(mdev, priv->params.indirection_rqt,
MLX5E_INDIR_RQT_SIZE, num_channels);
priv->params.lro_wqe_sz =
--
1.7.1
* Re: [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
From: Luis R. Rodriguez @ 2016-04-20 18:58 UTC
To: Chris Wilson
Cc: David Airlie, intel-gfx, linux-kernel, Ingo Molnar,
Peter Zijlstra (Intel), Luis R . Rodriguez, dri-devel, netdev,
linux-rdma, Daniel Vetter, Dan Williams, Yishai Hadas,
David Hildenbrand
In-Reply-To: <1461177750-20187-3-git-send-email-chris@chris-wilson.co.uk>
On Wed, Apr 20, 2016 at 07:42:13PM +0100, Chris Wilson wrote:
> The ioremap() hidden behind the io_mapping_map_wc() convenience helper
> can be used for remapping multiple pages. Extend the helper so that
> future callers can use it for larger ranges.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: David Airlie <airlied@linux.ie>
> Cc: Yishai Hadas <yishaih@mellanox.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Cc: Luis R. Rodriguez <mcgrof@kernel.org>
> Cc: intel-gfx@lists.freedesktop.org
> Cc: dri-devel@lists.freedesktop.org
> Cc: netdev@vger.kernel.org
> Cc: linux-rdma@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
We have 2 callers today. In the future, can you envision
this API getting more options? If so, in order to avoid the
pain of collateral evolutions I can suggest a descriptor
being passed with the required settings / options. This lets
you evolve the API without needing to go in and modify
old users. If you choose not to that's fine too, just
figured I'd chime in with that as I've seen the pain
with other APIs, and I'm putting an end to the needless
set of collateral evolutions this way.
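A minimal user-space sketch of the descriptor idea suggested above, for illustration only. Every name here (`struct map_req`, `struct sketch_mapping`, `sketch_map()`) is hypothetical and not part of the io-mapping API; the point is just that options left unset in the descriptor fall back to defaults, so existing callers need no changes when a new field is added.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical request descriptor; new options become new fields. */
struct map_req {
	unsigned long offset;
	unsigned long size;   /* 0 selects the legacy single-page default */
};

/* Hypothetical stand-in for the mapping object. */
struct sketch_mapping {
	unsigned long base;
	unsigned long size;
};

/*
 * Resolve a request against a mapping. Returns the length that would be
 * mapped and stores the start address in *phys_addr, or returns 0 if the
 * requested range does not fit inside the mapping.
 */
static unsigned long sketch_map(const struct sketch_mapping *m,
				const struct map_req *req,
				unsigned long *phys_addr)
{
	unsigned long size;

	assert(m != NULL && req != NULL && phys_addr != NULL);

	size = req->size ? req->size : SKETCH_PAGE_SIZE;
	if (req->offset >= m->size || size > m->size - req->offset)
		return 0;

	*phys_addr = m->base + req->offset;
	return size;
}
```

A caller written before the `size` field existed would leave it zero-initialized and keep getting a single page, which is the property the suggestion is after.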
Other than that possible API optimization:
Reviewed-by: Luis R. Rodriguez <mcgrof@kernel.org>
Luis
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH iproute2 WIP] ifstat: use new RTM_GETSTATS api
From: Stephen Hemminger @ 2016-04-20 18:53 UTC
To: Roopa Prabhu; +Cc: davem, netdev
In-Reply-To: <1461168975-28081-1-git-send-email-roopa@cumulusnetworks.com>
On Wed, 20 Apr 2016 09:16:15 -0700
Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
> +int rtnl_wilddump_stats_req_filter(struct rtnl_handle *rth, int family, int type,
> + __u32 filt_mask)
> +{
> + struct {
> + struct nlmsghdr nlh;
> + struct if_stats_msg ifsm;
> + } req;
Please use C99 initialization instead of memset in new code.
> + int err;
> +
> + memset(&req, 0, sizeof(req));
> + req.nlh.nlmsg_len = sizeof(req);
> + req.nlh.nlmsg_type = type;
> + req.nlh.nlmsg_flags = NLM_F_DUMP|NLM_F_REQUEST;
> + req.nlh.nlmsg_pid = 0;
> + req.nlh.nlmsg_seq = rth->dump = ++rth->seq;
> + req.ifsm.family = family;
> + req.ifsm.filter_mask = filt_mask;
> +
> + err = send(rth->fd, (void*)&req, sizeof(req), 0);
> +
> + return err;
Why not just:
return send(rth->fd, &req, sizeof(req), 0);
> +}
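The C99 initialization requested in the review above can be sketched as follows. To keep the example self-contained, the structs are simplified stand-ins for `struct nlmsghdr` and `struct if_stats_msg`, and the flag macros are local placeholders; treat the field layout and values as assumptions of this sketch rather than the real netlink definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the netlink structures used in the patch. */
struct sketch_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

struct sketch_if_stats_msg {
	uint8_t  family;
	uint8_t  pad1;
	uint16_t pad2;
	uint32_t ifindex;
	uint32_t filter_mask;
};

struct sketch_req {
	struct sketch_nlmsghdr nlh;
	struct sketch_if_stats_msg ifsm;
};

/* Placeholder flag values for this sketch. */
#define SKETCH_F_REQUEST 0x001
#define SKETCH_F_DUMP    0x300

/* Build the dump request with designated initializers instead of memset(). */
static struct sketch_req build_req(int family, int type,
				   uint32_t filt_mask, uint32_t seq)
{
	struct sketch_req req = {
		.nlh.nlmsg_len = sizeof(struct sketch_req),
		.nlh.nlmsg_type = (uint16_t)type,
		.nlh.nlmsg_flags = SKETCH_F_DUMP | SKETCH_F_REQUEST,
		.nlh.nlmsg_seq = seq,
		.ifsm.family = (uint8_t)family,
		.ifsm.filter_mask = filt_mask,
	};

	/* Members not named above are zero-initialized by the language. */
	assert(req.nlh.nlmsg_pid == 0);
	return req;
}
```

Members that are not named in the initializer, such as `nlmsg_pid` and `ifindex`, start out as zero, which is why the explicit `memset()` and the `nlmsg_pid = 0` assignment become unnecessary.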
* [PATCH 02/19] io-mapping: Specify mapping size for io_mapping_map_wc()
From: Chris Wilson @ 2016-04-20 18:42 UTC
To: intel-gfx
Cc: Tvrtko Ursulin, Tvrtko Ursulin, Mika Kuoppala, Yishai Hadas,
linux-kernel, Peter Zijlstra (Intel), Luis R . Rodriguez,
dri-devel, netdev, linux-rdma, Daniel Vetter, Dan Williams,
Ingo Molnar, David Hildenbrand
In-Reply-To: <1461177750-20187-1-git-send-email-chris@chris-wilson.co.uk>
The ioremap() hidden behind the io_mapping_map_wc() convenience helper
can be used for remapping multiple pages. Extend the helper so that
future callers can use it for larger ranges.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Yishai Hadas <yishaih@mellanox.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: intel-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: netdev@vger.kernel.org
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
drivers/gpu/drm/i915/intel_overlay.c | 3 ++-
drivers/net/ethernet/mellanox/mlx4/pd.c | 4 +++-
include/linux/io-mapping.h | 10 +++++++---
3 files changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 9746b9841c13..0d5a376878d3 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -198,7 +198,8 @@ intel_overlay_map_regs(struct intel_overlay *overlay)
regs = (struct overlay_registers __iomem *)overlay->reg_bo->phys_handle->vaddr;
else
regs = io_mapping_map_wc(ggtt->mappable,
- overlay->flip_addr);
+ overlay->flip_addr,
+ PAGE_SIZE);
return regs;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/pd.c b/drivers/net/ethernet/mellanox/mlx4/pd.c
index b3cc3ab63799..6fc156a3918d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/pd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/pd.c
@@ -205,7 +205,9 @@ int mlx4_bf_alloc(struct mlx4_dev *dev, struct mlx4_bf *bf, int node)
goto free_uar;
}
- uar->bf_map = io_mapping_map_wc(priv->bf_mapping, uar->index << PAGE_SHIFT);
+ uar->bf_map = io_mapping_map_wc(priv->bf_mapping,
+ uar->index << PAGE_SHIFT,
+ PAGE_SIZE);
if (!uar->bf_map) {
err = -ENOMEM;
goto unamp_uar;
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index e399029b68c5..645ad06b5d52 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -100,14 +100,16 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
}
static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+ unsigned long offset,
+ unsigned long size)
{
resource_size_t phys_addr;
BUG_ON(offset >= mapping->size);
phys_addr = mapping->base + offset;
- return ioremap_wc(phys_addr, PAGE_SIZE);
+ return ioremap_wc(phys_addr, size);
}
static inline void
@@ -155,7 +157,9 @@ io_mapping_unmap_atomic(void __iomem *vaddr)
/* Non-atomic map/unmap */
static inline void __iomem *
-io_mapping_map_wc(struct io_mapping *mapping, unsigned long offset)
+io_mapping_map_wc(struct io_mapping *mapping,
+ unsigned long offset,
+ unsigned long size)
{
return ((char __force __iomem *) mapping) + offset;
}
--
2.8.1
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Sinan Kaya @ 2016-04-20 18:42 UTC
To: Eran Ben Elisha
Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
timur-sgV2jX0FEOL9JmXXK+q4OQ, cov-sgV2jX0FEOL9JmXXK+q4OQ,
Yishai Hadas, Linux Netdev List,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <CAKHjkjmYNLABO10V1DZQmZ_zczjbfDZU0TDPHoMmv_1FMi9_gA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
On 4/20/2016 2:40 PM, Eran Ben Elisha wrote:
>>
>> It has been 1.5 years since I reported the problem. We came up with three
>> different solutions this week. I'd like to see a version of the solution
>> to get merged until Mellanox comes up with a better solution with another
>> patch. My proposal is to use this one.
>>
>
> We will post our suggestion here in the following days.
>
Thanks, please have me in CC. I'm not subscribed to this group normally.
I can post a tested-by after testing.
--
Sinan Kaya
Qualcomm Technologies, Inc. on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: [PATCH V2] net: ethernet: mellanox: correct page conversion
From: Eran Ben Elisha @ 2016-04-20 18:40 UTC
To: Sinan Kaya
Cc: Christoph Hellwig, linux-rdma-u79uwXL29TY76Z2rM5mHXA,
timur-sgV2jX0FEOL9JmXXK+q4OQ, cov-sgV2jX0FEOL9JmXXK+q4OQ,
Yishai Hadas, Linux Netdev List,
linux-kernel-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <571785A5.5040306-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
>
> It has been 1.5 years since I reported the problem. We came up with three
> different solutions this week. I'd like to see a version of the solution
> to get merged until Mellanox comes up with a better solution with another
> patch. My proposal is to use this one.
>
We will post our suggestion here in the following days.
* [PATCH net-next 2/2] net: bcmsysport: use napi_complete_done()
From: Florian Fainelli @ 2016-04-20 18:37 UTC
To: netdev; +Cc: edumazet, pgynther, Florian Fainelli
In-Reply-To: <1461177429-23553-1-git-send-email-f.fainelli@gmail.com>
By using napi_complete_done(), we allow fine tuning of
/sys/class/net/ethX/gro_flush_timeout for higher GRO aggregation
efficiency for a Gbit NIC.
Check commit 24d2e4a50737 ("tg3: use napi_complete_done()") for details.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 9e3ec739d860..30b0c2895a56 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -831,7 +831,7 @@ static int bcm_sysport_poll(struct napi_struct *napi, int budget)
rdma_writel(priv, priv->rx_c_index, RDMA_CONS_INDEX);
if (work_done < budget) {
- napi_complete(napi);
+ napi_complete_done(napi, work_done);
/* re-enable RX interrupts */
intrl2_0_mask_clear(priv, INTRL2_0_RDMA_MBDONE);
}
--
2.1.0
* [PATCH net-next 0/2] net: bcmsysport: utilize newer NAPI APIs
From: Florian Fainelli @ 2016-04-20 18:37 UTC
To: netdev; +Cc: edumazet, pgynther, Florian Fainelli
Hi David, Eric, Petri,
These two patches are very analogous to what was already submitted for
BCMGENET and switch the SYSTEMPORT driver to utilizing __napi_schedule_irqoff()
and napi_complete_done() for the RX NAPI context.
Florian Fainelli (2):
net: bcmsysport: use __napi_schedule_irqoff()
net: bcmsysport: use napi_complete_done()
drivers/net/ethernet/broadcom/bcmsysport.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--
2.1.0
* [PATCH net-next 1/2] net: bcmsysport: use __napi_schedule_irqoff()
From: Florian Fainelli @ 2016-04-20 18:37 UTC
To: netdev; +Cc: edumazet, pgynther, Florian Fainelli
In-Reply-To: <1461177429-23553-1-git-send-email-f.fainelli@gmail.com>
Both bcm_sysport_tx_isr() and bcm_sysport_rx_isr() run in hard IRQ
context, so we do not need to block IRQs again.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 993c780bdfab..9e3ec739d860 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -873,7 +873,7 @@ static irqreturn_t bcm_sysport_rx_isr(int irq, void *dev_id)
if (likely(napi_schedule_prep(&priv->napi))) {
/* disable RX interrupts */
intrl2_0_mask_set(priv, INTRL2_0_RDMA_MBDONE);
- __napi_schedule(&priv->napi);
+ __napi_schedule_irqoff(&priv->napi);
}
}
@@ -916,7 +916,7 @@ static irqreturn_t bcm_sysport_tx_isr(int irq, void *dev_id)
if (likely(napi_schedule_prep(&txr->napi))) {
intrl2_1_mask_set(priv, BIT(ring));
- __napi_schedule(&txr->napi);
+ __napi_schedule_irqoff(&txr->napi);
}
}
--
2.1.0
* Re: [PATCH net 4/4] net/mlx4_en: Split SW RX dropped counter per RX ring
From: Florian Fainelli @ 2016-04-20 18:05 UTC
To: Or Gerlitz, Eric Dumazet
Cc: David S. Miller, netdev, Eran Ben Elisha, Yishai Hadas,
Saeed Mahameed
In-Reply-To: <571799A3.402@mellanox.com>
On 20/04/16 08:00, Or Gerlitz wrote:
> On 4/20/2016 5:56 PM, Eric Dumazet wrote:
>>> >Fixes: a3333b35da16 ('net/mlx4_en: Moderate ethtool callback to
>>> [...] ')
>>> >Signed-off-by: Eran Ben Elisha<eranbe@mellanox.com>
>>> >Reported-by: Brenden Blanco<bblanco@plumgrid.com>
>>> >Signed-off-by: Saeed Mahameed<saeedm@mellanox.com>
>>> >Signed-off-by: Or Gerlitz<ogerlitz@mellanox.com>
>>> >---
>> Reported-by: Eric Dumazet<edumazet@google.com>
>>
>> (http://www.spinics.net/lists/netdev/msg371318.html )
>
> Hi Eric,
>
> Just to be sure, you'd like me to re-spin this and fix the reporter name?
There is no need for that, patchwork amends Reported-by (and a bunch of
other tags) automatically when somebody replies to the message, see the
resulting mbox for this patch:
http://patchwork.ozlabs.org/patch/612664/mbox/
--
Florian
^ permalink raw reply
* Re: [RFC PATCH 2/5] mlx5: Add support for UDP tunnel segmentation with outer checksum offload
From: Alexander Duyck @ 2016-04-20 18:06 UTC (permalink / raw)
To: Saeed Mahameed
Cc: Alexander Duyck, eugenia, Bruce W Allan, Saeed Mahameed,
Linux Netdev List, intel-wired-lan, Ariel Elior, Michael Chan,
Matthew Finlay
In-Reply-To: <CALzJLG-8enmxxvjbVbcygVgM0VhX7Eo=6B=2pZF5jQzN9L9YEg@mail.gmail.com>
On Wed, Apr 20, 2016 at 10:40 AM, Saeed Mahameed
<saeedm@dev.mellanox.co.il> wrote:
> On Tue, Apr 19, 2016 at 10:06 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
>> This patch assumes that the mlx5 hardware will ignore existing IPv4/v6
>> header fields for length and checksum as well as the length and checksum
>> fields for outer UDP headers.
>>
>> I have no means of testing this as I do not have any mlx5 hardware but
>> thought I would submit it as an RFC to see if anyone out there wants to
>> test this and see if this does in fact enable this functionality allowing
>> us to segment UDP tunneled frames that have an outer checksum.
>>
>> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 ++++++-
>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> index e0adb604f461..57d8da796d50 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> @@ -2390,13 +2390,18 @@ static void mlx5e_build_netdev(struct net_device *netdev)
>> netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
>>
>> if (mlx5e_vxlan_allowed(mdev)) {
>> - netdev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
>> + netdev->hw_features |= NETIF_F_GSO_UDP_TUNNEL |
>> + NETIF_F_GSO_UDP_TUNNEL_CSUM |
>> + NETIF_F_GSO_PARTIAL;
>> netdev->hw_enc_features |= NETIF_F_IP_CSUM;
>> netdev->hw_enc_features |= NETIF_F_RXCSUM;
>> netdev->hw_enc_features |= NETIF_F_TSO;
>> netdev->hw_enc_features |= NETIF_F_TSO6;
>> netdev->hw_enc_features |= NETIF_F_RXHASH;
>> netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
>> + netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM |
>> + NETIF_F_GSO_PARTIAL;
>> + netdev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM;
>> }
>>
>> netdev->features = netdev->hw_features;
>>
>
> Hi Alex,
>
> Adding Matt, VxLAN feature owner from Mellanox,
> Matt, please correct me if I am wrong, but we already tested GSO VxLAN
> and we saw the TCP/IP checksum offloads for both inner and outer
> headers handled by the hardware.
>
> And looking at mlx5e_sq_xmit:
>
> if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
> eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM;
> if (skb->encapsulation) {
> eseg->cs_flags |= MLX5_ETH_WQE_L3_INNER_CSUM |
> MLX5_ETH_WQE_L4_INNER_CSUM;
> sq->stats.csum_offload_inner++;
> } else {
> eseg->cs_flags |= MLX5_ETH_WQE_L4_CSUM;
> }
>
> We enable inner/outer hardware checksumming unconditionally, without
> looking at the features Alex is suggesting in this patch.
> Alex, can you elaborate more on the meaning of those features, and
> why it would work for us without declaring them?
Well, right now the feature list exposed by the device indicates that
TSO is not used if a VxLAN tunnel has a checksum in the outer header.
Since that feature is not currently exposed, segmentation for such
tunnels is done entirely in software via GSO.
What GSO partial does is allow us to treat GSO for tunnels with
checksum like GSO for tunnels without checksum, by precomputing
the UDP checksum as though the frame had already been segmented and
restricting us to an even multiple of MSS bytes to be segmented
across all the frames. One side effect, though, is that all of the IP
and UDP header fields are also precomputed, but from what I can tell
the values that would be changed by a change in length are ignored or
overwritten by the hardware and driver anyway.
- Alex
^ permalink raw reply
* Re: Davicom DM9162 PHY supported in the kernel?
From: Florian Fainelli @ 2016-04-20 18:03 UTC (permalink / raw)
To: Amr Bekhit; +Cc: netdev, andrew
In-Reply-To: <CAOLz05qgn2-9Qxm1cR7mXa67GAwD2SGbMnbeeEBrYeSPsBexuw@mail.gmail.com>
Hi,
On 20/04/16 08:21, Amr Bekhit wrote:
> Hello,
>
> I'm using an embedded Linux board based on an AT91SAM9X25 that uses the
> Davicom DM9162IEP PHY chip. I'm struggling to get packets out on the
> wire and I'm suspecting that I might have an issue between the AT91 MAC
> and the PHY chip. I've looked through the kernel config options and the
> kernel already has compiled-in support for the Davicom PHYs, however I
> noticed that according to the help text, only the dm9161e and dm9131
> chips are supported, which may indicate why my ethernet isn't working. I
> was wondering whether the DM9162 is backwards compatible with the
> existing driver? I'm currently using the mainline kernel 4.3. (p.s. I
> know the hardware works fine since I have no problem transferring files
> using tftp via u-boot).
Well, u-boot is a very simplistic networking stack, there could be tons
of issues that get under the radar because it cannot report them
properly, but let's assume it works so you have something to compare
against.
The DM9162 should be very similar to the DM9161, so the first thing
might be trying to add the PHY ID (32-bits OUI) to the matching table in
drivers/net/phy/davicom.c, and make it configure the PHY through
dm9161_config_init() since that looks at the PHY interface (MII, RMII
etc.) and does a bit of configuration here.
Right now, chances are that you are running with the Generic PHY driver
which has no clue about Davicom specific programming (if any). There
could also be board-level fixups required (adjusting trace lengths, if
you are using an RGMII interface for instance), etc.
--
Florian
^ permalink raw reply
* Re: [PATCH] MAINTAINERS: net: add entry for TI Ethernet Switch drivers
From: Tony Lindgren @ 2016-04-20 18:03 UTC (permalink / raw)
To: Grygorii Strashko
Cc: netdev, linux-kernel, Sekhar Nori, linux-omap, David S. Miller,
Mugunthan V N, Richard Cochran
In-Reply-To: <5717ABCC.50406@ti.com>
* Grygorii Strashko <grygorii.strashko@ti.com> [160420 09:19]:
> On 04/20/2016 05:23 PM, Tony Lindgren wrote:
> > * Grygorii Strashko <grygorii.strashko@ti.com> [160420 04:26]:
> >> Add record for TI Ethernet Switch Driver CPSW/CPDMA/MDIO HW
> >> (am33/am43/am57/dr7/davinci) to ensure that related patches
> >> will go through dedicated linux-omap list.
> >>
> >> Also add Mugunthan as maintainer and myself as the reviewer.
> >>
> >> Cc: "David S. Miller" <davem@davemloft.net>
> >> Cc: Mugunthan V N <mugunthanvnm@ti.com>
> >> Cc: Richard Cochran <richardcochran@gmail.com>
> >> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
> >> ---
> >> MAINTAINERS | 8 ++++++++
> >> 1 file changed, 8 insertions(+)
> >>
> >> diff --git a/MAINTAINERS b/MAINTAINERS
> >> index 1d5b4be..aca864d 100644
> >> --- a/MAINTAINERS
> >> +++ b/MAINTAINERS
> >> @@ -11071,6 +11071,14 @@ S: Maintained
> >> F: drivers/clk/ti/
> >> F: include/linux/clk/ti.h
> >>
> >> +TI ETHERNET SWITCH DRIVER (CPSW)
> >> +M: Mugunthan V N <mugunthanvnm@ti.com>
> >> +R: Grygorii Strashko <grygorii.strashko@ti.com>
> >> +L: linux-omap@vger.kernel.org
> >> +S: Maintained
> >> +F: drivers/net/ethernet/ti/cpsw*
> >> +F: drivers/net/ethernet/ti/davinci*
> >> +
> >> TI FLASH MEDIA INTERFACE DRIVER
> >> M: Alex Dubov <oakad@yahoo.com>
> >> S: Maintained
> >> --
> >
> > Please add netdev list also there as the primary list:
> >
> > L: netdev@vger.kernel.org
> > L: linux-omap@vger.kernel.org
> >
> > Then we can easily review and ack the patches for Dave to apply.
> >
>
> I can, but I want to clarify whether it is really necessary, because
> get_maintainer.pl automatically adds netdev@vger.kernel.org:
Well, it may not be obvious from reading the MAINTAINERS file though :)
Tony
^ permalink raw reply
* [PATCH RFC net-next] net: dsa: Provide CPU port statistics to master netdev
From: Florian Fainelli @ 2016-04-20 17:58 UTC (permalink / raw)
To: netdev; +Cc: davem, andrew, vivien.didelot, Florian Fainelli
This patch overloads the DSA master netdev, aka the CPU Ethernet MAC, to also
include switch-side statistics, which is useful for debugging purposes
when the switch is not properly connected to the Ethernet MAC (duplex
mismatch, (RG)MII electrical issues, etc.).
We accomplish this by retaining the original copy of the master netdev's
ethtool_ops and overloading just the three operations we care about:
get_sset_count, get_strings and get_ethtool_stats, so as to intercept
these calls and call into the original master_netdev ethtool_ops, plus
our own.
We take this approach as opposed to providing a set of DSA helper
functions that would retrieve the CPU port's statistics, because the
entire purpose of DSA is to allow unmodified Ethernet MAC drivers to be
used as CPU conduit interfaces; a statistics overlay in such
drivers would therefore simply not scale.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
include/net/dsa.h | 5 ++++
net/dsa/slave.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 74 insertions(+)
diff --git a/include/net/dsa.h b/include/net/dsa.h
index c4bc42bd3538..67f811f00339 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -111,6 +111,11 @@ struct dsa_switch_tree {
enum dsa_tag_protocol tag_protocol;
/*
+ * Original copy of the master netdev ethtool_ops
+ */
+ struct ethtool_ops master_ethtool_ops;
+
+ /*
* The switch and port to which the CPU is attached.
*/
s8 cpu_switch;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 2dae0d064359..41283c6f725a 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -666,6 +666,59 @@ static void dsa_slave_get_strings(struct net_device *dev,
}
}
+static void dsa_cpu_port_get_ethtool_stats(struct net_device *dev,
+ struct ethtool_stats *stats,
+ uint64_t *data)
+{
+ struct dsa_switch_tree *dst = dev->dsa_ptr;
+ struct dsa_switch *ds = dst->ds[0];
+ s8 cpu_port = dst->cpu_port;
+ int count = 0;
+
+ if (dst->master_ethtool_ops.get_sset_count) {
+ count = dst->master_ethtool_ops.get_sset_count(dev,
+ ETH_SS_STATS);
+ dst->master_ethtool_ops.get_ethtool_stats(dev, stats, data);
+ }
+
+ if (ds->drv->get_ethtool_stats)
+ ds->drv->get_ethtool_stats(ds, cpu_port, data + count);
+}
+
+static int dsa_cpu_port_get_sset_count(struct net_device *dev, int sset)
+{
+ struct dsa_switch_tree *dst = dev->dsa_ptr;
+ struct dsa_switch *ds = dst->ds[0];
+ int count = 0;
+
+ if (dst->master_ethtool_ops.get_sset_count)
+ count += dst->master_ethtool_ops.get_sset_count(dev, sset);
+
+ if (sset == ETH_SS_STATS && ds->drv->get_sset_count)
+ count += ds->drv->get_sset_count(ds);
+
+ return count;
+}
+
+static void dsa_cpu_port_get_strings(struct net_device *dev,
+ uint32_t stringset, uint8_t *data)
+{
+ struct dsa_switch_tree *dst = dev->dsa_ptr;
+ struct dsa_switch *ds = dst->ds[0];
+ s8 cpu_port = dst->cpu_port;
+ int len = ETH_GSTRING_LEN;
+ int count = 0;
+
+ if (dst->master_ethtool_ops.get_sset_count) {
+ count = dst->master_ethtool_ops.get_sset_count(dev,
+ ETH_SS_STATS);
+ dst->master_ethtool_ops.get_strings(dev, stringset, data);
+ }
+
+ if (stringset == ETH_SS_STATS && ds->drv->get_strings)
+ ds->drv->get_strings(ds, cpu_port, data + count * len);
+}
+
static void dsa_slave_get_ethtool_stats(struct net_device *dev,
struct ethtool_stats *stats,
uint64_t *data)
@@ -821,6 +874,8 @@ static const struct ethtool_ops dsa_slave_ethtool_ops = {
.get_eee = dsa_slave_get_eee,
};
+static struct ethtool_ops dsa_cpu_port_ethtool_ops;
+
static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_open = dsa_slave_open,
.ndo_stop = dsa_slave_close,
@@ -1038,6 +1093,7 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
int port, char *name)
{
struct net_device *master = ds->dst->master_netdev;
+ struct dsa_switch_tree *dst = ds->dst;
struct net_device *slave_dev;
struct dsa_slave_priv *p;
int ret;
@@ -1049,6 +1105,19 @@ int dsa_slave_create(struct dsa_switch *ds, struct device *parent,
slave_dev->features = master->vlan_features;
slave_dev->ethtool_ops = &dsa_slave_ethtool_ops;
+ if (master->ethtool_ops != &dsa_cpu_port_ethtool_ops) {
+ memcpy(&dst->master_ethtool_ops, master->ethtool_ops,
+ sizeof(struct ethtool_ops));
+ memcpy(&dsa_cpu_port_ethtool_ops, &dst->master_ethtool_ops,
+ sizeof(struct ethtool_ops));
+ dsa_cpu_port_ethtool_ops.get_sset_count =
+ dsa_cpu_port_get_sset_count;
+ dsa_cpu_port_ethtool_ops.get_ethtool_stats =
+ dsa_cpu_port_get_ethtool_stats;
+ dsa_cpu_port_ethtool_ops.get_strings =
+ dsa_cpu_port_get_strings;
+ master->ethtool_ops = &dsa_cpu_port_ethtool_ops;
+ }
eth_hw_addr_inherit(slave_dev, master);
slave_dev->priv_flags |= IFF_NO_QUEUE;
slave_dev->netdev_ops = &dsa_slave_netdev_ops;
--
2.1.0
^ permalink raw reply related
* Re: [PATCH net-next] net: dsa: remove tag_protocol from dsa_switch
From: Florian Fainelli @ 2016-04-20 17:59 UTC (permalink / raw)
To: Vivien Didelot, netdev; +Cc: linux-kernel, kernel, David S. Miller, Andrew Lunn
In-Reply-To: <1461018244-5371-1-git-send-email-vivien.didelot@savoirfairelinux.com>
On 18/04/16 15:24, Vivien Didelot wrote:
> Having the tag protocol in dsa_switch_driver for setup time and in
> dsa_switch_tree for runtime is enough. Remove dsa_switch's one.
>
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
--
Florian
^ permalink raw reply
* Re: [RFC PATCH 2/5] mlx5: Add support for UDP tunnel segmentation with outer checksum offload
From: Saeed Mahameed @ 2016-04-20 17:40 UTC (permalink / raw)
To: Alexander Duyck
Cc: eugenia, bruce.w.allan, Saeed Mahameed, Linux Netdev List,
intel-wired-lan, ariel.elior, mchan, Matthew Finlay
In-Reply-To: <20160419190603.11723.31623.stgit@ahduyck-xeon-server>
On Tue, Apr 19, 2016 at 10:06 PM, Alexander Duyck <aduyck@mirantis.com> wrote:
> This patch assumes that the mlx5 hardware will ignore existing IPv4/v6
> header fields for length and checksum as well as the length and checksum
> fields for outer UDP headers.
>
> I have no means of testing this as I do not have any mlx5 hardware but
> thought I would submit it as an RFC to see if anyone out there wants to
> test this and see if this does in fact enable this functionality allowing
> us to segment UDP tunneled frames that have an outer checksum.
>
> Signed-off-by: Alexander Duyck <aduyck@mirantis.com>
> ---
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> index e0adb604f461..57d8da796d50 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> @@ -2390,13 +2390,18 @@ static void mlx5e_build_netdev(struct net_device *netdev)
> netdev->hw_features |= NETIF_F_HW_VLAN_CTAG_FILTER;
>
> if (mlx5e_vxlan_allowed(mdev)) {
> - netdev->hw_features |= NETIF_F_GSO_UDP_TUNNEL;
> + netdev->hw_features |= NETIF_F_GSO_UDP_TUNNEL |
> + NETIF_F_GSO_UDP_TUNNEL_CSUM |
> + NETIF_F_GSO_PARTIAL;
> netdev->hw_enc_features |= NETIF_F_IP_CSUM;
> netdev->hw_enc_features |= NETIF_F_RXCSUM;
> netdev->hw_enc_features |= NETIF_F_TSO;
> netdev->hw_enc_features |= NETIF_F_TSO6;
> netdev->hw_enc_features |= NETIF_F_RXHASH;
> netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL;
> + netdev->hw_enc_features |= NETIF_F_GSO_UDP_TUNNEL_CSUM |
> + NETIF_F_GSO_PARTIAL;
> + netdev->gso_partial_features = NETIF_F_GSO_UDP_TUNNEL_CSUM;
> }
>
> netdev->features = netdev->hw_features;
>
Hi Alex,
Adding Matt, VxLAN feature owner from Mellanox,
Matt, please correct me if I am wrong, but we already tested GSO VxLAN
and we saw the TCP/IP checksum offloads for both inner and outer
headers handled by the hardware.
And looking at mlx5e_sq_xmit:
if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) {
eseg->cs_flags = MLX5_ETH_WQE_L3_CSUM;
if (skb->encapsulation) {
eseg->cs_flags |= MLX5_ETH_WQE_L3_INNER_CSUM |
MLX5_ETH_WQE_L4_INNER_CSUM;
sq->stats.csum_offload_inner++;
} else {
eseg->cs_flags |= MLX5_ETH_WQE_L4_CSUM;
}
We enable inner/outer hardware checksumming unconditionally, without
looking at the features Alex is suggesting in this patch.
Alex, can you elaborate more on the meaning of those features, and
why it would work for us without declaring them?
^ permalink raw reply
* Re: skb_at_tc_ingress helper breaks compilation of oot modules
From: Alexei Starovoitov @ 2016-04-20 17:33 UTC (permalink / raw)
To: Daniel Borkmann; +Cc: Ingo Saitz, netdev
In-Reply-To: <57175C13.8080109@iogearbox.net>
On Wed, Apr 20, 2016 at 12:38:11PM +0200, Daniel Borkmann wrote:
> On 04/20/2016 12:21 PM, Ingo Saitz wrote:
> >In Linux 4.5, when CONFIG_NET_CLS_ACT is defined, compilation of out of
> >tree modules breaks with undeclared functions/constants. The culprit is:
> >
> >commit fdc5432a7b44ab7de17141beec19d946b9344e91
> >Author: Daniel Borkmann <daniel@iogearbox.net>
> >Date: Thu Jan 7 15:50:22 2016 +0100
> >
> > net, sched: add skb_at_tc_ingress helper
> >
> >which uses G_TC_AT and AT_INGRESS but only includes linux/pkt_cls.h,
> >which does not include these #defines for oot builds. Unfortunately I'm
> >not sure what the correct fix is, maybe the uapi folks could help, but i
> >attached a simple testcase and build log (Makefile is straight from
> >kernelnewbies).
>
> Hmm, your fail.c test case only contains '#include <net/ipv6.h>'?
>
> Note, upstream kernel never cared about out-of-tree modules, only
> in-tree code. ;) Did you run into an issue with any in-tree code?
I'm glad it broke an out-of-tree module. We should do it more often.
LLVM constantly reshuffles its internal APIs to incentivize upstreaming
and working with the community.
^ permalink raw reply
* Re: [PATCH net-next V2 05/11] net/mlx5e: Support RX multi-packet WQE (Striding RQ)
From: Saeed Mahameed @ 2016-04-20 16:46 UTC (permalink / raw)
To: Mel Gorman
Cc: Jesper Dangaard Brouer, Eric Dumazet, Saeed Mahameed,
David S. Miller, Linux Netdev List, Or Gerlitz, Tal Alon,
Tariq Toukan, Eran Ben Elisha, Achiad Shochat, linux-mm
In-Reply-To: <20160419173833.GB15167@techsingularity.net>
On Tue, Apr 19, 2016 at 8:39 PM, Mel Gorman <mgorman@techsingularity.net> wrote:
> On Tue, Apr 19, 2016 at 06:25:32PM +0200, Jesper Dangaard Brouer wrote:
>> On Mon, 18 Apr 2016 07:17:13 -0700
>> Eric Dumazet <eric.dumazet@gmail.com> wrote:
>>
>
> alloc_pages_exact()
>
We want to allocate 32 physically contiguous order-0 pages and to free
each one of them individually.
The documentation states: "Memory allocated by this function must be
released by free_pages_exact()".
Also, it returns a pointer to the memory and we need pointers to pages.
>> > > allocates many physically contiguous pages with order0 ! so we assume
>> > > it is ok to use split_page.
>> >
>> > Note: I have no idea of split_page() performance :
>>
>> Maybe Mel knows?
>
> Irrelevant in comparison to the cost of allocating an order-5 page if
> one is not already available.
>
We still allocate order-5 pages, but now we split them into 32 order-0 pages.
The split adds a few extra CPU cycles, but it is lockless and
straightforward, and it does the job in terms of better memory
utilization.
Now, in scenarios where small packets can hold a ref on pages for too
long, they would hold a ref on order-0 pages rather than order-5.
^ permalink raw reply
* Re: [PATCH] MAINTAINERS: net: add entry for TI Ethernet Switch drivers
From: Grygorii Strashko @ 2016-04-20 16:18 UTC (permalink / raw)
To: Tony Lindgren
Cc: netdev, linux-kernel, Sekhar Nori, linux-omap, David S. Miller,
Mugunthan V N, Richard Cochran
In-Reply-To: <20160420142350.GD5995@atomide.com>
On 04/20/2016 05:23 PM, Tony Lindgren wrote:
> * Grygorii Strashko <grygorii.strashko@ti.com> [160420 04:26]:
>> Add record for TI Ethernet Switch Driver CPSW/CPDMA/MDIO HW
>> (am33/am43/am57/dr7/davinci) to ensure that related patches
>> will go through dedicated linux-omap list.
>>
>> Also add Mugunthan as maintainer and myself as the reviewer.
>>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Mugunthan V N <mugunthanvnm@ti.com>
>> Cc: Richard Cochran <richardcochran@gmail.com>
>> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
>> ---
>> MAINTAINERS | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 1d5b4be..aca864d 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -11071,6 +11071,14 @@ S: Maintained
>> F: drivers/clk/ti/
>> F: include/linux/clk/ti.h
>>
>> +TI ETHERNET SWITCH DRIVER (CPSW)
>> +M: Mugunthan V N <mugunthanvnm@ti.com>
>> +R: Grygorii Strashko <grygorii.strashko@ti.com>
>> +L: linux-omap@vger.kernel.org
>> +S: Maintained
>> +F: drivers/net/ethernet/ti/cpsw*
>> +F: drivers/net/ethernet/ti/davinci*
>> +
>> TI FLASH MEDIA INTERFACE DRIVER
>> M: Alex Dubov <oakad@yahoo.com>
>> S: Maintained
>> --
>
> Please add netdev list also there as the primary list:
>
> L: netdev@vger.kernel.org
> L: linux-omap@vger.kernel.org
>
> Then we can easily review and ack the patches for Dave to apply.
>
I can, but I want to clarify whether it is really necessary, because
get_maintainer.pl automatically adds netdev@vger.kernel.org:
./scripts/get_maintainer.pl ~/.../0001-drivers-net-cpsw-fix-port_mask-parameters-in-ale-cal.patch
Mugunthan V N <mugunthanvnm@ti.com> (maintainer:TI ETHERNET SWITCH DRIVER (CPSW))
Grygorii Strashko <grygorii.strashko@ti.com> (reviewer:TI ETHERNET SWITCH DRIVER (CPSW))
linux-omap@vger.kernel.org (open list:TI ETHERNET SWITCH DRIVER (CPSW))
netdev@vger.kernel.org (open list:NETWORKING DRIVERS)
linux-kernel@vger.kernel.org (open list)
--
regards,
-grygorii
^ permalink raw reply
* [PATCH iproute2 WIP] ifstat: use new RTM_GETSTATS api
From: Roopa Prabhu @ 2016-04-20 16:16 UTC (permalink / raw)
To: davem; +Cc: netdev
From: Roopa Prabhu <roopa@cumulusnetworks.com>
Sample hacked-up patch currently used for testing.
It needs rework if ifstat is to move to RTM_GETSTATS.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
include/libnetlink.h | 6 ++++++
include/linux/if_link.h | 22 ++++++++++++++++++++++
include/linux/rtnetlink.h | 5 +++++
lib/libnetlink.c | 31 +++++++++++++++++++++++++++++++
misc/ifstat.c | 37 ++++++++++++++++++++-----------------
5 files changed, 84 insertions(+), 17 deletions(-)
diff --git a/include/libnetlink.h b/include/libnetlink.h
index 491263f..ccaab46 100644
--- a/include/libnetlink.h
+++ b/include/libnetlink.h
@@ -44,6 +44,12 @@ int rtnl_dump_request(struct rtnl_handle *rth, int type, void *req,
int rtnl_dump_request_n(struct rtnl_handle *rth, struct nlmsghdr *n)
__attribute__((warn_unused_result));
+int rtnl_wilddump_stats_request(struct rtnl_handle *rth, int family, int type)
+ __attribute__((warn_unused_result));
+int rtnl_wilddump_stats_req_filter(struct rtnl_handle *rth, int family,
+ int type, __u32 filt_mask)
+ __attribute__((warn_unused_result));
+
struct rtnl_ctrl_data {
int nsid;
};
diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index 6a688e8..eb1064a 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -165,6 +165,8 @@ enum {
#define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
#define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg))
+#define IFLA_RTA_STATS(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct if_stats_msg))))
+
enum {
IFLA_INET_UNSPEC,
IFLA_INET_CONF,
@@ -777,4 +779,24 @@ enum {
#define IFLA_HSR_MAX (__IFLA_HSR_MAX - 1)
+/* STATS section */
+
+struct if_stats_msg {
+ __u8 family;
+ __u8 pad1;
+ __u16 pad2;
+ __u32 ifindex;
+ __u32 filter_mask;
+};
+
+enum {
+ IFLA_STATS_UNSPEC,
+ IFLA_STATS_LINK_64,
+ __IFLA_STATS_MAX,
+};
+
+#define IFLA_STATS_MAX (__IFLA_STATS_MAX - 1)
+
+#define IFLA_STATS_FILTER_BIT(ATTR) (1 << (ATTR - 1))
+
#endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 6aaa2a3..e8cdff5 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -139,6 +139,11 @@ enum {
RTM_GETNSID = 90,
#define RTM_GETNSID RTM_GETNSID
+ RTM_NEWSTATS = 92,
+#define RTM_NEWSTATS RTM_NEWSTATS
+ RTM_GETSTATS = 94,
+#define RTM_GETSTATS RTM_GETSTATS
+
__RTM_MAX,
#define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1)
};
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index a90e52c..f7baf51 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -838,3 +838,34 @@ int __parse_rtattr_nested_compat(struct rtattr *tb[], int max, struct rtattr *rt
memset(tb, 0, sizeof(struct rtattr *) * (max + 1));
return 0;
}
+
+int rtnl_wilddump_stats_req_filter(struct rtnl_handle *rth, int family, int type,
+ __u32 filt_mask)
+{
+ struct {
+ struct nlmsghdr nlh;
+ struct if_stats_msg ifsm;
+ } req;
+
+ int err;
+
+ memset(&req, 0, sizeof(req));
+ req.nlh.nlmsg_len = sizeof(req);
+ req.nlh.nlmsg_type = type;
+ req.nlh.nlmsg_flags = NLM_F_DUMP|NLM_F_REQUEST;
+ req.nlh.nlmsg_pid = 0;
+ req.nlh.nlmsg_seq = rth->dump = ++rth->seq;
+ req.ifsm.family = family;
+ req.ifsm.filter_mask = filt_mask;
+
+ err = send(rth->fd, (void*)&req, sizeof(req), 0);
+
+ return err;
+}
+
+int rtnl_wilddump_stats_request(struct rtnl_handle *rth, int family, int type)
+{
+ return rtnl_wilddump_stats_req_filter(rth, family, type,
+ IFLA_STATS_FILTER_BIT(IFLA_STATS_LINK_64));
+}
+
diff --git a/misc/ifstat.c b/misc/ifstat.c
index abbb4e7..e517c9a 100644
--- a/misc/ifstat.c
+++ b/misc/ifstat.c
@@ -35,6 +35,8 @@
#include <SNAPSHOT.h>
+#include "utils.h"
+
int dump_zeros;
int reset_history;
int ignore_history;
@@ -49,6 +51,8 @@ double W;
char **patterns;
int npatterns;
+struct rtnl_handle rth;
+
char info_source[128];
int source_mismatch;
@@ -58,9 +62,9 @@ struct ifstat_ent {
struct ifstat_ent *next;
char *name;
int ifindex;
- unsigned long long val[MAXS];
+ __u64 val[MAXS];
double rate[MAXS];
- __u32 ival[MAXS];
+ __u64 ival[MAXS];
};
static const char *stats[MAXS] = {
@@ -109,32 +113,29 @@ static int match(const char *id)
static int get_nlmsg(const struct sockaddr_nl *who,
struct nlmsghdr *m, void *arg)
{
- struct ifinfomsg *ifi = NLMSG_DATA(m);
- struct rtattr *tb[IFLA_MAX+1];
+ struct if_stats_msg *ifsm = NLMSG_DATA(m);
+ struct rtattr * tb[IFLA_STATS_MAX+1];
int len = m->nlmsg_len;
struct ifstat_ent *n;
int i;
- if (m->nlmsg_type != RTM_NEWLINK)
+ if (m->nlmsg_type != RTM_NEWSTATS)
return 0;
- len -= NLMSG_LENGTH(sizeof(*ifi));
+ len -= NLMSG_LENGTH(sizeof(*ifsm));
if (len < 0)
return -1;
- if (!(ifi->ifi_flags&IFF_UP))
- return 0;
-
- parse_rtattr(tb, IFLA_MAX, IFLA_RTA(ifi), len);
- if (tb[IFLA_IFNAME] == NULL || tb[IFLA_STATS] == NULL)
+ parse_rtattr(tb, IFLA_STATS_MAX, IFLA_RTA_STATS(ifsm), len);
+ if (tb[IFLA_STATS_LINK_64] == NULL)
return 0;
n = malloc(sizeof(*n));
if (!n)
abort();
- n->ifindex = ifi->ifi_index;
- n->name = strdup(RTA_DATA(tb[IFLA_IFNAME]));
- memcpy(&n->ival, RTA_DATA(tb[IFLA_STATS]), sizeof(n->ival));
+ n->ifindex = ifsm->ifindex;
+ n->name = strdup(ll_index_to_name(ifsm->ifindex));
+ memcpy(&n->ival, RTA_DATA(tb[IFLA_STATS_LINK_64]), sizeof(n->ival));
memset(&n->rate, 0, sizeof(n->rate));
for (i = 0; i < MAXS; i++)
n->val[i] = n->ival[i];
@@ -151,9 +152,11 @@ static void load_info(void)
if (rtnl_open(&rth, 0) < 0)
exit(1);
- if (rtnl_wilddump_request(&rth, AF_INET, RTM_GETLINK) < 0) {
+ ll_init_map(&rth);
+
+ if (rtnl_wilddump_stats_request(&rth, AF_UNSPEC, RTM_GETSTATS) < 0) {
perror("Cannot send dump request");
- exit(1);
+ exit(1);
}
if (rtnl_dump_filter(&rth, get_nlmsg, NULL) < 0) {
@@ -216,7 +219,7 @@ static void load_raw_table(FILE *fp)
*next++ = 0;
if (sscanf(p, "%llu", n->val+i) != 1)
abort();
- n->ival[i] = (__u32)n->val[i];
+ n->ival[i] = (__u64)n->val[i];
p = next;
if (!(next = strchr(p, ' ')))
abort();
--
1.9.1
^ permalink raw reply related
* Re: [PATCH net-next v6] rtnetlink: add new RTM_GETSTATS message to dump link stats
From: David Miller @ 2016-04-20 16:07 UTC (permalink / raw)
To: roopa; +Cc: netdev, jhs, tgraf, nicolas.dichtel, nikolay
In-Reply-To: <1461167023-7640-1-git-send-email-roopa@cumulusnetworks.com>
From: Roopa Prabhu <roopa@cumulusnetworks.com>
Date: Wed, 20 Apr 2016 08:43:43 -0700
> This patch has been tested with modified iproute2 ifstat.
Can you please send me the patch you are using? I want to do some quick testing
on sparc64 before I push this out.
Thanks.
^ permalink raw reply
* Re: [PATCH v2 1/1] Revert "Prevent NUll pointer dereference with two PHYs on cpsw"
From: David Miller @ 2016-04-20 16:02 UTC (permalink / raw)
To: andrew.goodbody
Cc: netdev, linux-kernel, linux-omap, mugunthanvnm, grygorii.strashko,
tony
In-Reply-To: <1461165291-25043-2-git-send-email-andrew.goodbody@cambrionix.com>
From: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Date: Wed, 20 Apr 2016 16:14:51 +0100
> This reverts commit cfe255600154f0072d4a8695590dbd194dfd1aeb
>
> This can result in a "Unable to handle kernel paging request"
> during boot. This was due to using an uninitialised struct member,
> data->slaves.
>
> Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
> Tested-by: Tony Lindgren <tony@atomide.com>
> ---
>
> v2 No code change, added signoff and collected tested-by
Applied, thanks.
^ permalink raw reply