[PATCH v2 net-next 0/2] mvneta xmit

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 net-next 0/2] mvneta xmit_more and bql support
@ 2017-01-16 17:08 Marcin Wojtas
  2017-01-16 17:08 ` [PATCH v2 net-next 1/2] net: mvneta: add xmit_more support Marcin Wojtas
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Marcin Wojtas @ 2017-01-16 17:08 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

This is a delayed v2 of short patchset, which introduces xmit_more and BQL
to mvneta driver. The only one change was added in xmit_more support -
condition check preventing excessive descriptors concatenation before
flushing in HW.

Any comments or feedback would be welcome.

Best regards,
Marcin

Changelog:
v1 -> v2:

* Add checking condition that ensures too much descriptors are not
  concatenated before flushing in HW.

Marcin Wojtas (1):
  net: mvneta: add BQL support

Simon Guinot (1):
  net: mvneta: add xmit_more support

 drivers/net/ethernet/marvell/mvneta.c | 35 +++++++++++++++++++++++++++++------
 1 file changed, 29 insertions(+), 6 deletions(-)

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v2 net-next 1/2] net: mvneta: add xmit_more support
  2017-01-16 17:08 [PATCH v2 net-next 0/2] mvneta xmit_more and bql support Marcin Wojtas
@ 2017-01-16 17:08 ` Marcin Wojtas
  2017-01-16 17:08 ` [PATCH v2 net-next 2/2] net: mvneta: add BQL support Marcin Wojtas
  2017-01-17  1:13 ` [PATCH v2 net-next 0/2] mvneta xmit_more and bql support David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Marcin Wojtas @ 2017-01-16 17:08 UTC (permalink / raw)
  To: linux-arm-kernel

From: Simon Guinot <simon.guinot@sequanux.org>

Basing on xmit_more flag of the skb, TX descriptors can be concatenated
before flushing. This commit delay Tx descriptor flush if the queue is
running and if there is more skb's to send.

A maximum allowed number of descriptors for flushing at once due to
MVNETA_TXQ_UPDATE_REG(q) reqisters limitation, is 255. Because of that
a new macro was added (MVNETA_TXQ_DEC_SENT_MASK) in order to ensure that
concatenated amount of descriptor does not exceed that value.

Signed-off-by: Simon Guinot <simon.guinot@sequanux.org>
Signed-off-by: Marcin Wojtas <mw@semihalf.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 3607d8f..9624537 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -224,6 +224,7 @@
 #define      MVNETA_TXQ_SENT_THRESH_MASK(coal)   ((coal) << 16)
 #define MVNETA_TXQ_UPDATE_REG(q)                 (0x3c60 + ((q) << 2))
 #define      MVNETA_TXQ_DEC_SENT_SHIFT           16
+#define      MVNETA_TXQ_DEC_SENT_MASK            0xff
 #define MVNETA_TXQ_STATUS_REG(q)                 (0x3c40 + ((q) << 2))
 #define      MVNETA_TXQ_SENT_DESC_SHIFT          16
 #define      MVNETA_TXQ_SENT_DESC_MASK           0x3fff0000
@@ -525,6 +526,7 @@ struct mvneta_tx_queue {
 	 * descriptor ring
 	 */
 	int count;
+	int pending;
 	int tx_stop_threshold;
 	int tx_wake_threshold;
 
@@ -818,8 +820,9 @@ static void mvneta_txq_pend_desc_add(struct mvneta_port *pp,
 	/* Only 255 descriptors can be added@once ; Assume caller
 	 * process TX desriptors in quanta less than 256
 	 */
-	val = pend_desc;
+	val = pend_desc + txq->pending;
 	mvreg_write(pp, MVNETA_TXQ_UPDATE_REG(txq->id), val);
+	txq->pending = 0;
 }
 
 /* Get pointer to next TX descriptor to be processed (send) by HW */
@@ -2399,11 +2402,15 @@ static int mvneta_tx(struct sk_buff *skb, struct net_device *dev)
 		struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
 
 		txq->count += frags;
-		mvneta_txq_pend_desc_add(pp, txq, frags);
-
 		if (txq->count >= txq->tx_stop_threshold)
 			netif_tx_stop_queue(nq);
 
+		if (!skb->xmit_more || netif_xmit_stopped(nq) ||
+		    txq->pending + frags > MVNETA_TXQ_DEC_SENT_MASK)
+			mvneta_txq_pend_desc_add(pp, txq, frags);
+		else
+			txq->pending += frags;
+
 		u64_stats_update_begin(&stats->syncp);
 		stats->tx_packets++;
 		stats->tx_bytes  += len;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 net-next 2/2] net: mvneta: add BQL support
  2017-01-16 17:08 [PATCH v2 net-next 0/2] mvneta xmit_more and bql support Marcin Wojtas
  2017-01-16 17:08 ` [PATCH v2 net-next 1/2] net: mvneta: add xmit_more support Marcin Wojtas
@ 2017-01-16 17:08 ` Marcin Wojtas
  2017-01-17  1:13 ` [PATCH v2 net-next 0/2] mvneta xmit_more and bql support David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Marcin Wojtas @ 2017-01-16 17:08 UTC (permalink / raw)
  To: linux-arm-kernel

Tests showed that when whole bandwidth is consumed, the latency for
various kind of traffic can reach high values. With saturated
link (e.g. with iperf from target to host) simple ping could take
significant amount of time. BQL proved to improve this situation
when implemented in mvneta driver. Measurements of ping latency
for 3 link speeds:
Speed | Latency w/o BQL | Latency with BQL
10    |      7-14 ms    |     3.5 ms
100   |      2-12 ms    |     0.6 ms
1000  |   often timeout |   up to 2ms

Decreasing latency as above result in sligt performance cost - 4kpps
(-1.4%) when pushing 64B packets via two bridged interfaces of Armada 38x.
For 1500B packets in the same setup, the mpstat tool showed +8% of
CPU occupation (default affinity, second CPU idle). Even though this
cost seems reasonable to take, considering other improvements.

This commit adds byte queue limit mechanism for the mvneta driver.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 9624537..6dcc951 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1759,8 +1759,10 @@ static struct mvneta_tx_queue *mvneta_tx_done_policy(struct mvneta_port *pp,
 
 /* Free tx queue skbuffs */
 static void mvneta_txq_bufs_free(struct mvneta_port *pp,
-				 struct mvneta_tx_queue *txq, int num)
+				 struct mvneta_tx_queue *txq, int num,
+				 struct netdev_queue *nq)
 {
+	unsigned int bytes_compl = 0, pkts_compl = 0;
 	int i;
 
 	for (i = 0; i < num; i++) {
@@ -1768,6 +1770,11 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			txq->txq_get_index;
 		struct sk_buff *skb = txq->tx_skb[txq->txq_get_index];
 
+		if (skb) {
+			bytes_compl += skb->len;
+			pkts_compl++;
+		}
+
 		mvneta_txq_inc_get(txq);
 
 		if (!IS_TSO_HEADER(txq, tx_desc->buf_phys_addr))
@@ -1778,6 +1785,8 @@ static void mvneta_txq_bufs_free(struct mvneta_port *pp,
 			continue;
 		dev_kfree_skb_any(skb);
 	}
+
+	netdev_tx_completed_queue(nq, pkts_compl, bytes_compl);
 }
 
 /* Handle end of transmission */
@@ -1791,7 +1800,7 @@ static void mvneta_txq_done(struct mvneta_port *pp,
 	if (!tx_done)
 		return;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	txq->count -= tx_done;
 
@@ -2401,6 +2410,8 @@ static int mvneta_tx(struct sk_buff *skb, struct net_device *dev)
 		struct mvneta_pcpu_stats *stats = this_cpu_ptr(pp->stats);
 		struct netdev_queue *nq = netdev_get_tx_queue(dev, txq_id);
 
+		netdev_tx_sent_queue(nq, len);
+
 		txq->count += frags;
 		if (txq->count >= txq->tx_stop_threshold)
 			netif_tx_stop_queue(nq);
@@ -2429,9 +2440,10 @@ static void mvneta_txq_done_force(struct mvneta_port *pp,
 				  struct mvneta_tx_queue *txq)
 
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
 	int tx_done = txq->count;
 
-	mvneta_txq_bufs_free(pp, txq, tx_done);
+	mvneta_txq_bufs_free(pp, txq, tx_done, nq);
 
 	/* reset txq */
 	txq->count = 0;
@@ -2957,6 +2969,8 @@ static int mvneta_txq_init(struct mvneta_port *pp,
 static void mvneta_txq_deinit(struct mvneta_port *pp,
 			      struct mvneta_tx_queue *txq)
 {
+	struct netdev_queue *nq = netdev_get_tx_queue(pp->dev, txq->id);
+
 	kfree(txq->tx_skb);
 
 	if (txq->tso_hdrs)
@@ -2968,6 +2982,8 @@ static void mvneta_txq_deinit(struct mvneta_port *pp,
 				  txq->size * MVNETA_DESC_ALIGNED_SIZE,
 				  txq->descs, txq->descs_phys);
 
+	netdev_tx_reset_queue(nq);
+
 	txq->descs             = NULL;
 	txq->last_desc         = 0;
 	txq->next_desc_to_proc = 0;
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v2 net-next 0/2] mvneta xmit_more and bql support
  2017-01-16 17:08 [PATCH v2 net-next 0/2] mvneta xmit_more and bql support Marcin Wojtas
  2017-01-16 17:08 ` [PATCH v2 net-next 1/2] net: mvneta: add xmit_more support Marcin Wojtas
  2017-01-16 17:08 ` [PATCH v2 net-next 2/2] net: mvneta: add BQL support Marcin Wojtas
@ 2017-01-17  1:13 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2017-01-17  1:13 UTC (permalink / raw)
  To: linux-arm-kernel

From: Marcin Wojtas <mw@semihalf.com>
Date: Mon, 16 Jan 2017 18:08:30 +0100

> This is a delayed v2 of short patchset, which introduces xmit_more and BQL
> to mvneta driver. The only one change was added in xmit_more support -
> condition check preventing excessive descriptors concatenation before
> flushing in HW.
> 
> Any comments or feedback would be welcome.
 ...
> Changelog:
> v1 -> v2:
> 
> * Add checking condition that ensures too much descriptors are not
>   concatenated before flushing in HW.

Looks good, series applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-01-17  1:13 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-01-16 17:08 [PATCH v2 net-next 0/2] mvneta xmit_more and bql support Marcin Wojtas
2017-01-16 17:08 ` [PATCH v2 net-next 1/2] net: mvneta: add xmit_more support Marcin Wojtas
2017-01-16 17:08 ` [PATCH v2 net-next 2/2] net: mvneta: add BQL support Marcin Wojtas
2017-01-17  1:13 ` [PATCH v2 net-next 0/2] mvneta xmit_more and bql support David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).