netdev.vger.kernel.org archive mirror
* [PATCH net-next v2 0/2] net: renesas: rswitch: Improve performance of TX/RX
@ 2023-06-06  8:55 Yoshihiro Shimoda
  2023-06-06  8:55 ` [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX Yoshihiro Shimoda
  2023-06-06  8:55 ` [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features Yoshihiro Shimoda
  0 siblings, 2 replies; 7+ messages in thread
From: Yoshihiro Shimoda @ 2023-06-06  8:55 UTC (permalink / raw)
  To: s.shtylyov, davem, edumazet, kuba, pabeni
  Cc: netdev, linux-renesas-soc, Yoshihiro Shimoda

This patch series is based on the net-next.git / main branch [1]. It
improves TX performance under a specific condition. The previous code used
the "global rate limiter" feature, which can degrade performance when
multiple ports are used at the same time. To resolve this issue, use the
"hardware pause" features of GWCA and COMA instead. Note that these are not
related to Ethernet PAUSE frames.

< UDP TX by iperf3 >
 before: about 450Mbps on both tsn0 and tsn1
 after:  about 950Mbps on both tsn0 and tsn1

This patch series also improves RX performance by using
napi_gro_receive().

< TCP RX by iperf >
 before: about 670Mbps on tsn0
 after:  about 840Mbps on tsn0

[1]
The commit ddb8701dcb67 ("Merge branch 'splice-net-handle-msg_splice_pages-in-af_kcm'")

Changes from v1:
https://lore.kernel.org/all/20230529080840.1156458-1-yoshihiro.shimoda.uh@renesas.com/
 - Rebased on the latest net-next.git / main branch.
 - Use "hardware pause" feature instead of "per-queue limiter" feature.
 - Drop refactoring for "per-queue limiter".
 - Drop dt-bindings update because "hardware pause" doesn't need additional
   clock information.
 - Use napi_gro_receive() to improve RX performance.

Yoshihiro Shimoda (2):
  net: renesas: rswitch: Use napi_gro_receive() in RX
  net: renesas: rswitch: Use hardware pause features

 drivers/net/ethernet/renesas/rswitch.c | 38 ++++++++++----------------
 drivers/net/ethernet/renesas/rswitch.h |  6 ++++
 2 files changed, 21 insertions(+), 23 deletions(-)

-- 
2.25.1



* [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX
  2023-06-06  8:55 [PATCH net-next v2 0/2] net: renesas: rswitch: Improve performance of TX/RX Yoshihiro Shimoda
@ 2023-06-06  8:55 ` Yoshihiro Shimoda
  2023-06-06 17:50   ` Maciej Fijalkowski
  2023-06-06  8:55 ` [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features Yoshihiro Shimoda
  1 sibling, 1 reply; 7+ messages in thread
From: Yoshihiro Shimoda @ 2023-06-06  8:55 UTC (permalink / raw)
  To: s.shtylyov, davem, edumazet, kuba, pabeni
  Cc: netdev, linux-renesas-soc, Yoshihiro Shimoda

This hardware can receive multiple frames, so using
napi_gro_receive() instead of netif_receive_skb() improves RX
performance.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/net/ethernet/renesas/rswitch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
index aace87139cea..7bb0a6d594a0 100644
--- a/drivers/net/ethernet/renesas/rswitch.c
+++ b/drivers/net/ethernet/renesas/rswitch.c
@@ -729,7 +729,7 @@ static bool rswitch_rx(struct net_device *ndev, int *quota)
 		}
 		skb_put(skb, pkt_len);
 		skb->protocol = eth_type_trans(skb, ndev);
-		netif_receive_skb(skb);
+		napi_gro_receive(&rdev->napi, skb);
 		rdev->ndev->stats.rx_packets++;
 		rdev->ndev->stats.rx_bytes += pkt_len;
 
-- 
2.25.1



* [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features
  2023-06-06  8:55 [PATCH net-next v2 0/2] net: renesas: rswitch: Improve performance of TX/RX Yoshihiro Shimoda
  2023-06-06  8:55 ` [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX Yoshihiro Shimoda
@ 2023-06-06  8:55 ` Yoshihiro Shimoda
  2023-06-06 17:54   ` Maciej Fijalkowski
  1 sibling, 1 reply; 7+ messages in thread
From: Yoshihiro Shimoda @ 2023-06-06  8:55 UTC (permalink / raw)
  To: s.shtylyov, davem, edumazet, kuba, pabeni
  Cc: netdev, linux-renesas-soc, Yoshihiro Shimoda

Use the "per priority pause" feature of GWCA and the "global pause"
feature of COMA instead of the "global rate limiter" of GWCA. Otherwise,
TX performance is low when multiple ports are used at the same time.

Note that these features are not related to the ethernet PAUSE frame.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
---
 drivers/net/ethernet/renesas/rswitch.c | 36 ++++++++++----------------
 drivers/net/ethernet/renesas/rswitch.h |  6 +++++
 2 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
index 7bb0a6d594a0..84f62c77eb8f 100644
--- a/drivers/net/ethernet/renesas/rswitch.c
+++ b/drivers/net/ethernet/renesas/rswitch.c
@@ -90,6 +90,11 @@ static int rswitch_bpool_config(struct rswitch_private *priv)
 	return rswitch_reg_wait(priv->addr, CABPIRM, CABPIRM_BPR, CABPIRM_BPR);
 }
 
+static void rswitch_coma_init(struct rswitch_private *priv)
+{
+	iowrite32(CABPPFLC_INIT_VALUE, priv->addr + CABPPFLC0);
+}
+
 /* R-Switch-2 block (TOP) */
 static void rswitch_top_init(struct rswitch_private *priv)
 {
@@ -156,24 +161,6 @@ static int rswitch_gwca_axi_ram_reset(struct rswitch_private *priv)
 	return rswitch_reg_wait(priv->addr, GWARIRM, GWARIRM_ARR, GWARIRM_ARR);
 }
 
-static void rswitch_gwca_set_rate_limit(struct rswitch_private *priv, int rate)
-{
-	u32 gwgrlulc, gwgrlc;
-
-	switch (rate) {
-	case 1000:
-		gwgrlulc = 0x0000005f;
-		gwgrlc = 0x00010260;
-		break;
-	default:
-		dev_err(&priv->pdev->dev, "%s: This rate is not supported (%d)\n", __func__, rate);
-		return;
-	}
-
-	iowrite32(gwgrlulc, priv->addr + GWGRLULC);
-	iowrite32(gwgrlc, priv->addr + GWGRLC);
-}
-
 static bool rswitch_is_any_data_irq(struct rswitch_private *priv, u32 *dis, bool tx)
 {
 	u32 *mask = tx ? priv->gwca.tx_irq_bits : priv->gwca.rx_irq_bits;
@@ -402,7 +389,7 @@ static int rswitch_gwca_queue_format(struct net_device *ndev,
 	linkfix->die_dt = DT_LINKFIX;
 	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
 
-	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_EDE,
+	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) | GWDCC_EDE,
 		  priv->addr + GWDCC_OFFS(gq->index));
 
 	return 0;
@@ -500,7 +487,8 @@ static int rswitch_gwca_queue_ext_ts_format(struct net_device *ndev,
 	linkfix->die_dt = DT_LINKFIX;
 	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
 
-	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_ETS | GWDCC_EDE,
+	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) |
+		  GWDCC_ETS | GWDCC_EDE,
 		  priv->addr + GWDCC_OFFS(gq->index));
 
 	return 0;
@@ -649,7 +637,8 @@ static int rswitch_gwca_hw_init(struct rswitch_private *priv)
 	iowrite32(lower_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC10);
 	iowrite32(upper_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC00);
 	iowrite32(GWCA_TS_IRQ_BIT, priv->addr + GWTSDCC0);
-	rswitch_gwca_set_rate_limit(priv, priv->gwca.speed);
+
+	iowrite32(GWTPC_PPPL(GWCA_IPV_NUM), priv->addr + GWTPC0);
 
 	for (i = 0; i < RSWITCH_NUM_PORTS; i++) {
 		err = rswitch_rxdmac_init(priv, i);
@@ -1502,7 +1491,8 @@ static netdev_tx_t rswitch_start_xmit(struct sk_buff *skb, struct net_device *nd
 	rswitch_desc_set_dptr(&desc->desc, dma_addr);
 	desc->desc.info_ds = cpu_to_le16(skb->len);
 
-	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) | INFO1_FMT);
+	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) |
+				  INFO1_IPV(GWCA_IPV_NUM) | INFO1_FMT);
 	if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
 		struct rswitch_gwca_ts_info *ts_info;
 
@@ -1772,6 +1762,8 @@ static int rswitch_init(struct rswitch_private *priv)
 	if (err < 0)
 		return err;
 
+	rswitch_coma_init(priv);
+
 	err = rswitch_gwca_linkfix_alloc(priv);
 	if (err < 0)
 		return -ENOMEM;
diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h
index b3e0411b408e..08dadd28001e 100644
--- a/drivers/net/ethernet/renesas/rswitch.h
+++ b/drivers/net/ethernet/renesas/rswitch.h
@@ -48,6 +48,7 @@
 #define GWCA_NUM_IRQS		8
 #define GWCA_INDEX		0
 #define AGENT_INDEX_GWCA	3
+#define GWCA_IPV_NUM		0
 #define GWRO			RSWITCH_GWCA0_OFFSET
 
 #define GWCA_TS_IRQ_RESOURCE_NAME	"gwca0_rxts0"
@@ -768,11 +769,13 @@ enum rswitch_gwca_mode {
 #define GWARIRM_ARR		BIT(1)
 
 #define GWDCC_BALR		BIT(24)
+#define GWDCC_DCP(prio)		(((prio) & 0x07) << 16)
 #define GWDCC_DQT		BIT(11)
 #define GWDCC_ETS		BIT(9)
 #define GWDCC_EDE		BIT(8)
 
 #define GWTRC(queue)		(GWTRC0 + (queue) / 32 * 4)
+#define GWTPC_PPPL(ipv)		BIT(ipv)
 #define GWDCC_OFFS(queue)	(GWDCC0 + (queue) * 4)
 
 #define GWDIS(i)		(GWDIS0 + (i) * 0x10)
@@ -789,6 +792,8 @@ enum rswitch_gwca_mode {
 #define CABPIRM_BPIOG		BIT(0)
 #define CABPIRM_BPR		BIT(1)
 
+#define CABPPFLC_INIT_VALUE	0x00800080
+
 /* MFWD */
 #define FWPC0_LTHTA		BIT(0)
 #define FWPC0_IP4UE		BIT(3)
@@ -863,6 +868,7 @@ enum DIE_DT {
 
 /* For transmission */
 #define INFO1_TSUN(val)		((u64)(val) << 8ULL)
+#define INFO1_IPV(prio)		((u64)(prio) << 28ULL)
 #define INFO1_CSD0(index)	((u64)(index) << 32ULL)
 #define INFO1_CSD1(index)	((u64)(index) << 40ULL)
 #define INFO1_DV(port_vector)	((u64)(port_vector) << 48ULL)
-- 
2.25.1



* Re: [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX
  2023-06-06  8:55 ` [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX Yoshihiro Shimoda
@ 2023-06-06 17:50   ` Maciej Fijalkowski
  2023-06-07  1:00     ` Yoshihiro Shimoda
  0 siblings, 1 reply; 7+ messages in thread
From: Maciej Fijalkowski @ 2023-06-06 17:50 UTC (permalink / raw)
  To: Yoshihiro Shimoda
  Cc: s.shtylyov, davem, edumazet, kuba, pabeni, netdev,
	linux-renesas-soc

On Tue, Jun 06, 2023 at 05:55:57PM +0900, Yoshihiro Shimoda wrote:
> This hardware can receive multiple frames, so using
> napi_gro_receive() instead of netif_receive_skb() improves RX
> performance.
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

> ---
>  drivers/net/ethernet/renesas/rswitch.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
> index aace87139cea..7bb0a6d594a0 100644
> --- a/drivers/net/ethernet/renesas/rswitch.c
> +++ b/drivers/net/ethernet/renesas/rswitch.c
> @@ -729,7 +729,7 @@ static bool rswitch_rx(struct net_device *ndev, int *quota)
>  		}
>  		skb_put(skb, pkt_len);
>  		skb->protocol = eth_type_trans(skb, ndev);
> -		netif_receive_skb(skb);
> +		napi_gro_receive(&rdev->napi, skb);

Another optimization you could do later on is to improve
rswitch_next_queue_index(), as it is called on a per-packet basis.

>  		rdev->ndev->stats.rx_packets++;
>  		rdev->ndev->stats.rx_bytes += pkt_len;
>  
> -- 
> 2.25.1
> 
> 


* Re: [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features
  2023-06-06  8:55 ` [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features Yoshihiro Shimoda
@ 2023-06-06 17:54   ` Maciej Fijalkowski
  2023-06-06 23:57     ` Yoshihiro Shimoda
  0 siblings, 1 reply; 7+ messages in thread
From: Maciej Fijalkowski @ 2023-06-06 17:54 UTC (permalink / raw)
  To: Yoshihiro Shimoda
  Cc: s.shtylyov, davem, edumazet, kuba, pabeni, netdev,
	linux-renesas-soc

On Tue, Jun 06, 2023 at 05:55:58PM +0900, Yoshihiro Shimoda wrote:
> Use the "per priority pause" feature of GWCA and the "global pause"
> feature of COMA instead of the "global rate limiter" of GWCA. Otherwise,
> TX performance is low when multiple ports are used at the same time.

Does this mean that the global pause feature is completely useless?

> 
> Note that these features are not related to the ethernet PAUSE frame.
> 
> Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> ---
>  drivers/net/ethernet/renesas/rswitch.c | 36 ++++++++++----------------
>  drivers/net/ethernet/renesas/rswitch.h |  6 +++++
>  2 files changed, 20 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
> index 7bb0a6d594a0..84f62c77eb8f 100644
> --- a/drivers/net/ethernet/renesas/rswitch.c
> +++ b/drivers/net/ethernet/renesas/rswitch.c
> @@ -90,6 +90,11 @@ static int rswitch_bpool_config(struct rswitch_private *priv)
>  	return rswitch_reg_wait(priv->addr, CABPIRM, CABPIRM_BPR, CABPIRM_BPR);
>  }
>  
> +static void rswitch_coma_init(struct rswitch_private *priv)
> +{
> +	iowrite32(CABPPFLC_INIT_VALUE, priv->addr + CABPPFLC0);
> +}
> +
>  /* R-Switch-2 block (TOP) */
>  static void rswitch_top_init(struct rswitch_private *priv)
>  {
> @@ -156,24 +161,6 @@ static int rswitch_gwca_axi_ram_reset(struct rswitch_private *priv)
>  	return rswitch_reg_wait(priv->addr, GWARIRM, GWARIRM_ARR, GWARIRM_ARR);
>  }
>  
> -static void rswitch_gwca_set_rate_limit(struct rswitch_private *priv, int rate)
> -{
> -	u32 gwgrlulc, gwgrlc;
> -
> -	switch (rate) {
> -	case 1000:
> -		gwgrlulc = 0x0000005f;
> -		gwgrlc = 0x00010260;
> -		break;
> -	default:
> -		dev_err(&priv->pdev->dev, "%s: This rate is not supported (%d)\n", __func__, rate);
> -		return;
> -	}
> -
> -	iowrite32(gwgrlulc, priv->addr + GWGRLULC);
> -	iowrite32(gwgrlc, priv->addr + GWGRLC);
> -}
> -
>  static bool rswitch_is_any_data_irq(struct rswitch_private *priv, u32 *dis, bool tx)
>  {
>  	u32 *mask = tx ? priv->gwca.tx_irq_bits : priv->gwca.rx_irq_bits;
> @@ -402,7 +389,7 @@ static int rswitch_gwca_queue_format(struct net_device *ndev,
>  	linkfix->die_dt = DT_LINKFIX;
>  	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
>  
> -	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_EDE,
> +	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) | GWDCC_EDE,
>  		  priv->addr + GWDCC_OFFS(gq->index));
>  
>  	return 0;
> @@ -500,7 +487,8 @@ static int rswitch_gwca_queue_ext_ts_format(struct net_device *ndev,
>  	linkfix->die_dt = DT_LINKFIX;
>  	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
>  
> -	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_ETS | GWDCC_EDE,
> +	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) |
> +		  GWDCC_ETS | GWDCC_EDE,
>  		  priv->addr + GWDCC_OFFS(gq->index));
>  
>  	return 0;
> @@ -649,7 +637,8 @@ static int rswitch_gwca_hw_init(struct rswitch_private *priv)
>  	iowrite32(lower_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC10);
>  	iowrite32(upper_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC00);
>  	iowrite32(GWCA_TS_IRQ_BIT, priv->addr + GWTSDCC0);
> -	rswitch_gwca_set_rate_limit(priv, priv->gwca.speed);
> +
> +	iowrite32(GWTPC_PPPL(GWCA_IPV_NUM), priv->addr + GWTPC0);
>  
>  	for (i = 0; i < RSWITCH_NUM_PORTS; i++) {
>  		err = rswitch_rxdmac_init(priv, i);
> @@ -1502,7 +1491,8 @@ static netdev_tx_t rswitch_start_xmit(struct sk_buff *skb, struct net_device *nd
>  	rswitch_desc_set_dptr(&desc->desc, dma_addr);
>  	desc->desc.info_ds = cpu_to_le16(skb->len);
>  
> -	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) | INFO1_FMT);
> +	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) |
> +				  INFO1_IPV(GWCA_IPV_NUM) | INFO1_FMT);
>  	if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
>  		struct rswitch_gwca_ts_info *ts_info;
>  
> @@ -1772,6 +1762,8 @@ static int rswitch_init(struct rswitch_private *priv)
>  	if (err < 0)
>  		return err;
>  
> +	rswitch_coma_init(priv);
> +
>  	err = rswitch_gwca_linkfix_alloc(priv);
>  	if (err < 0)
>  		return -ENOMEM;
> diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h
> index b3e0411b408e..08dadd28001e 100644
> --- a/drivers/net/ethernet/renesas/rswitch.h
> +++ b/drivers/net/ethernet/renesas/rswitch.h
> @@ -48,6 +48,7 @@
>  #define GWCA_NUM_IRQS		8
>  #define GWCA_INDEX		0
>  #define AGENT_INDEX_GWCA	3
> +#define GWCA_IPV_NUM		0
>  #define GWRO			RSWITCH_GWCA0_OFFSET
>  
>  #define GWCA_TS_IRQ_RESOURCE_NAME	"gwca0_rxts0"
> @@ -768,11 +769,13 @@ enum rswitch_gwca_mode {
>  #define GWARIRM_ARR		BIT(1)
>  
>  #define GWDCC_BALR		BIT(24)
> +#define GWDCC_DCP(prio)		(((prio) & 0x07) << 16)

I'd be glad to see defines for magic numbers above.

>  #define GWDCC_DQT		BIT(11)
>  #define GWDCC_ETS		BIT(9)
>  #define GWDCC_EDE		BIT(8)
>  
>  #define GWTRC(queue)		(GWTRC0 + (queue) / 32 * 4)
> +#define GWTPC_PPPL(ipv)		BIT(ipv)
>  #define GWDCC_OFFS(queue)	(GWDCC0 + (queue) * 4)
>  
>  #define GWDIS(i)		(GWDIS0 + (i) * 0x10)
> @@ -789,6 +792,8 @@ enum rswitch_gwca_mode {
>  #define CABPIRM_BPIOG		BIT(0)
>  #define CABPIRM_BPR		BIT(1)
>  
> +#define CABPPFLC_INIT_VALUE	0x00800080
> +
>  /* MFWD */
>  #define FWPC0_LTHTA		BIT(0)
>  #define FWPC0_IP4UE		BIT(3)
> @@ -863,6 +868,7 @@ enum DIE_DT {
>  
>  /* For transmission */
>  #define INFO1_TSUN(val)		((u64)(val) << 8ULL)
> +#define INFO1_IPV(prio)		((u64)(prio) << 28ULL)
>  #define INFO1_CSD0(index)	((u64)(index) << 32ULL)
>  #define INFO1_CSD1(index)	((u64)(index) << 40ULL)
>  #define INFO1_DV(port_vector)	((u64)(port_vector) << 48ULL)
> -- 
> 2.25.1
> 
> 


* RE: [PATCH net-next v2 2/2] net: renesas: rswitch: Use hardware pause features
  2023-06-06 17:54   ` Maciej Fijalkowski
@ 2023-06-06 23:57     ` Yoshihiro Shimoda
  0 siblings, 0 replies; 7+ messages in thread
From: Yoshihiro Shimoda @ 2023-06-06 23:57 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: s.shtylyov@omp.ru, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org

Hello Maciej,

> From: Maciej Fijalkowski, Sent: Wednesday, June 7, 2023 2:55 AM
> 
> On Tue, Jun 06, 2023 at 05:55:58PM +0900, Yoshihiro Shimoda wrote:
> > Use the "per priority pause" feature of GWCA and the "global pause"
> > feature of COMA instead of the "global rate limiter" of GWCA. Otherwise,
> > TX performance is low when multiple ports are used at the same time.
> 
> Does this mean that the global pause feature is completely useless?

The global rate limiter is useless, not global pause. I'll revise this
description in v3.

> >
> > Note that these features are not related to the ethernet PAUSE frame.
> >
> > Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> > ---
> >  drivers/net/ethernet/renesas/rswitch.c | 36 ++++++++++----------------
> >  drivers/net/ethernet/renesas/rswitch.h |  6 +++++
> >  2 files changed, 20 insertions(+), 22 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
> > index 7bb0a6d594a0..84f62c77eb8f 100644
> > --- a/drivers/net/ethernet/renesas/rswitch.c
> > +++ b/drivers/net/ethernet/renesas/rswitch.c
> > @@ -90,6 +90,11 @@ static int rswitch_bpool_config(struct rswitch_private *priv)
> >  	return rswitch_reg_wait(priv->addr, CABPIRM, CABPIRM_BPR, CABPIRM_BPR);
> >  }
> >
> > +static void rswitch_coma_init(struct rswitch_private *priv)
> > +{
> > +	iowrite32(CABPPFLC_INIT_VALUE, priv->addr + CABPPFLC0);
> > +}
> > +
> >  /* R-Switch-2 block (TOP) */
> >  static void rswitch_top_init(struct rswitch_private *priv)
> >  {
> > @@ -156,24 +161,6 @@ static int rswitch_gwca_axi_ram_reset(struct rswitch_private *priv)
> >  	return rswitch_reg_wait(priv->addr, GWARIRM, GWARIRM_ARR, GWARIRM_ARR);
> >  }
> >
> > -static void rswitch_gwca_set_rate_limit(struct rswitch_private *priv, int rate)
> > -{
> > -	u32 gwgrlulc, gwgrlc;
> > -
> > -	switch (rate) {
> > -	case 1000:
> > -		gwgrlulc = 0x0000005f;
> > -		gwgrlc = 0x00010260;
> > -		break;
> > -	default:
> > -		dev_err(&priv->pdev->dev, "%s: This rate is not supported (%d)\n", __func__, rate);
> > -		return;
> > -	}
> > -
> > -	iowrite32(gwgrlulc, priv->addr + GWGRLULC);
> > -	iowrite32(gwgrlc, priv->addr + GWGRLC);
> > -}
> > -
> >  static bool rswitch_is_any_data_irq(struct rswitch_private *priv, u32 *dis, bool tx)
> >  {
> >  	u32 *mask = tx ? priv->gwca.tx_irq_bits : priv->gwca.rx_irq_bits;
> > @@ -402,7 +389,7 @@ static int rswitch_gwca_queue_format(struct net_device *ndev,
> >  	linkfix->die_dt = DT_LINKFIX;
> >  	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
> >
> > -	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_EDE,
> > +	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) | GWDCC_EDE,
> >  		  priv->addr + GWDCC_OFFS(gq->index));
> >
> >  	return 0;
> > @@ -500,7 +487,8 @@ static int rswitch_gwca_queue_ext_ts_format(struct net_device *ndev,
> >  	linkfix->die_dt = DT_LINKFIX;
> >  	rswitch_desc_set_dptr(linkfix, gq->ring_dma);
> >
> > -	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DQT : 0) | GWDCC_ETS | GWDCC_EDE,
> > +	iowrite32(GWDCC_BALR | (gq->dir_tx ? GWDCC_DCP(GWCA_IPV_NUM) | GWDCC_DQT : 0) |
> > +		  GWDCC_ETS | GWDCC_EDE,
> >  		  priv->addr + GWDCC_OFFS(gq->index));
> >
> >  	return 0;
> > @@ -649,7 +637,8 @@ static int rswitch_gwca_hw_init(struct rswitch_private *priv)
> >  	iowrite32(lower_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC10);
> >  	iowrite32(upper_32_bits(priv->gwca.ts_queue.ring_dma), priv->addr + GWTDCAC00);
> >  	iowrite32(GWCA_TS_IRQ_BIT, priv->addr + GWTSDCC0);
> > -	rswitch_gwca_set_rate_limit(priv, priv->gwca.speed);
> > +
> > +	iowrite32(GWTPC_PPPL(GWCA_IPV_NUM), priv->addr + GWTPC0);
> >
> >  	for (i = 0; i < RSWITCH_NUM_PORTS; i++) {
> >  		err = rswitch_rxdmac_init(priv, i);
> > @@ -1502,7 +1491,8 @@ static netdev_tx_t rswitch_start_xmit(struct sk_buff *skb, struct net_device *nd
> >  	rswitch_desc_set_dptr(&desc->desc, dma_addr);
> >  	desc->desc.info_ds = cpu_to_le16(skb->len);
> >
> > -	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) | INFO1_FMT);
> > +	desc->info1 = cpu_to_le64(INFO1_DV(BIT(rdev->etha->index)) |
> > +				  INFO1_IPV(GWCA_IPV_NUM) | INFO1_FMT);
> >  	if (skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP) {
> >  		struct rswitch_gwca_ts_info *ts_info;
> >
> > @@ -1772,6 +1762,8 @@ static int rswitch_init(struct rswitch_private *priv)
> >  	if (err < 0)
> >  		return err;
> >
> > +	rswitch_coma_init(priv);
> > +
> >  	err = rswitch_gwca_linkfix_alloc(priv);
> >  	if (err < 0)
> >  		return -ENOMEM;
> > diff --git a/drivers/net/ethernet/renesas/rswitch.h b/drivers/net/ethernet/renesas/rswitch.h
> > index b3e0411b408e..08dadd28001e 100644
> > --- a/drivers/net/ethernet/renesas/rswitch.h
> > +++ b/drivers/net/ethernet/renesas/rswitch.h
> > @@ -48,6 +48,7 @@
> >  #define GWCA_NUM_IRQS		8
> >  #define GWCA_INDEX		0
> >  #define AGENT_INDEX_GWCA	3
> > +#define GWCA_IPV_NUM		0
> >  #define GWRO			RSWITCH_GWCA0_OFFSET
> >
> >  #define GWCA_TS_IRQ_RESOURCE_NAME	"gwca0_rxts0"
> > @@ -768,11 +769,13 @@ enum rswitch_gwca_mode {
> >  #define GWARIRM_ARR		BIT(1)
> >
> >  #define GWDCC_BALR		BIT(24)
> > +#define GWDCC_DCP(prio)		(((prio) & 0x07) << 16)
> 
> I'd be glad to see defines for magic numbers above.

I'll add defines for 0x07 and 16 in v3.

Best regards,
Yoshihiro Shimoda

> >  #define GWDCC_DQT		BIT(11)
> >  #define GWDCC_ETS		BIT(9)
> >  #define GWDCC_EDE		BIT(8)
> >
> >  #define GWTRC(queue)		(GWTRC0 + (queue) / 32 * 4)
> > +#define GWTPC_PPPL(ipv)		BIT(ipv)
> >  #define GWDCC_OFFS(queue)	(GWDCC0 + (queue) * 4)
> >
> >  #define GWDIS(i)		(GWDIS0 + (i) * 0x10)
> > @@ -789,6 +792,8 @@ enum rswitch_gwca_mode {
> >  #define CABPIRM_BPIOG		BIT(0)
> >  #define CABPIRM_BPR		BIT(1)
> >
> > +#define CABPPFLC_INIT_VALUE	0x00800080
> > +
> >  /* MFWD */
> >  #define FWPC0_LTHTA		BIT(0)
> >  #define FWPC0_IP4UE		BIT(3)
> > @@ -863,6 +868,7 @@ enum DIE_DT {
> >
> >  /* For transmission */
> >  #define INFO1_TSUN(val)		((u64)(val) << 8ULL)
> > +#define INFO1_IPV(prio)		((u64)(prio) << 28ULL)
> >  #define INFO1_CSD0(index)	((u64)(index) << 32ULL)
> >  #define INFO1_CSD1(index)	((u64)(index) << 40ULL)
> >  #define INFO1_DV(port_vector)	((u64)(port_vector) << 48ULL)
> > --
> > 2.25.1
> >
> >


* RE: [PATCH net-next v2 1/2] net: renesas: rswitch: Use napi_gro_receive() in RX
  2023-06-06 17:50   ` Maciej Fijalkowski
@ 2023-06-07  1:00     ` Yoshihiro Shimoda
  0 siblings, 0 replies; 7+ messages in thread
From: Yoshihiro Shimoda @ 2023-06-07  1:00 UTC (permalink / raw)
  To: Maciej Fijalkowski
  Cc: s.shtylyov@omp.ru, davem@davemloft.net, edumazet@google.com,
	kuba@kernel.org, pabeni@redhat.com, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org

Hello Maciej,

> From: Maciej Fijalkowski, Sent: Wednesday, June 7, 2023 2:51 AM
> 
> On Tue, Jun 06, 2023 at 05:55:57PM +0900, Yoshihiro Shimoda wrote:
> > This hardware can receive multiple frames, so using
> > napi_gro_receive() instead of netif_receive_skb() improves RX
> > performance.
> >
> > Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> 
> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>

Thank you for your review!

> > ---
> >  drivers/net/ethernet/renesas/rswitch.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/renesas/rswitch.c b/drivers/net/ethernet/renesas/rswitch.c
> > index aace87139cea..7bb0a6d594a0 100644
> > --- a/drivers/net/ethernet/renesas/rswitch.c
> > +++ b/drivers/net/ethernet/renesas/rswitch.c
> > @@ -729,7 +729,7 @@ static bool rswitch_rx(struct net_device *ndev, int *quota)
> >  		}
> >  		skb_put(skb, pkt_len);
> >  		skb->protocol = eth_type_trans(skb, ndev);
> > -		netif_receive_skb(skb);
> > +		napi_gro_receive(&rdev->napi, skb);
> 
> Another optimization you could do later on is to improve
> rswitch_next_queue_index(), as it is called on a per-packet basis.

Thank you for your suggestion! I'll try this later.

Best regards,
Yoshihiro Shimoda

> >  		rdev->ndev->stats.rx_packets++;
> >  		rdev->ndev->stats.rx_bytes += pkt_len;
> >
> > --
> > 2.25.1
> >
> >

