netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
@ 2025-05-25 10:22 Suraj Gupta
  2025-05-26  0:36 ` kernel test robot
  2025-05-27 16:16 ` Sean Anderson
  0 siblings, 2 replies; 13+ messages in thread
From: Suraj Gupta @ 2025-05-25 10:22 UTC (permalink / raw)
  To: andrew+netdev, davem, edumazet, kuba, pabeni, vkoul, michal.simek,
	sean.anderson, radhey.shyam.pandey, horms
  Cc: netdev, linux-arm-kernel, linux-kernel, git, harini.katakam

Add support to configure / report interrupt coalesce count and delay via
ethtool in DMAEngine flow.
Netperf numbers are not good when using non-dmaengine default values,
so tuned coalesce count and delay and defined separate default
values in dmaengine flow.

Netperf numbers and CPU utilisation change in DMAengine flow after
introducing coalescing with default parameters:
coalesce parameters:
   Transfer type	  Before(w/o coalescing)  After(with coalescing)
TCP Tx, CPU utilisation%	925, 27			941, 22
TCP Rx, CPU utilisation%	607, 32			741, 36
UDP Tx, CPU utilisation%	857, 31			960, 28
UDP Rx, CPU utilisation%	762, 26			783, 18

Above numbers are observed with 4x Cortex-a53.

Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
---
This patch depend on following AXI DMA dmengine driver changes sent to
dmaengine mailing list as pre-requisit series:
https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.com/ 
---
 drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
 .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
 2 files changed, 59 insertions(+)

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
index 5ff742103beb..cdf6cbb6f2fd 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
@@ -126,6 +126,12 @@
 #define XAXIDMA_DFT_TX_USEC		50
 #define XAXIDMA_DFT_RX_USEC		16
 
+/* Default TX/RX Threshold and delay timer values for SGDMA mode with DMAEngine */
+#define XAXIDMAENGINE_DFT_TX_THRESHOLD	16
+#define XAXIDMAENGINE_DFT_TX_USEC	5
+#define XAXIDMAENGINE_DFT_RX_THRESHOLD	24
+#define XAXIDMAENGINE_DFT_RX_USEC	16
+
 #define XAXIDMA_BD_CTRL_TXSOF_MASK	0x08000000 /* First tx packet */
 #define XAXIDMA_BD_CTRL_TXEOF_MASK	0x04000000 /* Last tx packet */
 #define XAXIDMA_BD_CTRL_ALL_MASK	0x0C000000 /* All control bits */
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index 1b7a653c1f4e..f9c7d90d4ecb 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct net_device *ndev)
 {
 	struct axienet_local *lp = netdev_priv(ndev);
 	struct skbuf_dma_descriptor *skbuf_dma;
+	struct dma_slave_config tx_config, rx_config;
 	int i, ret;
 
 	lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0");
@@ -1520,6 +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
 		goto err_dma_release_tx;
 	}
 
+	tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
+	tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
+	rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
+	rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
+
+	ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
+	if (ret) {
+		dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
+		goto err_dma_release_tx;
+	}
+	ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
+	if (ret) {
+		dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
+		goto err_dma_release_tx;
+	}
+
 	lp->tx_ring_tail = 0;
 	lp->tx_ring_head = 0;
 	lp->rx_ring_tail = 0;
@@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device *ndev,
 	struct axienet_local *lp = netdev_priv(ndev);
 	u32 cr;
 
+	if (lp->use_dmaengine) {
+		struct dma_slave_caps tx_caps, rx_caps;
+
+		dma_get_slave_caps(lp->tx_chan, &tx_caps);
+		dma_get_slave_caps(lp->rx_chan, &rx_caps);
+
+		ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
+		ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
+		ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
+		ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
+		return 0;
+	}
+
 	ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
 
 	spin_lock_irq(&lp->rx_cr_lock);
@@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
 		return -EINVAL;
 	}
 
+	if (lp->use_dmaengine)	{
+		struct dma_slave_config tx_cfg, rx_cfg;
+		int ret;
+
+		tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
+		tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
+		rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
+		rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
+
+		ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
+		if (ret) {
+			NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
+			return ret;
+		}
+
+		ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
+		if (ret) {
+			NL_SET_ERR_MSG(extack, "failed to set rx coalesce parameters");
+			return ret;
+		}
+		return 0;
+	}
+
 	if (new_dim && !old_dim) {
 		cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
 				     ecoalesce->rx_coalesce_usecs);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-25 10:22 [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow Suraj Gupta
@ 2025-05-26  0:36 ` kernel test robot
  2025-05-27 16:16 ` Sean Anderson
  1 sibling, 0 replies; 13+ messages in thread
From: kernel test robot @ 2025-05-26  0:36 UTC (permalink / raw)
  To: Suraj Gupta, andrew+netdev, davem, edumazet, kuba, pabeni, vkoul,
	michal.simek, sean.anderson, radhey.shyam.pandey, horms
  Cc: oe-kbuild-all, netdev, linux-arm-kernel, linux-kernel, git,
	harini.katakam

Hi Suraj,

kernel test robot noticed the following build errors:

[auto build test ERROR on net-next/main]

url:    https://github.com/intel-lab-lkp/linux/commits/Suraj-Gupta/net-xilinx-axienet-Configure-and-report-coalesce-parameters-in-DMAengine-flow/20250525-182400
base:   net-next/main
patch link:    https://lore.kernel.org/r/20250525102217.1181104-1-suraj.gupta2%40amd.com
patch subject: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
config: alpha-allyesconfig (https://download.01.org/0day-ci/archive/20250526/202505260804.Mhztve8t-lkp@intel.com/config)
compiler: alpha-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250526/202505260804.Mhztve8t-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505260804.Mhztve8t-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/net/ethernet/xilinx/xilinx_axienet_main.c: In function 'axienet_init_dmaengine':
>> drivers/net/ethernet/xilinx/xilinx_axienet_main.c:1524:18: error: 'struct dma_slave_config' has no member named 'coalesce_cnt'
    1524 |         tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
         |                  ^
>> drivers/net/ethernet/xilinx/xilinx_axienet_main.c:1525:18: error: 'struct dma_slave_config' has no member named 'coalesce_usecs'
    1525 |         tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
         |                  ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:1526:18: error: 'struct dma_slave_config' has no member named 'coalesce_cnt'
    1526 |         rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
         |                  ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:1527:18: error: 'struct dma_slave_config' has no member named 'coalesce_usecs'
    1527 |         rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
         |                  ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c: In function 'axienet_ethtools_get_coalesce':
>> drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2196:61: error: 'struct dma_slave_caps' has no member named 'coalesce_cnt'
    2196 |                 ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
         |                                                             ^
>> drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2197:55: error: 'struct dma_slave_caps' has no member named 'coalesce_usecs'
    2197 |                 ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
         |                                                       ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2198:61: error: 'struct dma_slave_caps' has no member named 'coalesce_cnt'
    2198 |                 ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
         |                                                             ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2199:55: error: 'struct dma_slave_caps' has no member named 'coalesce_usecs'
    2199 |                 ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
         |                                                       ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c: In function 'axienet_ethtools_set_coalesce':
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2270:23: error: 'struct dma_slave_config' has no member named 'coalesce_cnt'
    2270 |                 tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
         |                       ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2271:23: error: 'struct dma_slave_config' has no member named 'coalesce_usecs'
    2271 |                 tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
         |                       ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2272:23: error: 'struct dma_slave_config' has no member named 'coalesce_cnt'
    2272 |                 rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
         |                       ^
   drivers/net/ethernet/xilinx/xilinx_axienet_main.c:2273:23: error: 'struct dma_slave_config' has no member named 'coalesce_usecs'
    2273 |                 rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
         |                       ^


vim +1524 drivers/net/ethernet/xilinx/xilinx_axienet_main.c

  1494	
  1495	/**
  1496	 * axienet_init_dmaengine - init the dmaengine code.
  1497	 * @ndev:       Pointer to net_device structure
  1498	 *
  1499	 * Return: 0, on success.
  1500	 *          non-zero error value on failure
  1501	 *
  1502	 * This is the dmaengine initialization code.
  1503	 */
  1504	static int axienet_init_dmaengine(struct net_device *ndev)
  1505	{
  1506		struct axienet_local *lp = netdev_priv(ndev);
  1507		struct skbuf_dma_descriptor *skbuf_dma;
  1508		struct dma_slave_config tx_config, rx_config;
  1509		int i, ret;
  1510	
  1511		lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0");
  1512		if (IS_ERR(lp->tx_chan)) {
  1513			dev_err(lp->dev, "No Ethernet DMA (TX) channel found\n");
  1514			return PTR_ERR(lp->tx_chan);
  1515		}
  1516	
  1517		lp->rx_chan = dma_request_chan(lp->dev, "rx_chan0");
  1518		if (IS_ERR(lp->rx_chan)) {
  1519			ret = PTR_ERR(lp->rx_chan);
  1520			dev_err(lp->dev, "No Ethernet DMA (RX) channel found\n");
  1521			goto err_dma_release_tx;
  1522		}
  1523	
> 1524		tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> 1525		tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> 1526		rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> 1527		rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
  1528	
  1529		ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
  1530		if (ret) {
  1531			dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
  1532			goto err_dma_release_tx;
  1533		}
  1534		ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
  1535		if (ret) {
  1536			dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
  1537			goto err_dma_release_tx;
  1538		}
  1539	
  1540		lp->tx_ring_tail = 0;
  1541		lp->tx_ring_head = 0;
  1542		lp->rx_ring_tail = 0;
  1543		lp->rx_ring_head = 0;
  1544		lp->tx_skb_ring = kcalloc(TX_BD_NUM_MAX, sizeof(*lp->tx_skb_ring),
  1545					  GFP_KERNEL);
  1546		if (!lp->tx_skb_ring) {
  1547			ret = -ENOMEM;
  1548			goto err_dma_release_rx;
  1549		}
  1550		for (i = 0; i < TX_BD_NUM_MAX; i++) {
  1551			skbuf_dma = kzalloc(sizeof(*skbuf_dma), GFP_KERNEL);
  1552			if (!skbuf_dma) {
  1553				ret = -ENOMEM;
  1554				goto err_free_tx_skb_ring;
  1555			}
  1556			lp->tx_skb_ring[i] = skbuf_dma;
  1557		}
  1558	
  1559		lp->rx_skb_ring = kcalloc(RX_BUF_NUM_DEFAULT, sizeof(*lp->rx_skb_ring),
  1560					  GFP_KERNEL);
  1561		if (!lp->rx_skb_ring) {
  1562			ret = -ENOMEM;
  1563			goto err_free_tx_skb_ring;
  1564		}
  1565		for (i = 0; i < RX_BUF_NUM_DEFAULT; i++) {
  1566			skbuf_dma = kzalloc(sizeof(*skbuf_dma), GFP_KERNEL);
  1567			if (!skbuf_dma) {
  1568				ret = -ENOMEM;
  1569				goto err_free_rx_skb_ring;
  1570			}
  1571			lp->rx_skb_ring[i] = skbuf_dma;
  1572		}
  1573		/* TODO: Instead of BD_NUM_DEFAULT use runtime support */
  1574		for (i = 0; i < RX_BUF_NUM_DEFAULT; i++)
  1575			axienet_rx_submit_desc(ndev);
  1576		dma_async_issue_pending(lp->rx_chan);
  1577	
  1578		return 0;
  1579	
  1580	err_free_rx_skb_ring:
  1581		for (i = 0; i < RX_BUF_NUM_DEFAULT; i++)
  1582			kfree(lp->rx_skb_ring[i]);
  1583		kfree(lp->rx_skb_ring);
  1584	err_free_tx_skb_ring:
  1585		for (i = 0; i < TX_BD_NUM_MAX; i++)
  1586			kfree(lp->tx_skb_ring[i]);
  1587		kfree(lp->tx_skb_ring);
  1588	err_dma_release_rx:
  1589		dma_release_channel(lp->rx_chan);
  1590	err_dma_release_tx:
  1591		dma_release_channel(lp->tx_chan);
  1592		return ret;
  1593	}
  1594	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-25 10:22 [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow Suraj Gupta
  2025-05-26  0:36 ` kernel test robot
@ 2025-05-27 16:16 ` Sean Anderson
  2025-05-28 12:00   ` Gupta, Suraj
  1 sibling, 1 reply; 13+ messages in thread
From: Sean Anderson @ 2025-05-27 16:16 UTC (permalink / raw)
  To: Suraj Gupta, andrew+netdev, davem, edumazet, kuba, pabeni, vkoul,
	michal.simek, radhey.shyam.pandey, horms
  Cc: netdev, linux-arm-kernel, linux-kernel, git, harini.katakam

On 5/25/25 06:22, Suraj Gupta wrote:
> Add support to configure / report interrupt coalesce count and delay via
> ethtool in DMAEngine flow.
> Netperf numbers are not good when using non-dmaengine default values,
> so tuned coalesce count and delay and defined separate default
> values in dmaengine flow.
> 
> Netperf numbers and CPU utilisation change in DMAengine flow after
> introducing coalescing with default parameters:
> coalesce parameters:
>    Transfer type	  Before(w/o coalescing)  After(with coalescing)
> TCP Tx, CPU utilisation%	925, 27			941, 22
> TCP Rx, CPU utilisation%	607, 32			741, 36
> UDP Tx, CPU utilisation%	857, 31			960, 28
> UDP Rx, CPU utilisation%	762, 26			783, 18
> 
> Above numbers are observed with 4x Cortex-a53.

How does this affect latency? I would expect these RX settings to
increase latency around 5-10x. I only use these settings with DIM since
it will disable coalescing during periods of light load for better
latency.

(of course the way to fix this in general is RSS or some other method
involving multiple queues).

> Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> ---
> This patch depend on following AXI DMA dmengine driver changes sent to
> dmaengine mailing list as pre-requisit series:
> https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.com/ 
> ---
>  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
>  .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
>  2 files changed, 59 insertions(+)
> 
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> index 5ff742103beb..cdf6cbb6f2fd 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> @@ -126,6 +126,12 @@
>  #define XAXIDMA_DFT_TX_USEC		50
>  #define XAXIDMA_DFT_RX_USEC		16
>  
> +/* Default TX/RX Threshold and delay timer values for SGDMA mode with DMAEngine */
> +#define XAXIDMAENGINE_DFT_TX_THRESHOLD	16
> +#define XAXIDMAENGINE_DFT_TX_USEC	5
> +#define XAXIDMAENGINE_DFT_RX_THRESHOLD	24
> +#define XAXIDMAENGINE_DFT_RX_USEC	16
> +
>  #define XAXIDMA_BD_CTRL_TXSOF_MASK	0x08000000 /* First tx packet */
>  #define XAXIDMA_BD_CTRL_TXEOF_MASK	0x04000000 /* Last tx packet */
>  #define XAXIDMA_BD_CTRL_ALL_MASK	0x0C000000 /* All control bits */
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> index 1b7a653c1f4e..f9c7d90d4ecb 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  {
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	struct skbuf_dma_descriptor *skbuf_dma;
> +	struct dma_slave_config tx_config, rx_config;
>  	int i, ret;
>  
>  	lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0");
> @@ -1520,6 +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  		goto err_dma_release_tx;
>  	}
>  
> +	tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> +	tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> +	rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> +	rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;

I think it would be clearer to just do something like

	struct dma_slave_config tx_config = {
		.coalesce_cnt = 16,
		.coalesce_usecs = 5,
	};

since these are only used once. And this ensures that you initialize the
whole struct.

But what tree are you using? I don't see these members on net-next or
dmaengine.

> +	ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +	ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +
>  	lp->tx_ring_tail = 0;
>  	lp->tx_ring_head = 0;
>  	lp->rx_ring_tail = 0;
> @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device *ndev,
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	u32 cr;
>  
> +	if (lp->use_dmaengine) {
> +		struct dma_slave_caps tx_caps, rx_caps;
> +
> +		dma_get_slave_caps(lp->tx_chan, &tx_caps);
> +		dma_get_slave_caps(lp->rx_chan, &rx_caps);
> +
> +		ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> +		ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> +		ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> +		ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> +		return 0;
> +	}
> +
>  	ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
>  
>  	spin_lock_irq(&lp->rx_cr_lock);
> @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
>  		return -EINVAL;
>  	}
>  
> +	if (lp->use_dmaengine)	{
> +		struct dma_slave_config tx_cfg, rx_cfg;
> +		int ret;
> +
> +		tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> +		tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> +		rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> +		rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> +
> +		ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
> +			return ret;
> +		}
> +
> +		ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set rx coalesce parameters");
> +			return ret;
> +		}
> +		return 0;
> +	}
> +
>  	if (new_dim && !old_dim) {
>  		cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
>  				     ecoalesce->rx_coalesce_usecs);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-27 16:16 ` Sean Anderson
@ 2025-05-28 12:00   ` Gupta, Suraj
  2025-05-28 13:09     ` Subbaraya Sundeep
  2025-05-29 16:17     ` Sean Anderson
  0 siblings, 2 replies; 13+ messages in thread
From: Gupta, Suraj @ 2025-05-28 12:00 UTC (permalink / raw)
  To: Sean Anderson, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Sean Anderson <sean.anderson@linux.dev>
> Sent: Tuesday, May 27, 2025 9:47 PM
> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
> <harini.katakam@amd.com>
> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
> parameters in DMAengine flow
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On 5/25/25 06:22, Suraj Gupta wrote:
> > Add support to configure / report interrupt coalesce count and delay
> > via ethtool in DMAEngine flow.
> > Netperf numbers are not good when using non-dmaengine default values,
> > so tuned coalesce count and delay and defined separate default values
> > in dmaengine flow.
> >
> > Netperf numbers and CPU utilisation change in DMAengine flow after
> > introducing coalescing with default parameters:
> > coalesce parameters:
> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
> > TCP Tx, CPU utilisation%      925, 27                 941, 22
> > TCP Rx, CPU utilisation%      607, 32                 741, 36
> > UDP Tx, CPU utilisation%      857, 31                 960, 28
> > UDP Rx, CPU utilisation%      762, 26                 783, 18
> >
> > Above numbers are observed with 4x Cortex-a53.
>
> How does this affect latency? I would expect these RX settings to increase latency
> around 5-10x. I only use these settings with DIM since it will disable coalescing
> during periods of light load for better latency.
>
> (of course the way to fix this in general is RSS or some other method involving
> multiple queues).
>

I took values before NAPI addition in legacy flow (rx_threshold: 24, rx_usec: 50) as reference. But netperf numbers were low with them, so tried tuning both and selected the pair which gives good numbers.

> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> > ---
> > This patch depend on following AXI DMA dmengine driver changes sent to
> > dmaengine mailing list as pre-requisit series:
> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
> > com/
> > ---
> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
> >  2 files changed, 59 insertions(+)
> >
> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > index 5ff742103beb..cdf6cbb6f2fd 100644
> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > @@ -126,6 +126,12 @@
> >  #define XAXIDMA_DFT_TX_USEC          50
> >  #define XAXIDMA_DFT_RX_USEC          16
> >
> > +/* Default TX/RX Threshold and delay timer values for SGDMA mode with
> DMAEngine */
> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
> > +
> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx packet */
> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx packet */
> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits */
> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
> > net_device *ndev)  {
> >       struct axienet_local *lp = netdev_priv(ndev);
> >       struct skbuf_dma_descriptor *skbuf_dma;
> > +     struct dma_slave_config tx_config, rx_config;
> >       int i, ret;
> >
> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@ -1520,6
> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
> >               goto err_dma_release_tx;
> >       }
> >
> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
>
> I think it would be clearer to just do something like
>
>         struct dma_slave_config tx_config = {
>                 .coalesce_cnt = 16,
>                 .coalesce_usecs = 5,
>         };
>
> since these are only used once. And this ensures that you initialize the whole struct.
>
> But what tree are you using? I don't see these members on net-next or dmaengine.

These changes are proposed in separate series in dmaengine https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.com/ and I described it here below my SOB.

>
> > +     ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> > +     if (ret) {
> > +             dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> > +             goto err_dma_release_tx;
> > +     }
> > +     ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> > +     if (ret) {
> > +             dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> > +             goto err_dma_release_tx;
> > +     }
> > +
> >       lp->tx_ring_tail = 0;
> >       lp->tx_ring_head = 0;
> >       lp->rx_ring_tail = 0;
> > @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device
> *ndev,
> >       struct axienet_local *lp = netdev_priv(ndev);
> >       u32 cr;
> >
> > +     if (lp->use_dmaengine) {
> > +             struct dma_slave_caps tx_caps, rx_caps;
> > +
> > +             dma_get_slave_caps(lp->tx_chan, &tx_caps);
> > +             dma_get_slave_caps(lp->rx_chan, &rx_caps);
> > +
> > +             ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> > +             ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> > +             ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> > +             ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> > +             return 0;
> > +     }
> > +
> >       ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
> >
> >       spin_lock_irq(&lp->rx_cr_lock);
> > @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device
> *ndev,
> >               return -EINVAL;
> >       }
> >
> > +     if (lp->use_dmaengine)  {
> > +             struct dma_slave_config tx_cfg, rx_cfg;
> > +             int ret;
> > +
> > +             tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> > +             tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> > +             rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> > +             rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> > +
> > +             ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> > +             if (ret) {
> > +                     NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
> > +                     return ret;
> > +             }
> > +
> > +             ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> > +             if (ret) {
> > +                     NL_SET_ERR_MSG(extack, "failed to set rx coalesce
> parameters");
> > +                     return ret;
> > +             }
> > +             return 0;
> > +     }
> > +
> >       if (new_dim && !old_dim) {
> >               cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
> >                                    ecoalesce->rx_coalesce_usecs);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-28 12:00   ` Gupta, Suraj
@ 2025-05-28 13:09     ` Subbaraya Sundeep
  2025-05-29 16:17     ` Sean Anderson
  1 sibling, 0 replies; 13+ messages in thread
From: Subbaraya Sundeep @ 2025-05-28 13:09 UTC (permalink / raw)
  To: Gupta, Suraj
  Cc: Sean Anderson, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

On 2025-05-28 at 12:00:56, Gupta, Suraj (Suraj.Gupta2@amd.com) wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
>
Fix your mail settings. Cannot be internal if posting to mailing list :)

Thanks,
Sundeep

> > -----Original Message-----
> > From: Sean Anderson <sean.anderson@linux.dev>
> > Sent: Tuesday, May 27, 2025 9:47 PM
> > To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> > davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> > pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
> > Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
> > Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
> > kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
> > <harini.katakam@amd.com>
> > Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
> > parameters in DMAengine flow
> >
> > Caution: This message originated from an External Source. Use proper caution
> > when opening attachments, clicking links, or responding.
> >
> >
> > On 5/25/25 06:22, Suraj Gupta wrote:
> > > Add support to configure / report interrupt coalesce count and delay
> > > via ethtool in DMAEngine flow.
> > > Netperf numbers are not good when using non-dmaengine default values,
> > > so tuned coalesce count and delay and defined separate default values
> > > in dmaengine flow.
> > >
> > > Netperf numbers and CPU utilisation change in DMAengine flow after
> > > introducing coalescing with default parameters:
> > > coalesce parameters:
> > >    Transfer type        Before(w/o coalescing)  After(with coalescing)
> > > TCP Tx, CPU utilisation%      925, 27                 941, 22
> > > TCP Rx, CPU utilisation%      607, 32                 741, 36
> > > UDP Tx, CPU utilisation%      857, 31                 960, 28
> > > UDP Rx, CPU utilisation%      762, 26                 783, 18
> > >
> > > Above numbers are observed with 4x Cortex-a53.
> >
> > How does this affect latency? I would expect these RX settings to increase latency
> > around 5-10x. I only use these settings with DIM since it will disable coalescing
> > during periods of light load for better latency.
> >
> > (of course the way to fix this in general is RSS or some other method involving
> > multiple queues).
> >
> 
> I took values before NAPI addition in legacy flow (rx_threshold: 24, rx_usec: 50) as reference. But netperf numbers were low with them, so tried tuning both and selected the pair which gives good numbers.
> 
> > > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> > > ---
> > > This patch depend on following AXI DMA dmengine driver changes sent to
> > > dmaengine mailing list as pre-requisit series:
> > > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
> > > com/
> > > ---
> > >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
> > > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
> > >  2 files changed, 59 insertions(+)
> > >
> > > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > > index 5ff742103beb..cdf6cbb6f2fd 100644
> > > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > > @@ -126,6 +126,12 @@
> > >  #define XAXIDMA_DFT_TX_USEC          50
> > >  #define XAXIDMA_DFT_RX_USEC          16
> > >
> > > +/* Default TX/RX Threshold and delay timer values for SGDMA mode with
> > DMAEngine */
> > > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
> > > +#define XAXIDMAENGINE_DFT_TX_USEC    5
> > > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
> > > +#define XAXIDMAENGINE_DFT_RX_USEC    16
> > > +
> > >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx packet */
> > >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx packet */
> > >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits */
> > > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > > index 1b7a653c1f4e..f9c7d90d4ecb 100644
> > > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
> > > net_device *ndev)  {
> > >       struct axienet_local *lp = netdev_priv(ndev);
> > >       struct skbuf_dma_descriptor *skbuf_dma;
> > > +     struct dma_slave_config tx_config, rx_config;
> > >       int i, ret;
> > >
> > >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@ -1520,6
> > > +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
> > >               goto err_dma_release_tx;
> > >       }
> > >
> > > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> > > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> > > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> > > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
> >
> > I think it would be clearer to just do something like
> >
> >         struct dma_slave_config tx_config = {
> >                 .coalesce_cnt = 16,
> >                 .coalesce_usecs = 5,
> >         };
> >
> > since these are only used once. And this ensures that you initialize the whole struct.
> >
> > But what tree are you using? I don't see these members on net-next or dmaengine.
> 
> These changes are proposed in separate series in dmaengine https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.com/ and I described it here below my SOB.
> 
> >
> > > +     ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> > > +     if (ret) {
> > > +             dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> > > +             goto err_dma_release_tx;
> > > +     }
> > > +     ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> > > +     if (ret) {
> > > +             dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> > > +             goto err_dma_release_tx;
> > > +     }
> > > +
> > >       lp->tx_ring_tail = 0;
> > >       lp->tx_ring_head = 0;
> > >       lp->rx_ring_tail = 0;
> > > @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device
> > *ndev,
> > >       struct axienet_local *lp = netdev_priv(ndev);
> > >       u32 cr;
> > >
> > > +     if (lp->use_dmaengine) {
> > > +             struct dma_slave_caps tx_caps, rx_caps;
> > > +
> > > +             dma_get_slave_caps(lp->tx_chan, &tx_caps);
> > > +             dma_get_slave_caps(lp->rx_chan, &rx_caps);
> > > +
> > > +             ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> > > +             ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> > > +             ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> > > +             ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> > > +             return 0;
> > > +     }
> > > +
> > >       ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
> > >
> > >       spin_lock_irq(&lp->rx_cr_lock);
> > > @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device
> > *ndev,
> > >               return -EINVAL;
> > >       }
> > >
> > > +     if (lp->use_dmaengine)  {
> > > +             struct dma_slave_config tx_cfg, rx_cfg;
> > > +             int ret;
> > > +
> > > +             tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> > > +             tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> > > +             rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> > > +             rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> > > +
> > > +             ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> > > +             if (ret) {
> > > +                     NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
> > > +                     return ret;
> > > +             }
> > > +
> > > +             ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> > > +             if (ret) {
> > > +                     NL_SET_ERR_MSG(extack, "failed to set rx coalesce
> > parameters");
> > > +                     return ret;
> > > +             }
> > > +             return 0;
> > > +     }
> > > +
> > >       if (new_dim && !old_dim) {
> > >               cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
> > >                                    ecoalesce->rx_coalesce_usecs);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-28 12:00   ` Gupta, Suraj
  2025-05-28 13:09     ` Subbaraya Sundeep
@ 2025-05-29 16:17     ` Sean Anderson
  2025-05-29 16:29       ` Andrew Lunn
  2025-05-30 10:18       ` Gupta, Suraj
  1 sibling, 2 replies; 13+ messages in thread
From: Sean Anderson @ 2025-05-29 16:17 UTC (permalink / raw)
  To: Gupta, Suraj, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

On 5/28/25 08:00, Gupta, Suraj wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
>> -----Original Message-----
>> From: Sean Anderson <sean.anderson@linux.dev>
>> Sent: Tuesday, May 27, 2025 9:47 PM
>> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
>> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
>> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
>> <harini.katakam@amd.com>
>> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
>> parameters in DMAengine flow
>>
>> Caution: This message originated from an External Source. Use proper caution
>> when opening attachments, clicking links, or responding.
>>
>>
>> On 5/25/25 06:22, Suraj Gupta wrote:
>> > Add support to configure / report interrupt coalesce count and delay
>> > via ethtool in DMAEngine flow.
>> > Netperf numbers are not good when using non-dmaengine default values,
>> > so tuned coalesce count and delay and defined separate default values
>> > in dmaengine flow.
>> >
>> > Netperf numbers and CPU utilisation change in DMAengine flow after
>> > introducing coalescing with default parameters:
>> > coalesce parameters:
>> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
>> > TCP Tx, CPU utilisation%      925, 27                 941, 22
>> > TCP Rx, CPU utilisation%      607, 32                 741, 36
>> > UDP Tx, CPU utilisation%      857, 31                 960, 28
>> > UDP Rx, CPU utilisation%      762, 26                 783, 18
>> >
>> > Above numbers are observed with 4x Cortex-a53.
>>
>> How does this affect latency? I would expect these RX settings to increase latency
>> around 5-10x. I only use these settings with DIM since it will disable coalescing
>> during periods of light load for better latency.
>>
>> (of course the way to fix this in general is RSS or some other method involving
>> multiple queues).
>>
> 
> I took values before NAPI addition in legacy flow (rx_threshold: 24, rx_usec: 50) as reference. But netperf numbers were low with them, so tried tuning both and selected the pair which gives good numbers.

Yeah, but the reason is that you are trading latency for throughput.
There is only one queue, so when the interface is saturated you will not
get good latency anyway (since latency-sensitive packets will get
head-of-line blocked). But when activity is sparse you can good latency
if there is no coalescing. So I think coalescing should only be used
when there is a lot of traffic. Hence why I only adjusted the settings
once I implemented DIM. I think you should be able to implement it by
calling net_dim from axienet_dma_rx_cb, but it will not be as efficient
without NAPI.

Actually, if you are looking into improving performance, I think lack of
NAPI is probably the biggest limitation with the dmaengine backend.

>> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
>> > ---
>> > This patch depend on following AXI DMA dmengine driver changes sent to
>> > dmaengine mailing list as pre-requisit series:
>> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
>> > com/
>> > ---
>> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
>> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
>> >  2 files changed, 59 insertions(+)
>> >
>> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> > index 5ff742103beb..cdf6cbb6f2fd 100644
>> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> > @@ -126,6 +126,12 @@
>> >  #define XAXIDMA_DFT_TX_USEC          50
>> >  #define XAXIDMA_DFT_RX_USEC          16
>> >
>> > +/* Default TX/RX Threshold and delay timer values for SGDMA mode with
>> DMAEngine */
>> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
>> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
>> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
>> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
>> > +
>> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx packet */
>> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx packet */
>> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits */
>> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
>> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
>> > net_device *ndev)  {
>> >       struct axienet_local *lp = netdev_priv(ndev);
>> >       struct skbuf_dma_descriptor *skbuf_dma;
>> > +     struct dma_slave_config tx_config, rx_config;
>> >       int i, ret;
>> >
>> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@ -1520,6
>> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>> >               goto err_dma_release_tx;
>> >       }
>> >
>> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
>> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
>> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
>> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
>>
>> I think it would be clearer to just do something like
>>
>>         struct dma_slave_config tx_config = {
>>                 .coalesce_cnt = 16,
>>                 .coalesce_usecs = 5,
>>         };
>>
>> since these are only used once. And this ensures that you initialize the whole struct.
>>
>> But what tree are you using? I don't see these members on net-next or dmaengine.
> 
> These changes are proposed in separate series in dmaengine https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.com/ and I described it here below my SOB.

I think you should post those patches with this series to allow them to
be reviewed appropriately.

--Sean

>>
>> > +     ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
>> > +     if (ret) {
>> > +             dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
>> > +             goto err_dma_release_tx;
>> > +     }
>> > +     ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
>> > +     if (ret) {
>> > +             dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
>> > +             goto err_dma_release_tx;
>> > +     }
>> > +
>> >       lp->tx_ring_tail = 0;
>> >       lp->tx_ring_head = 0;
>> >       lp->rx_ring_tail = 0;
>> > @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device
>> *ndev,
>> >       struct axienet_local *lp = netdev_priv(ndev);
>> >       u32 cr;
>> >
>> > +     if (lp->use_dmaengine) {
>> > +             struct dma_slave_caps tx_caps, rx_caps;
>> > +
>> > +             dma_get_slave_caps(lp->tx_chan, &tx_caps);
>> > +             dma_get_slave_caps(lp->rx_chan, &rx_caps);
>> > +
>> > +             ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
>> > +             ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
>> > +             ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
>> > +             ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
>> > +             return 0;
>> > +     }
>> > +
>> >       ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
>> >
>> >       spin_lock_irq(&lp->rx_cr_lock);
>> > @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device
>> *ndev,
>> >               return -EINVAL;
>> >       }
>> >
>> > +     if (lp->use_dmaengine)  {
>> > +             struct dma_slave_config tx_cfg, rx_cfg;
>> > +             int ret;
>> > +
>> > +             tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
>> > +             tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
>> > +             rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
>> > +             rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
>> > +
>> > +             ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
>> > +             if (ret) {
>> > +                     NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
>> > +                     return ret;
>> > +             }
>> > +
>> > +             ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
>> > +             if (ret) {
>> > +                     NL_SET_ERR_MSG(extack, "failed to set rx coalesce
>> parameters");
>> > +                     return ret;
>> > +             }
>> > +             return 0;
>> > +     }
>> > +
>> >       if (new_dim && !old_dim) {
>> >               cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
>> >                                    ecoalesce->rx_coalesce_usecs);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-29 16:17     ` Sean Anderson
@ 2025-05-29 16:29       ` Andrew Lunn
  2025-05-29 16:35         ` Sean Anderson
  2025-05-30 10:18       ` Gupta, Suraj
  1 sibling, 1 reply; 13+ messages in thread
From: Andrew Lunn @ 2025-05-29 16:29 UTC (permalink / raw)
  To: Sean Anderson
  Cc: Gupta, Suraj, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

> Yeah, but the reason is that you are trading latency for throughput.
> There is only one queue, so when the interface is saturated you will not
> get good latency anyway (since latency-sensitive packets will get
> head-of-line blocked). But when activity is sparse you can good latency
> if there is no coalescing. So I think coalescing should only be used
> when there is a lot of traffic. Hence why I only adjusted the settings
> once I implemented DIM. I think you should be able to implement it by
> calling net_dim from axienet_dma_rx_cb, but it will not be as efficient
> without NAPI.
> 
> Actually, if you are looking into improving performance, I think lack of
> NAPI is probably the biggest limitation with the dmaengine backend.

It latency is the goal, especially for mixing high and low priority
traffic, having BQL implemented is also important. Does this driver
have that?

	Andrew

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-29 16:29       ` Andrew Lunn
@ 2025-05-29 16:35         ` Sean Anderson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Anderson @ 2025-05-29 16:35 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Gupta, Suraj, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

On 5/29/25 12:29, Andrew Lunn wrote:
>> Yeah, but the reason is that you are trading latency for throughput.
>> There is only one queue, so when the interface is saturated you will not
>> get good latency anyway (since latency-sensitive packets will get
>> head-of-line blocked). But when activity is sparse you can good latency
>> if there is no coalescing. So I think coalescing should only be used
>> when there is a lot of traffic. Hence why I only adjusted the settings
>> once I implemented DIM. I think you should be able to implement it by
>> calling net_dim from axienet_dma_rx_cb, but it will not be as efficient
>> without NAPI.
>> 
>> Actually, if you are looking into improving performance, I think lack of
>> NAPI is probably the biggest limitation with the dmaengine backend.
> 
> It latency is the goal, especially for mixing high and low priority
> traffic, having BQL implemented is also important. Does this driver
> have that?
> 
> 	Andrew

Yes, see commit c900e49d58eb ("net: xilinx: axienet: Implement BQL").

--Sean

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-29 16:17     ` Sean Anderson
  2025-05-29 16:29       ` Andrew Lunn
@ 2025-05-30 10:18       ` Gupta, Suraj
  2025-05-30 11:53         ` Gupta, Suraj
  2025-05-30 20:44         ` Sean Anderson
  1 sibling, 2 replies; 13+ messages in thread
From: Gupta, Suraj @ 2025-05-30 10:18 UTC (permalink / raw)
  To: Sean Anderson, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

[AMD Official Use Only - AMD Internal Distribution Only]

> -----Original Message-----
> From: Sean Anderson <sean.anderson@linux.dev>
> Sent: Thursday, May 29, 2025 9:48 PM
> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
> <harini.katakam@amd.com>
> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
> parameters in DMAengine flow
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On 5/28/25 08:00, Gupta, Suraj wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> -----Original Message-----
> >> From: Sean Anderson <sean.anderson@linux.dev>
> >> Sent: Tuesday, May 27, 2025 9:47 PM
> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
> >> Katakam, Harini <harini.katakam@amd.com>
> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
> >> report coalesce parameters in DMAengine flow
> >>
> >> Caution: This message originated from an External Source. Use proper
> >> caution when opening attachments, clicking links, or responding.
> >>
> >>
> >> On 5/25/25 06:22, Suraj Gupta wrote:
> >> > Add support to configure / report interrupt coalesce count and
> >> > delay via ethtool in DMAEngine flow.
> >> > Netperf numbers are not good when using non-dmaengine default
> >> > values, so tuned coalesce count and delay and defined separate
> >> > default values in dmaengine flow.
> >> >
> >> > Netperf numbers and CPU utilisation change in DMAengine flow after
> >> > introducing coalescing with default parameters:
> >> > coalesce parameters:
> >> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
> >> > TCP Tx, CPU utilisation%      925, 27                 941, 22
> >> > TCP Rx, CPU utilisation%      607, 32                 741, 36
> >> > UDP Tx, CPU utilisation%      857, 31                 960, 28
> >> > UDP Rx, CPU utilisation%      762, 26                 783, 18
> >> >
> >> > Above numbers are observed with 4x Cortex-a53.
> >>
> >> How does this affect latency? I would expect these RX settings to
> >> increase latency around 5-10x. I only use these settings with DIM
> >> since it will disable coalescing during periods of light load for better latency.
> >>
> >> (of course the way to fix this in general is RSS or some other method
> >> involving multiple queues).
> >>
> >
> > I took values before NAPI addition in legacy flow (rx_threshold: 24, rx_usec: 50) as
> reference. But netperf numbers were low with them, so tried tuning both and
> selected the pair which gives good numbers.
>
> Yeah, but the reason is that you are trading latency for throughput.
> There is only one queue, so when the interface is saturated you will not get good
> latency anyway (since latency-sensitive packets will get head-of-line blocked). But
> when activity is sparse you can good latency if there is no coalescing. So I think
> coalescing should only be used when there is a lot of traffic. Hence why I only
> adjusted the settings once I implemented DIM. I think you should be able to
> implement it by calling net_dim from axienet_dma_rx_cb, but it will not be as efficient
> without NAPI.
>

Ok, got it. I'll keep default values used before NAPI in legacy flow (coalesce count: 24, delay: 50) for both Tx and Rx and remove perf comparisons.

> Actually, if you are looking into improving performance, I think lack of NAPI is
> probably the biggest limitation with the dmaengine backend.
>
Yes, I agree. NAPI for DMAEngine implementation is underway and will be sent to mainline soon.

> >> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> >> > ---
> >> > This patch depend on following AXI DMA dmengine driver changes sent
> >> > to dmaengine mailing list as pre-requisit series:
> >> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
> >> > com/
> >> > ---
> >> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
> >> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53
> >> > +++++++++++++++++++
> >> >  2 files changed, 59 insertions(+)
> >> >
> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> > index 5ff742103beb..cdf6cbb6f2fd 100644
> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> > @@ -126,6 +126,12 @@
> >> >  #define XAXIDMA_DFT_TX_USEC          50
> >> >  #define XAXIDMA_DFT_RX_USEC          16
> >> >
> >> > +/* Default TX/RX Threshold and delay timer values for SGDMA mode
> >> > +with
> >> DMAEngine */
> >> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
> >> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
> >> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
> >> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
> >> > +
> >> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx packet
> */
> >> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx packet
> */
> >> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits */
> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
> >> > net_device *ndev)  {
> >> >       struct axienet_local *lp = netdev_priv(ndev);
> >> >       struct skbuf_dma_descriptor *skbuf_dma;
> >> > +     struct dma_slave_config tx_config, rx_config;
> >> >       int i, ret;
> >> >
> >> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@
> >> > -1520,6
> >> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device
> >> > +*ndev)
> >> >               goto err_dma_release_tx;
> >> >       }
> >> >
> >> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> >> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> >> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> >> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
> >>
> >> I think it would be clearer to just do something like
> >>
> >>         struct dma_slave_config tx_config = {
> >>                 .coalesce_cnt = 16,
> >>                 .coalesce_usecs = 5,
> >>         };
> >>
> >> since these are only used once. And this ensures that you initialize the whole
> struct.
> >>
> >> But what tree are you using? I don't see these members on net-next or
> dmaengine.
> >
> > These changes are proposed in separate series in dmaengine
> https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.com/ and I
> described it here below my SOB.
>
> I think you should post those patches with this series to allow them to be reviewed
> appropriately.
>
> --Sean

DMAengine series functionality depends on commit (https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git/commit/drivers/dma/xilinx?h=next&id=7e01511443c30a55a5ae78d3debd46d4d872517e) in dmaengine which is currently not there in net-next. So I sent that to dmaengine only. Please let me know if any way to send as single series.
>
> >>
> >> > +     ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> >> > +     if (ret) {
> >> > +             dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> >> > +             goto err_dma_release_tx;
> >> > +     }
> >> > +     ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> >> > +     if (ret) {
> >> > +             dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> >> > +             goto err_dma_release_tx;
> >> > +     }
> >> > +
> >> >       lp->tx_ring_tail = 0;
> >> >       lp->tx_ring_head = 0;
> >> >       lp->rx_ring_tail = 0;
> >> > @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct
> >> > net_device
> >> *ndev,
> >> >       struct axienet_local *lp = netdev_priv(ndev);
> >> >       u32 cr;
> >> >
> >> > +     if (lp->use_dmaengine) {
> >> > +             struct dma_slave_caps tx_caps, rx_caps;
> >> > +
> >> > +             dma_get_slave_caps(lp->tx_chan, &tx_caps);
> >> > +             dma_get_slave_caps(lp->rx_chan, &rx_caps);
> >> > +
> >> > +             ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> >> > +             ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> >> > +             ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> >> > +             ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> >> > +             return 0;
> >> > +     }
> >> > +
> >> >       ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
> >> >
> >> >       spin_lock_irq(&lp->rx_cr_lock); @@ -2233,6 +2263,29 @@
> >> > axienet_ethtools_set_coalesce(struct net_device
> >> *ndev,
> >> >               return -EINVAL;
> >> >       }
> >> >
> >> > +     if (lp->use_dmaengine)  {
> >> > +             struct dma_slave_config tx_cfg, rx_cfg;
> >> > +             int ret;
> >> > +
> >> > +             tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> >> > +             tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> >> > +             rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> >> > +             rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> >> > +
> >> > +             ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> >> > +             if (ret) {
> >> > +                     NL_SET_ERR_MSG(extack, "failed to set tx coalesce
> parameters");
> >> > +                     return ret;
> >> > +             }
> >> > +
> >> > +             ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> >> > +             if (ret) {
> >> > +                     NL_SET_ERR_MSG(extack, "failed to set rx
> >> > + coalesce
> >> parameters");
> >> > +                     return ret;
> >> > +             }
> >> > +             return 0;
> >> > +     }
> >> > +
> >> >       if (new_dim && !old_dim) {
> >> >               cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
> >> >                                    ecoalesce->rx_coalesce_usecs);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-30 10:18       ` Gupta, Suraj
@ 2025-05-30 11:53         ` Gupta, Suraj
  2025-05-30 20:44         ` Sean Anderson
  1 sibling, 0 replies; 13+ messages in thread
From: Gupta, Suraj @ 2025-05-30 11:53 UTC (permalink / raw)
  To: Sean Anderson, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

[Public]

> -----Original Message-----
> From: Gupta, Suraj <Suraj.Gupta2@amd.com>
> Sent: Friday, May 30, 2025 3:49 PM
> To: Sean Anderson <sean.anderson@linux.dev>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
> <harini.katakam@amd.com>
> Subject: RE: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
> parameters in DMAengine flow
>
> [AMD Official Use Only - AMD Internal Distribution Only]
>

*Modified sensitivity level to Public.

> > -----Original Message-----
> > From: Sean Anderson <sean.anderson@linux.dev>
> > Sent: Thursday, May 29, 2025 9:48 PM
> > To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> > davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> > pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
> > <michal.simek@amd.com>; Pandey, Radhey Shyam
> > <radhey.shyam.pandey@amd.com>; horms@kernel.org
> > Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
> > Katakam, Harini <harini.katakam@amd.com>
> > Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
> > report coalesce parameters in DMAengine flow
> >
> > Caution: This message originated from an External Source. Use proper
> > caution when opening attachments, clicking links, or responding.
> >
> >
> > On 5/28/25 08:00, Gupta, Suraj wrote:
> > > [AMD Official Use Only - AMD Internal Distribution Only]
> > >
> > >> -----Original Message-----
> > >> From: Sean Anderson <sean.anderson@linux.dev>
> > >> Sent: Tuesday, May 27, 2025 9:47 PM
> > >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> > >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> > >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
> > >> <michal.simek@amd.com>; Pandey, Radhey Shyam
> > >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
> > >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> > >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
> > >> Katakam, Harini <harini.katakam@amd.com>
> > >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
> > >> report coalesce parameters in DMAengine flow
> > >>
> > >> Caution: This message originated from an External Source. Use
> > >> proper caution when opening attachments, clicking links, or responding.
> > >>
> > >>
> > >> On 5/25/25 06:22, Suraj Gupta wrote:
> > >> > Add support to configure / report interrupt coalesce count and
> > >> > delay via ethtool in DMAEngine flow.
> > >> > Netperf numbers are not good when using non-dmaengine default
> > >> > values, so tuned coalesce count and delay and defined separate
> > >> > default values in dmaengine flow.
> > >> >
> > >> > Netperf numbers and CPU utilisation change in DMAengine flow
> > >> > after introducing coalescing with default parameters:
> > >> > coalesce parameters:
> > >> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
> > >> > TCP Tx, CPU utilisation%      925, 27                 941, 22
> > >> > TCP Rx, CPU utilisation%      607, 32                 741, 36
> > >> > UDP Tx, CPU utilisation%      857, 31                 960, 28
> > >> > UDP Rx, CPU utilisation%      762, 26                 783, 18
> > >> >
> > >> > Above numbers are observed with 4x Cortex-a53.
> > >>
> > >> How does this affect latency? I would expect these RX settings to
> > >> increase latency around 5-10x. I only use these settings with DIM
> > >> since it will disable coalescing during periods of light load for better latency.
> > >>
> > >> (of course the way to fix this in general is RSS or some other
> > >> method involving multiple queues).
> > >>
> > >
> > > I took values before NAPI addition in legacy flow (rx_threshold: 24,
> > > rx_usec: 50) as
> > reference. But netperf numbers were low with them, so tried tuning
> > both and selected the pair which gives good numbers.
> >
> > Yeah, but the reason is that you are trading latency for throughput.
> > There is only one queue, so when the interface is saturated you will
> > not get good latency anyway (since latency-sensitive packets will get
> > head-of-line blocked). But when activity is sparse you can good
> > latency if there is no coalescing. So I think coalescing should only
> > be used when there is a lot of traffic. Hence why I only adjusted the
> > settings once I implemented DIM. I think you should be able to
> > implement it by calling net_dim from axienet_dma_rx_cb, but it will not be as
> efficient without NAPI.
> >
>
> Ok, got it. I'll keep default values used before NAPI in legacy flow (coalesce count:
> 24, delay: 50) for both Tx and Rx and remove perf comparisons.
>
> > Actually, if you are looking into improving performance, I think lack
> > of NAPI is probably the biggest limitation with the dmaengine backend.
> >
> Yes, I agree. NAPI for DMAEngine implementation is underway and will be sent to
> mainline soon.
>
> > >> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> > >> > ---
> > >> > This patch depend on following AXI DMA dmengine driver changes
> > >> > sent to dmaengine mailing list as pre-requisit series:
> > >> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
> > >> > com/
> > >> > ---
> > >> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
> > >> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53
> > >> > +++++++++++++++++++
> > >> >  2 files changed, 59 insertions(+)
> > >> >
> > >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > >> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > >> > index 5ff742103beb..cdf6cbb6f2fd 100644
> > >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> > >> > @@ -126,6 +126,12 @@
> > >> >  #define XAXIDMA_DFT_TX_USEC          50
> > >> >  #define XAXIDMA_DFT_RX_USEC          16
> > >> >
> > >> > +/* Default TX/RX Threshold and delay timer values for SGDMA mode
> > >> > +with
> > >> DMAEngine */
> > >> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
> > >> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
> > >> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
> > >> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
> > >> > +
> > >> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx
> packet
> > */
> > >> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx
> packet
> > */
> > >> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits
> */
> > >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > >> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > >> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
> > >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> > >> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
> > >> > net_device *ndev)  {
> > >> >       struct axienet_local *lp = netdev_priv(ndev);
> > >> >       struct skbuf_dma_descriptor *skbuf_dma;
> > >> > +     struct dma_slave_config tx_config, rx_config;
> > >> >       int i, ret;
> > >> >
> > >> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@
> > >> > -1520,6
> > >> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device
> > >> > +*ndev)
> > >> >               goto err_dma_release_tx;
> > >> >       }
> > >> >
> > >> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> > >> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> > >> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> > >> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
> > >>
> > >> I think it would be clearer to just do something like
> > >>
> > >>         struct dma_slave_config tx_config = {
> > >>                 .coalesce_cnt = 16,
> > >>                 .coalesce_usecs = 5,
> > >>         };
> > >>
> > >> since these are only used once. And this ensures that you
> > >> initialize the whole
> > struct.
> > >>
> > >> But what tree are you using? I don't see these members on net-next
> > >> or
> > dmaengine.
> > >
> > > These changes are proposed in separate series in dmaengine
> > https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.
> > com/ and I described it here below my SOB.
> >
> > I think you should post those patches with this series to allow them
> > to be reviewed appropriately.
> >
> > --Sean
>
> DMAengine series functionality depends on commit
> (https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git/commit/drivers/d
> ma/xilinx?h=next&id=7e01511443c30a55a5ae78d3debd46d4d872517e) in
> dmaengine which is currently not there in net-next. So I sent that to dmaengine only.
> Please let me know if any way to send as single series.
> >
> > >>
> > >> > +     ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> > >> > +     if (ret) {
> > >> > +             dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> > >> > +             goto err_dma_release_tx;
> > >> > +     }
> > >> > +     ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> > >> > +     if (ret) {
> > >> > +             dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> > >> > +             goto err_dma_release_tx;
> > >> > +     }
> > >> > +
> > >> >       lp->tx_ring_tail = 0;
> > >> >       lp->tx_ring_head = 0;
> > >> >       lp->rx_ring_tail = 0;
> > >> > @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct
> > >> > net_device
> > >> *ndev,
> > >> >       struct axienet_local *lp = netdev_priv(ndev);
> > >> >       u32 cr;
> > >> >
> > >> > +     if (lp->use_dmaengine) {
> > >> > +             struct dma_slave_caps tx_caps, rx_caps;
> > >> > +
> > >> > +             dma_get_slave_caps(lp->tx_chan, &tx_caps);
> > >> > +             dma_get_slave_caps(lp->rx_chan, &rx_caps);
> > >> > +
> > >> > +             ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> > >> > +             ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> > >> > +             ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> > >> > +             ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> > >> > +             return 0;
> > >> > +     }
> > >> > +
> > >> >       ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
> > >> >
> > >> >       spin_lock_irq(&lp->rx_cr_lock); @@ -2233,6 +2263,29 @@
> > >> > axienet_ethtools_set_coalesce(struct net_device
> > >> *ndev,
> > >> >               return -EINVAL;
> > >> >       }
> > >> >
> > >> > +     if (lp->use_dmaengine)  {
> > >> > +             struct dma_slave_config tx_cfg, rx_cfg;
> > >> > +             int ret;
> > >> > +
> > >> > +             tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> > >> > +             tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> > >> > +             rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> > >> > +             rx_cfg.coalesce_usecs =
> > >> > + ecoalesce->rx_coalesce_usecs;
> > >> > +
> > >> > +             ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> > >> > +             if (ret) {
> > >> > +                     NL_SET_ERR_MSG(extack, "failed to set tx
> > >> > + coalesce
> > parameters");
> > >> > +                     return ret;
> > >> > +             }
> > >> > +
> > >> > +             ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> > >> > +             if (ret) {
> > >> > +                     NL_SET_ERR_MSG(extack, "failed to set rx
> > >> > + coalesce
> > >> parameters");
> > >> > +                     return ret;
> > >> > +             }
> > >> > +             return 0;
> > >> > +     }
> > >> > +
> > >> >       if (new_dim && !old_dim) {
> > >> >               cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
> > >> >                                    ecoalesce->rx_coalesce_usecs);


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-30 10:18       ` Gupta, Suraj
  2025-05-30 11:53         ` Gupta, Suraj
@ 2025-05-30 20:44         ` Sean Anderson
  2025-06-03 11:07           ` Gupta, Suraj
  1 sibling, 1 reply; 13+ messages in thread
From: Sean Anderson @ 2025-05-30 20:44 UTC (permalink / raw)
  To: Gupta, Suraj, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

On 5/30/25 06:18, Gupta, Suraj wrote:
> [AMD Official Use Only - AMD Internal Distribution Only]
> 
>> -----Original Message-----
>> From: Sean Anderson <sean.anderson@linux.dev>
>> Sent: Thursday, May 29, 2025 9:48 PM
>> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
>> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
>> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
>> <harini.katakam@amd.com>
>> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
>> parameters in DMAengine flow
>>
>> Caution: This message originated from an External Source. Use proper caution
>> when opening attachments, clicking links, or responding.
>>
>>
>> On 5/28/25 08:00, Gupta, Suraj wrote:
>> > [AMD Official Use Only - AMD Internal Distribution Only]
>> >
>> >> -----Original Message-----
>> >> From: Sean Anderson <sean.anderson@linux.dev>
>> >> Sent: Tuesday, May 27, 2025 9:47 PM
>> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
>> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
>> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
>> >> Katakam, Harini <harini.katakam@amd.com>
>> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
>> >> report coalesce parameters in DMAengine flow
>> >>
>> >> Caution: This message originated from an External Source. Use proper
>> >> caution when opening attachments, clicking links, or responding.
>> >>
>> >>
>> >> On 5/25/25 06:22, Suraj Gupta wrote:
>> >> > Add support to configure / report interrupt coalesce count and
>> >> > delay via ethtool in DMAEngine flow.
>> >> > Netperf numbers are not good when using non-dmaengine default
>> >> > values, so tuned coalesce count and delay and defined separate
>> >> > default values in dmaengine flow.
>> >> >
>> >> > Netperf numbers and CPU utilisation change in DMAengine flow after
>> >> > introducing coalescing with default parameters:
>> >> > coalesce parameters:
>> >> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
>> >> > TCP Tx, CPU utilisation%      925, 27                 941, 22
>> >> > TCP Rx, CPU utilisation%      607, 32                 741, 36
>> >> > UDP Tx, CPU utilisation%      857, 31                 960, 28
>> >> > UDP Rx, CPU utilisation%      762, 26                 783, 18
>> >> >
>> >> > Above numbers are observed with 4x Cortex-a53.
>> >>
>> >> How does this affect latency? I would expect these RX settings to
>> >> increase latency around 5-10x. I only use these settings with DIM
>> >> since it will disable coalescing during periods of light load for better latency.
>> >>
>> >> (of course the way to fix this in general is RSS or some other method
>> >> involving multiple queues).
>> >>
>> >
>> > I took values before NAPI addition in legacy flow (rx_threshold: 24, rx_usec: 50) as
>> reference. But netperf numbers were low with them, so tried tuning both and
>> selected the pair which gives good numbers.
>>
>> Yeah, but the reason is that you are trading latency for throughput.
>> There is only one queue, so when the interface is saturated you will not get good
>> latency anyway (since latency-sensitive packets will get head-of-line blocked). But
>> when activity is sparse you can good latency if there is no coalescing. So I think
>> coalescing should only be used when there is a lot of traffic. Hence why I only
>> adjusted the settings once I implemented DIM. I think you should be able to
>> implement it by calling net_dim from axienet_dma_rx_cb, but it will not be as efficient
>> without NAPI.
>>
> 
> Ok, got it. I'll keep default values used before NAPI in legacy flow (coalesce count: 24, delay: 50) for both Tx and Rx and remove perf comparisons.

Those settings are actually probably even worse for latency. I'd leave
the settings at 0/0 (coalescing disabled) to match the existing
behavior. I think the perf comparisons are helpful, especially for
people who know they are going to be throughput-limited.

My main point is that I think extending the dmaengine API to allow for
DIM will have practical benefits in reduced latency.

>> Actually, if you are looking into improving performance, I think lack of NAPI is
>> probably the biggest limitation with the dmaengine backend.
>>
> Yes, I agree. NAPI for DMAEngine implementation is underway and will be sent to mainline soon.

Looking forward to it.

>> >> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
>> >> > ---
>> >> > This patch depend on following AXI DMA dmengine driver changes sent
>> >> > to dmaengine mailing list as pre-requisit series:
>> >> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
>> >> > com/
>> >> > ---
>> >> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
>> >> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53
>> >> > +++++++++++++++++++
>> >> >  2 files changed, 59 insertions(+)
>> >> >
>> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> >> > index 5ff742103beb..cdf6cbb6f2fd 100644
>> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
>> >> > @@ -126,6 +126,12 @@
>> >> >  #define XAXIDMA_DFT_TX_USEC          50
>> >> >  #define XAXIDMA_DFT_RX_USEC          16
>> >> >
>> >> > +/* Default TX/RX Threshold and delay timer values for SGDMA mode
>> >> > +with
>> >> DMAEngine */
>> >> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
>> >> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
>> >> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
>> >> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
>> >> > +
>> >> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx packet
>> */
>> >> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx packet
>> */
>> >> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control bits */
>> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> >> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
>> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
>> >> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
>> >> > net_device *ndev)  {
>> >> >       struct axienet_local *lp = netdev_priv(ndev);
>> >> >       struct skbuf_dma_descriptor *skbuf_dma;
>> >> > +     struct dma_slave_config tx_config, rx_config;
>> >> >       int i, ret;
>> >> >
>> >> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@
>> >> > -1520,6
>> >> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device
>> >> > +*ndev)
>> >> >               goto err_dma_release_tx;
>> >> >       }
>> >> >
>> >> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
>> >> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
>> >> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
>> >> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
>> >>
>> >> I think it would be clearer to just do something like
>> >>
>> >>         struct dma_slave_config tx_config = {
>> >>                 .coalesce_cnt = 16,
>> >>                 .coalesce_usecs = 5,
>> >>         };
>> >>
>> >> since these are only used once. And this ensures that you initialize the whole
>> struct.
>> >>
>> >> But what tree are you using? I don't see these members on net-next or
>> dmaengine.
>> >
>> > These changes are proposed in separate series in dmaengine
>> https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd.com/ and I
>> described it here below my SOB.
>>
>> I think you should post those patches with this series to allow them to be reviewed
>> appropriately.
>>
>> --Sean
> 
> DMAengine series functionality depends on commit
> (https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git/commit/drivers/dma/xilinx?h=next&id=7e01511443c30a55a5ae78d3debd46d4d872517e)
> in dmaengine which is currently not there in net-next. So I sent that
> to dmaengine only. Please let me know if any way to send as single
> series.

It looks like this won't cause any conflicts, so I think you can just
send the whole series with a note in the cover letter like

| This series depends on commit 7e01511443c3 ("dmaengine: xilinx_dma:
| Set dma_device directions") currently in dmaengine/next.

--Sean

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-05-30 20:44         ` Sean Anderson
@ 2025-06-03 11:07           ` Gupta, Suraj
  2025-06-09 16:22             ` Sean Anderson
  0 siblings, 1 reply; 13+ messages in thread
From: Gupta, Suraj @ 2025-06-03 11:07 UTC (permalink / raw)
  To: Sean Anderson, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

[Public]

> -----Original Message-----
> From: Sean Anderson <sean.anderson@linux.dev>
> Sent: Saturday, May 31, 2025 2:15 AM
> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
> <harini.katakam@amd.com>
> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
> parameters in DMAengine flow
>
> Caution: This message originated from an External Source. Use proper caution
> when opening attachments, clicking links, or responding.
>
>
> On 5/30/25 06:18, Gupta, Suraj wrote:
> > [AMD Official Use Only - AMD Internal Distribution Only]
> >
> >> -----Original Message-----
> >> From: Sean Anderson <sean.anderson@linux.dev>
> >> Sent: Thursday, May 29, 2025 9:48 PM
> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
> >> Katakam, Harini <harini.katakam@amd.com>
> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
> >> report coalesce parameters in DMAengine flow
> >>
> >> Caution: This message originated from an External Source. Use proper
> >> caution when opening attachments, clicking links, or responding.
> >>
> >>
> >> On 5/28/25 08:00, Gupta, Suraj wrote:
> >> > [AMD Official Use Only - AMD Internal Distribution Only]
> >> >
> >> >> -----Original Message-----
> >> >> From: Sean Anderson <sean.anderson@linux.dev>
> >> >> Sent: Tuesday, May 27, 2025 9:47 PM
> >> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
> >> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
> >> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
> >> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
> >> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
> >> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
> >> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
> >> >> Katakam, Harini <harini.katakam@amd.com>
> >> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
> >> >> report coalesce parameters in DMAengine flow
> >> >>
> >> >> Caution: This message originated from an External Source. Use
> >> >> proper caution when opening attachments, clicking links, or responding.
> >> >>
> >> >>
> >> >> On 5/25/25 06:22, Suraj Gupta wrote:
> >> >> > Add support to configure / report interrupt coalesce count and
> >> >> > delay via ethtool in DMAEngine flow.
> >> >> > Netperf numbers are not good when using non-dmaengine default
> >> >> > values, so tuned coalesce count and delay and defined separate
> >> >> > default values in dmaengine flow.
> >> >> >
> >> >> > Netperf numbers and CPU utilisation change in DMAengine flow
> >> >> > after introducing coalescing with default parameters:
> >> >> > coalesce parameters:
> >> >> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
> >> >> > TCP Tx, CPU utilisation%      925, 27                 941, 22
> >> >> > TCP Rx, CPU utilisation%      607, 32                 741, 36
> >> >> > UDP Tx, CPU utilisation%      857, 31                 960, 28
> >> >> > UDP Rx, CPU utilisation%      762, 26                 783, 18
> >> >> >
> >> >> > Above numbers are observed with 4x Cortex-a53.
> >> >>
> >> >> How does this affect latency? I would expect these RX settings to
> >> >> increase latency around 5-10x. I only use these settings with DIM
> >> >> since it will disable coalescing during periods of light load for better latency.
> >> >>
> >> >> (of course the way to fix this in general is RSS or some other
> >> >> method involving multiple queues).
> >> >>
> >> >
> >> > I took values before NAPI addition in legacy flow (rx_threshold:
> >> > 24, rx_usec: 50) as
> >> reference. But netperf numbers were low with them, so tried tuning
> >> both and selected the pair which gives good numbers.
> >>
> >> Yeah, but the reason is that you are trading latency for throughput.
> >> There is only one queue, so when the interface is saturated you will
> >> not get good latency anyway (since latency-sensitive packets will get
> >> head-of-line blocked). But when activity is sparse you can good
> >> latency if there is no coalescing. So I think coalescing should only
> >> be used when there is a lot of traffic. Hence why I only adjusted the
> >> settings once I implemented DIM. I think you should be able to
> >> implement it by calling net_dim from axienet_dma_rx_cb, but it will not be as
> efficient without NAPI.
> >>
> >
> > Ok, got it. I'll keep default values used before NAPI in legacy flow (coalesce count:
> 24, delay: 50) for both Tx and Rx and remove perf comparisons.
>
> Those settings are actually probably even worse for latency. I'd leave the settings at
> 0/0 (coalescing disabled) to match the existing behavior. I think the perf comparisons
> are helpful, especially for people who know they are going to be throughput-limited.
>
> My main point is that I think extending the dmaengine API to allow for DIM will have
> practical benefits in reduced latency.
>
Sure, will implement DIM for both Tx and Rx in next version. However, I noticed it's implemented for Rx only in legacy flow. Is there any specific reason for that?

> >> Actually, if you are looking into improving performance, I think lack
> >> of NAPI is probably the biggest limitation with the dmaengine backend.
> >>
> > Yes, I agree. NAPI for DMAEngine implementation is underway and will be sent to
> mainline soon.
>
> Looking forward to it.
>
> >> >> > Signed-off-by: Suraj Gupta <suraj.gupta2@amd.com>
> >> >> > ---
> >> >> > This patch depend on following AXI DMA dmengine driver changes
> >> >> > sent to dmaengine mailing list as pre-requisit series:
> >> >> > https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.
> >> >> > com/
> >> >> > ---
> >> >> >  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
> >> >> > .../net/ethernet/xilinx/xilinx_axienet_main.c | 53
> >> >> > +++++++++++++++++++
> >> >> >  2 files changed, 59 insertions(+)
> >> >> >
> >> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> >> > index 5ff742103beb..cdf6cbb6f2fd 100644
> >> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> >> >> > @@ -126,6 +126,12 @@
> >> >> >  #define XAXIDMA_DFT_TX_USEC          50
> >> >> >  #define XAXIDMA_DFT_RX_USEC          16
> >> >> >
> >> >> > +/* Default TX/RX Threshold and delay timer values for SGDMA
> >> >> > +mode with
> >> >> DMAEngine */
> >> >> > +#define XAXIDMAENGINE_DFT_TX_THRESHOLD       16
> >> >> > +#define XAXIDMAENGINE_DFT_TX_USEC    5
> >> >> > +#define XAXIDMAENGINE_DFT_RX_THRESHOLD       24
> >> >> > +#define XAXIDMAENGINE_DFT_RX_USEC    16
> >> >> > +
> >> >> >  #define XAXIDMA_BD_CTRL_TXSOF_MASK   0x08000000 /* First tx
> packet
> >> */
> >> >> >  #define XAXIDMA_BD_CTRL_TXEOF_MASK   0x04000000 /* Last tx
> packet
> >> */
> >> >> >  #define XAXIDMA_BD_CTRL_ALL_MASK     0x0C000000 /* All control
> bits */
> >> >> > diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> >> > b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> >> > index 1b7a653c1f4e..f9c7d90d4ecb 100644
> >> >> > --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> >> > +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> >> >> > @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct
> >> >> > net_device *ndev)  {
> >> >> >       struct axienet_local *lp = netdev_priv(ndev);
> >> >> >       struct skbuf_dma_descriptor *skbuf_dma;
> >> >> > +     struct dma_slave_config tx_config, rx_config;
> >> >> >       int i, ret;
> >> >> >
> >> >> >       lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0"); @@
> >> >> > -1520,6
> >> >> > +1521,22 @@ static int axienet_init_dmaengine(struct net_device
> >> >> > +*ndev)
> >> >> >               goto err_dma_release_tx;
> >> >> >       }
> >> >> >
> >> >> > +     tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> >> >> > +     tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> >> >> > +     rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> >> >> > +     rx_config.coalesce_usecs =  XAXIDMAENGINE_DFT_RX_USEC;
> >> >>
> >> >> I think it would be clearer to just do something like
> >> >>
> >> >>         struct dma_slave_config tx_config = {
> >> >>                 .coalesce_cnt = 16,
> >> >>                 .coalesce_usecs = 5,
> >> >>         };
> >> >>
> >> >> since these are only used once. And this ensures that you
> >> >> initialize the whole
> >> struct.
> >> >>
> >> >> But what tree are you using? I don't see these members on net-next
> >> >> or
> >> dmaengine.
> >> >
> >> > These changes are proposed in separate series in dmaengine
> >> https://lore.kernel.org/all/20250525101617.1168991-2-suraj.gupta2@amd
> >> .com/ and I described it here below my SOB.
> >>
> >> I think you should post those patches with this series to allow them
> >> to be reviewed appropriately.
> >>
> >> --Sean
> >
> > DMAengine series functionality depends on commit
> > (https://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine.git/c
> > ommit/drivers/dma/xilinx?h=next&id=7e01511443c30a55a5ae78d3debd46d4d87
> > 2517e) in dmaengine which is currently not there in net-next. So I
> > sent that to dmaengine only. Please let me know if any way to send as
> > single series.
>
> It looks like this won't cause any conflicts, so I think you can just send the whole
> series with a note in the cover letter like
>
> | This series depends on commit 7e01511443c3 ("dmaengine: xilinx_dma:
> | Set dma_device directions") currently in dmaengine/next.
>
> --Sean

Sure


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
  2025-06-03 11:07           ` Gupta, Suraj
@ 2025-06-09 16:22             ` Sean Anderson
  0 siblings, 0 replies; 13+ messages in thread
From: Sean Anderson @ 2025-06-09 16:22 UTC (permalink / raw)
  To: Gupta, Suraj, andrew+netdev@lunn.ch, davem@davemloft.net,
	edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
	vkoul@kernel.org, Simek, Michal, Pandey, Radhey Shyam,
	horms@kernel.org
  Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, git (AMD-Xilinx), Katakam, Harini

On 6/3/25 07:07, Gupta, Suraj wrote:
> [Public]
> 
>> -----Original Message-----
>> From: Sean Anderson <sean.anderson@linux.dev>
>> Sent: Saturday, May 31, 2025 2:15 AM
>> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal <michal.simek@amd.com>;
>> Pandey, Radhey Shyam <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org; linux-
>> kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>; Katakam, Harini
>> <harini.katakam@amd.com>
>> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce
>> parameters in DMAengine flow
>>
>> Caution: This message originated from an External Source. Use proper caution
>> when opening attachments, clicking links, or responding.
>>
>>
>> On 5/30/25 06:18, Gupta, Suraj wrote:
>> > [AMD Official Use Only - AMD Internal Distribution Only]
>> >
>> >> -----Original Message-----
>> >> From: Sean Anderson <sean.anderson@linux.dev>
>> >> Sent: Thursday, May 29, 2025 9:48 PM
>> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
>> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
>> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
>> >> Katakam, Harini <harini.katakam@amd.com>
>> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
>> >> report coalesce parameters in DMAengine flow
>> >>
>> >> Caution: This message originated from an External Source. Use proper
>> >> caution when opening attachments, clicking links, or responding.
>> >>
>> >>
>> >> On 5/28/25 08:00, Gupta, Suraj wrote:
>> >> > [AMD Official Use Only - AMD Internal Distribution Only]
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Sean Anderson <sean.anderson@linux.dev>
>> >> >> Sent: Tuesday, May 27, 2025 9:47 PM
>> >> >> To: Gupta, Suraj <Suraj.Gupta2@amd.com>; andrew+netdev@lunn.ch;
>> >> >> davem@davemloft.net; edumazet@google.com; kuba@kernel.org;
>> >> >> pabeni@redhat.com; vkoul@kernel.org; Simek, Michal
>> >> >> <michal.simek@amd.com>; Pandey, Radhey Shyam
>> >> >> <radhey.shyam.pandey@amd.com>; horms@kernel.org
>> >> >> Cc: netdev@vger.kernel.org; linux-arm-kernel@lists.infradead.org;
>> >> >> linux- kernel@vger.kernel.org; git (AMD-Xilinx) <git@amd.com>;
>> >> >> Katakam, Harini <harini.katakam@amd.com>
>> >> >> Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and
>> >> >> report coalesce parameters in DMAengine flow
>> >> >>
>> >> >> Caution: This message originated from an External Source. Use
>> >> >> proper caution when opening attachments, clicking links, or responding.
>> >> >>
>> >> >>
>> >> >> On 5/25/25 06:22, Suraj Gupta wrote:
>> >> >> > Add support to configure / report interrupt coalesce count and
>> >> >> > delay via ethtool in DMAEngine flow.
>> >> >> > Netperf numbers are not good when using non-dmaengine default
>> >> >> > values, so tuned coalesce count and delay and defined separate
>> >> >> > default values in dmaengine flow.
>> >> >> >
>> >> >> > Netperf numbers and CPU utilisation change in DMAengine flow
>> >> >> > after introducing coalescing with default parameters:
>> >> >> > coalesce parameters:
>> >> >> >    Transfer type        Before(w/o coalescing)  After(with coalescing)
>> >> >> > TCP Tx, CPU utilisation%      925, 27                 941, 22
>> >> >> > TCP Rx, CPU utilisation%      607, 32                 741, 36
>> >> >> > UDP Tx, CPU utilisation%      857, 31                 960, 28
>> >> >> > UDP Rx, CPU utilisation%      762, 26                 783, 18
>> >> >> >
>> >> >> > Above numbers are observed with 4x Cortex-a53.
>> >> >>
>> >> >> How does this affect latency? I would expect these RX settings to
>> >> >> increase latency around 5-10x. I only use these settings with DIM
>> >> >> since it will disable coalescing during periods of light load for better latency.
>> >> >>
>> >> >> (of course the way to fix this in general is RSS or some other
>> >> >> method involving multiple queues).
>> >> >>
>> >> >
>> >> > I took values before NAPI addition in legacy flow (rx_threshold:
>> >> > 24, rx_usec: 50) as
>> >> reference. But netperf numbers were low with them, so tried tuning
>> >> both and selected the pair which gives good numbers.
>> >>
>> >> Yeah, but the reason is that you are trading latency for throughput.
>> >> There is only one queue, so when the interface is saturated you will
>> >> not get good latency anyway (since latency-sensitive packets will get
>> >> head-of-line blocked). But when activity is sparse you can good
>> >> latency if there is no coalescing. So I think coalescing should only
>> >> be used when there is a lot of traffic. Hence why I only adjusted the
>> >> settings once I implemented DIM. I think you should be able to
>> >> implement it by calling net_dim from axienet_dma_rx_cb, but it will not be as
>> efficient without NAPI.
>> >>
>> >
>> > Ok, got it. I'll keep default values used before NAPI in legacy flow (coalesce count:
>> 24, delay: 50) for both Tx and Rx and remove perf comparisons.
>>
>> Those settings are actually probably even worse for latency. I'd leave the settings at
>> 0/0 (coalescing disabled) to match the existing behavior. I think the perf comparisons
>> are helpful, especially for people who know they are going to be throughput-limited.
>>
>> My main point is that I think extending the dmaengine API to allow for DIM will have
>> practical benefits in reduced latency.
>>
> Sure, will implement DIM for both Tx and Rx in next version. However, I noticed it's implemented for Rx only in legacy flow. Is there any specific reason for that?

There's no latency issue with sending packets. It doesn't matter when we
process Tx completions as long as we refill the ring in time to send
more packets. So we can aggressively set the Tx coalescing for maximum
throughput.

--Sean

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-06-09 16:22 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-25 10:22 [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow Suraj Gupta
2025-05-26  0:36 ` kernel test robot
2025-05-27 16:16 ` Sean Anderson
2025-05-28 12:00   ` Gupta, Suraj
2025-05-28 13:09     ` Subbaraya Sundeep
2025-05-29 16:17     ` Sean Anderson
2025-05-29 16:29       ` Andrew Lunn
2025-05-29 16:35         ` Sean Anderson
2025-05-30 10:18       ` Gupta, Suraj
2025-05-30 11:53         ` Gupta, Suraj
2025-05-30 20:44         ` Sean Anderson
2025-06-03 11:07           ` Gupta, Suraj
2025-06-09 16:22             ` Sean Anderson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).