Message-ID: <679d6810-9e76-425c-9d4e-d4b372928cc3@linux.dev>
Date: Tue, 27 May 2025 12:16:33 -0400
X-Mailing-List: netdev@vger.kernel.org
Subject: Re: [PATCH net-next] net: xilinx: axienet: Configure and report coalesce parameters in DMAengine flow
From: Sean Anderson
To: Suraj Gupta, andrew+netdev@lunn.ch, davem@davemloft.net,
 edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, vkoul@kernel.org,
 michal.simek@amd.com, radhey.shyam.pandey@amd.com, horms@kernel.org
Cc: netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, git@amd.com, harini.katakam@amd.com
In-Reply-To: <20250525102217.1181104-1-suraj.gupta2@amd.com>
References: <20250525102217.1181104-1-suraj.gupta2@amd.com>

On 5/25/25 06:22, Suraj Gupta wrote:
> Add support to configure / report interrupt coalesce count and delay via
> ethtool in DMAEngine flow.
> Netperf numbers are not good when using non-dmaengine default values,
> so tuned coalesce count and delay and defined separate default
> values in dmaengine flow.
>
> Netperf numbers and CPU utilisation change in DMAengine flow after
> introducing coalescing with default parameters:
> coalesce parameters:
>    Transfer type           Before(w/o coalescing)  After(with coalescing)
> TCP Tx, CPU utilisation%   925, 27                 941, 22
> TCP Rx, CPU utilisation%   607, 32                 741, 36
> UDP Tx, CPU utilisation%   857, 31                 960, 28
> UDP Rx, CPU utilisation%   762, 26                 783, 18
>
> Above numbers are observed with 4x Cortex-a53.

How does this affect latency? I would expect these RX settings to
increase latency by around 5-10x. I only use these settings with DIM, since
it will disable coalescing during periods of light load for better
latency.

(Of course, the way to fix this in general is RSS or some other method
involving multiple queues.)

> Signed-off-by: Suraj Gupta
> ---
> This patch depend on following AXI DMA dmengine driver changes sent to
> dmaengine mailing list as pre-requisit series:
> https://lore.kernel.org/all/20250525101617.1168991-1-suraj.gupta2@amd.com/
> ---
>  drivers/net/ethernet/xilinx/xilinx_axienet.h  |  6 +++
>  .../net/ethernet/xilinx/xilinx_axienet_main.c | 53 +++++++++++++++++++
>  2 files changed, 59 insertions(+)
>
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet.h b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> index 5ff742103beb..cdf6cbb6f2fd 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet.h
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet.h
> @@ -126,6 +126,12 @@
>  #define XAXIDMA_DFT_TX_USEC		50
>  #define XAXIDMA_DFT_RX_USEC		16
>
> +/* Default TX/RX Threshold and delay timer values for SGDMA mode with DMAEngine */
> +#define XAXIDMAENGINE_DFT_TX_THRESHOLD	16
> +#define XAXIDMAENGINE_DFT_TX_USEC	5
> +#define XAXIDMAENGINE_DFT_RX_THRESHOLD	24
> +#define XAXIDMAENGINE_DFT_RX_USEC	16
> +
>  #define XAXIDMA_BD_CTRL_TXSOF_MASK	0x08000000 /* First tx packet */
>  #define XAXIDMA_BD_CTRL_TXEOF_MASK	0x04000000 /* Last tx packet */
>  #define XAXIDMA_BD_CTRL_ALL_MASK	0x0C000000 /* All control bits */
> diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> index 1b7a653c1f4e..f9c7d90d4ecb 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
> @@ -1505,6 +1505,7 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  {
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	struct skbuf_dma_descriptor *skbuf_dma;
> +	struct dma_slave_config tx_config, rx_config;
>  	int i, ret;
>
>  	lp->tx_chan = dma_request_chan(lp->dev, "tx_chan0");
> @@ -1520,6 +1521,22 @@ static int axienet_init_dmaengine(struct net_device *ndev)
>  		goto err_dma_release_tx;
>  	}
>
> +	tx_config.coalesce_cnt = XAXIDMAENGINE_DFT_TX_THRESHOLD;
> +	tx_config.coalesce_usecs = XAXIDMAENGINE_DFT_TX_USEC;
> +	rx_config.coalesce_cnt = XAXIDMAENGINE_DFT_RX_THRESHOLD;
> +	rx_config.coalesce_usecs = XAXIDMAENGINE_DFT_RX_USEC;

I think it would be clearer to just do something like

	struct dma_slave_config tx_config = {
		.coalesce_cnt = 16,
		.coalesce_usecs = 5,
	};

since these are only used once. And this ensures that you initialize the
whole struct.

But what tree are you using? I don't see these members on net-next or
dmaengine.

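To illustrate the point about initializing the whole struct, here is a
minimal userspace sketch. It uses a hypothetical stand-in type
(demo_slave_config) rather than the real struct dma_slave_config, and the
coalesce_cnt / coalesce_usecs members are assumed to look the way the patch
uses them. The helper names are made up for the example. It only shows the
C rule that members not named in a designated initializer are
zero-initialized.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dma_slave_config. The coalesce_*
 * members mirror how the patch uses them; they are an assumption here. */
struct demo_slave_config {
	uint32_t direction;
	uint32_t src_addr_width;
	uint32_t dst_addr_width;
	uint32_t coalesce_cnt;
	uint32_t coalesce_usecs;
};

/* Designated initializer: every member that is not named is set to zero,
 * so no field is left holding stack garbage. */
struct demo_slave_config demo_tx_defaults(void)
{
	struct demo_slave_config cfg = {
		.coalesce_cnt = 16,
		.coalesce_usecs = 5,
	};

	return cfg;
}

/* Returns 1 if all members other than the coalesce pair are zero. */
int demo_rest_is_zero(const struct demo_slave_config *cfg)
{
	return cfg->direction == 0 &&
	       cfg->src_addr_width == 0 &&
	       cfg->dst_addr_width == 0;
}

/* Usage sketch: aborts via assert() if any expectation fails. */
void demo_check_defaults(void)
{
	struct demo_slave_config cfg = demo_tx_defaults();

	assert(cfg.coalesce_cnt == 16);
	assert(cfg.coalesce_usecs == 5);
	assert(demo_rest_is_zero(&cfg));
}
```

With member-by-member assignment on an uninitialized local, as in the
quoted hunk, the members that are not assigned hold indeterminate values
when the struct is passed on.
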
> +	ret = dmaengine_slave_config(lp->tx_chan, &tx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Tx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +	ret = dmaengine_slave_config(lp->rx_chan, &rx_config);
> +	if (ret) {
> +		dev_err(lp->dev, "Failed to configure Rx coalesce parameters\n");
> +		goto err_dma_release_tx;
> +	}
> +
>  	lp->tx_ring_tail = 0;
>  	lp->tx_ring_head = 0;
>  	lp->rx_ring_tail = 0;
> @@ -2170,6 +2187,19 @@ axienet_ethtools_get_coalesce(struct net_device *ndev,
>  	struct axienet_local *lp = netdev_priv(ndev);
>  	u32 cr;
>
> +	if (lp->use_dmaengine) {
> +		struct dma_slave_caps tx_caps, rx_caps;
> +
> +		dma_get_slave_caps(lp->tx_chan, &tx_caps);
> +		dma_get_slave_caps(lp->rx_chan, &rx_caps);
> +
> +		ecoalesce->tx_max_coalesced_frames = tx_caps.coalesce_cnt;
> +		ecoalesce->tx_coalesce_usecs = tx_caps.coalesce_usecs;
> +		ecoalesce->rx_max_coalesced_frames = rx_caps.coalesce_cnt;
> +		ecoalesce->rx_coalesce_usecs = rx_caps.coalesce_usecs;
> +		return 0;
> +	}
> +
>  	ecoalesce->use_adaptive_rx_coalesce = lp->rx_dim_enabled;
>
>  	spin_lock_irq(&lp->rx_cr_lock);
> @@ -2233,6 +2263,29 @@ axienet_ethtools_set_coalesce(struct net_device *ndev,
>  		return -EINVAL;
>  	}
>
> +	if (lp->use_dmaengine) {
> +		struct dma_slave_config tx_cfg, rx_cfg;
> +		int ret;
> +
> +		tx_cfg.coalesce_cnt = ecoalesce->tx_max_coalesced_frames;
> +		tx_cfg.coalesce_usecs = ecoalesce->tx_coalesce_usecs;
> +		rx_cfg.coalesce_cnt = ecoalesce->rx_max_coalesced_frames;
> +		rx_cfg.coalesce_usecs = ecoalesce->rx_coalesce_usecs;
> +
> +		ret = dmaengine_slave_config(lp->tx_chan, &tx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set tx coalesce parameters");
> +			return ret;
> +		}
> +
> +		ret = dmaengine_slave_config(lp->rx_chan, &rx_cfg);
> +		if (ret) {
> +			NL_SET_ERR_MSG(extack, "failed to set rx coalesce parameters");
> +			return ret;
> +		}
> +		return 0;
> +	}
> +
>  	if (new_dim && !old_dim) {
>  		cr = axienet_calc_cr(lp, axienet_dim_coalesce_count_rx(lp),
>  				     ecoalesce->rx_coalesce_usecs);