From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thor Thayer
Subject: Re: [PATCH net-next 7/8] net: eth: altera: tse: add msgdma prefetcher
Date: Fri, 16 Nov 2018 09:20:40 -0600
Message-ID: <358811bf-0b56-4057-c1f0-1de753c0721d@linux.intel.com>
References: <20181115005047.28464-1-dwesterg@gmail.com> <20181115005047.28464-8-dwesterg@gmail.com>
Reply-To: thor.thayer@linux.intel.com
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Dalon Westergreen
To: Dalon Westergreen , netdev@vger.kernel.org, dinguyen@kernel.org
Return-path:
Received: from mga06.intel.com ([134.134.136.31]:48261 "EHLO mga06.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728169AbeKQBa7
	(ORCPT ); Fri, 16 Nov 2018 20:30:59 -0500
In-Reply-To: <20181115005047.28464-8-dwesterg@gmail.com>
Content-Language: en-US
Sender: netdev-owner@vger.kernel.org
List-ID:

Hi Dalon,

Just a few comments/questions.

On 11/14/18 6:50 PM, Dalon Westergreen wrote:
> From: Dalon Westergreen
>
> Add support for the mSGDMA prefetcher. The prefetcher adds support
> for a linked list of descriptors in system memory. The prefetcher
> feeds these to the mSGDMA dispatcher.
>
> The prefetcher is configured to poll for the next descriptor in the
> list to be owned by hardware, then pass the descriptor to the
> dispatcher. It will then poll the next descriptor until it is
> owned by hardware.
>
> The dispatcher responses are written back to the appropriate
> descriptor, and the owned by hardware bit is cleared.
>
> The driver sets up a linked list twice the tx and rx ring sizes,
> with the last descriptor pointing back to the first. This ensures
> that the ring of descriptors will always have inactive descriptors
> preventing the prefetcher from looping over and reusing descriptors
> inappropriately. The prefetcher will continuously loop over these
> descriptors. The driver modifies descriptors as required to update
> the skb address and length as well as the owned by hardware bit.
>
> In addition to the above, the mSGDMA prefetcher can be used to
> handle rx and tx timestamps coming from the ethernet ip. These
> can be included in the prefetcher response in the descriptor.
>
> Signed-off-by: Dalon Westergreen
> ---
>  drivers/net/ethernet/altera/Makefile          |   2 +-
>  .../altera/altera_msgdma_prefetcher.c         | 433 ++++++++++++++++++
>  .../altera/altera_msgdma_prefetcher.h         |  30 ++
>  .../altera/altera_msgdmahw_prefetcher.h       |  87 ++++
>  drivers/net/ethernet/altera/altera_tse.h      |  14 +
>  drivers/net/ethernet/altera/altera_tse_main.c |  51 +++
>  6 files changed, 616 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
>  create mode 100644 drivers/net/ethernet/altera/altera_msgdma_prefetcher.h
>  create mode 100644 drivers/net/ethernet/altera/altera_msgdmahw_prefetcher.h
>
> diff --git a/drivers/net/ethernet/altera/Makefile b/drivers/net/ethernet/altera/Makefile
> index ad80be42fa26..73b32876f126 100644
> --- a/drivers/net/ethernet/altera/Makefile
> +++ b/drivers/net/ethernet/altera/Makefile
> @@ -5,4 +5,4 @@
>  obj-$(CONFIG_ALTERA_TSE) += altera_tse.o
>  altera_tse-objs := altera_tse_main.o altera_tse_ethtool.o \
>  altera_msgdma.o altera_sgdma.o altera_utils.o \
> -altera_ptp.o
> +altera_ptp.o altera_msgdma_prefetcher.o
> diff --git a/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> new file mode 100644
> index 000000000000..55b475e9e15b
> --- /dev/null
> +++ b/drivers/net/ethernet/altera/altera_msgdma_prefetcher.c
> @@ -0,0 +1,433 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* MSGDMA Prefetcher driver for Altera ethernet devices
> + *
> + * Copyright (C) 2018 Intel Corporation. All rights reserved.
> + * Author(s):
> + *   Dalon Westergreen
> + */
> +
> +#include
> +#include
> +#include
> +#include "altera_utils.h"
> +#include "altera_tse.h"
> +#include "altera_msgdma.h"
> +#include "altera_msgdmahw.h"
> +#include "altera_msgdma_prefetcher.h"
> +#include "altera_msgdmahw_prefetcher.h"

These could be alphabetized - tse and utils at the end.

> +
> +int msgdma_pref_initialize(struct altera_tse_private *priv)
> +{
> +	int i;
> +	struct msgdma_pref_extended_desc *rx_descs;
> +	struct msgdma_pref_extended_desc *tx_descs;
> +	dma_addr_t rx_descsphys;
> +	dma_addr_t tx_descsphys;
> +	u32 rx_ring_size;
> +	u32 tx_ring_size;
> +
> +	priv->pref_rxdescphys = (dma_addr_t)0;
> +	priv->pref_txdescphys = (dma_addr_t)0;
> +
> +	/* we need to allocate more pref descriptors than ringsize, for now
> +	 * just double ringsize
> +	 */
> +	rx_ring_size = priv->rx_ring_size * 2;
> +	tx_ring_size = priv->tx_ring_size * 2;
> +
> +	/* The prefetcher requires the descriptors to be aligned to the
> +	 * descriptor read/write master's data width which worst case is
> +	 * 512 bits. Currently we DO NOT CHECK THIS and only support 32-bit
> +	 * prefetcher masters.
> +	 */
> +
> +	/* allocate memory for rx descriptors */
> +	priv->pref_rxdesc =
> +		dma_zalloc_coherent(priv->device,
> +				    sizeof(struct msgdma_pref_extended_desc)
> +				    * rx_ring_size,
> +				    &priv->pref_rxdescphys, GFP_KERNEL);
> +
> +	if (!priv->pref_rxdesc)
> +		goto err_rx;
> +
> +	/* allocate memory for tx descriptors */
> +	priv->pref_txdesc =
> +		dma_zalloc_coherent(priv->device,
> +				    sizeof(struct msgdma_pref_extended_desc)
> +				    * tx_ring_size,
> +				    &priv->pref_txdescphys, GFP_KERNEL);
> +
> +	if (!priv->pref_txdesc)
> +		goto err_tx;
> +
> +	/* setup base descriptor ring for tx & rx */
> +	rx_descs = (struct msgdma_pref_extended_desc *)priv->pref_rxdesc;
> +	tx_descs = (struct msgdma_pref_extended_desc *)priv->pref_txdesc;
> +	tx_descsphys = priv->pref_txdescphys;
> +	rx_descsphys = priv->pref_rxdescphys;
> +
> +	/* setup RX descriptors */
> +	priv->pref_rx_prod = 0;
> +	for (i = 0; i < rx_ring_size; i++) {
> +		rx_descsphys = priv->pref_rxdescphys +
> +			(((i + 1) % rx_ring_size) *
> +			sizeof(struct msgdma_pref_extended_desc));
> +		rx_descs[i].next_desc_lo = lower_32_bits(rx_descsphys);
> +		rx_descs[i].next_desc_hi = upper_32_bits(rx_descsphys);
> +		rx_descs[i].stride = MSGDMA_DESC_RX_STRIDE;
> +		/* burst set to 0 so it defaults to max configured */
> +		/* set seq number to desc number */
> +		rx_descs[i].burst_seq_num = i;
> +	}
> +
> +	/* setup TX descriptors */
> +	for (i = 0; i < tx_ring_size; i++) {
> +		tx_descsphys = priv->pref_txdescphys +
> +			(((i + 1) % tx_ring_size) *
> +			sizeof(struct msgdma_pref_extended_desc));
> +		tx_descs[i].next_desc_lo = lower_32_bits(tx_descsphys);
> +		tx_descs[i].next_desc_hi = upper_32_bits(tx_descsphys);
> +		tx_descs[i].stride = MSGDMA_DESC_TX_STRIDE;
> +		/* burst set to 0 so it defaults to max configured */
> +		/* set seq number to desc number */
> +		tx_descs[i].burst_seq_num = i;
> +	}
> +
> +	if (netif_msg_ifup(priv))
> +		netdev_info(priv->dev, "%s: RX Desc mem at 0x%x\n", __func__,
> +			    priv->pref_rxdescphys);
> +
> +	if (netif_msg_ifup(priv))
> +		netdev_info(priv->dev, "%s: TX Desc mem at 0x%x\n", __func__,
> +			    priv->pref_txdescphys);
> +
> +	return 0;
> +
> +err_tx:
> +	dma_free_coherent(priv->device,
> +			  sizeof(struct msgdma_pref_extended_desc)
> +			  * rx_ring_size,
> +			  priv->pref_rxdesc, priv->pref_rxdescphys);
> +err_rx:
> +	return -ENOMEM;
> +}
> +
> +void msgdma_pref_uninitialize(struct altera_tse_private *priv)
> +{
> +	if (priv->pref_rxdesc)
> +		dma_free_coherent(priv->device,
> +				  sizeof(struct msgdma_pref_extended_desc)
> +				  * priv->rx_ring_size * 2,
> +				  priv->pref_rxdesc, priv->pref_rxdescphys);
> +
> +	if (priv->pref_txdesc)
> +		dma_free_coherent(priv->device,
> +				  sizeof(struct msgdma_pref_extended_desc)
> +				  * priv->rx_ring_size * 2,
> +				  priv->pref_txdesc, priv->pref_txdescphys);

Why does this have the ring_size*2 but the error path in
msgdma_pref_initialize() above (err_tx path) doesn't?

> +}
> +
> +void msgdma_pref_enable_txirq(struct altera_tse_private *priv)
> +{
> +	tse_set_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> +		    MSGDMA_PREF_CTL_GLOBAL_INTR);
> +}
> +
> +
> +void msgdma_pref_reset(struct altera_tse_private *priv)
> +{
> +	int counter;
> +
> +	/* turn off polling */
> +	tse_clear_bit(priv->rx_pref_csr, msgdma_pref_csroffs(control),
> +		      MSGDMA_PREF_CTL_DESC_POLL_EN);
> +	tse_clear_bit(priv->tx_pref_csr, msgdma_pref_csroffs(control),
> +		      MSGDMA_PREF_CTL_DESC_POLL_EN);
> +
> +	/* Reset the RX Prefetcher */
> +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->rx_pref_csr,
> +		msgdma_pref_csroffs(status));
> +	csrwr32(MSGDMA_PREF_CTL_RESET, priv->rx_pref_csr,
> +		msgdma_pref_csroffs(control));
> +
> +	counter = 0;
> +	while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> +		if (tse_bit_is_clear(priv->rx_pref_csr,
> +				     msgdma_pref_csroffs(control),
> +				     MSGDMA_PREF_CTL_RESET))
> +			break;
> +		udelay(1);
> +	}
> +
> +	if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> +		netif_warn(priv, drv, priv->dev,
> +			   "TSE Rx Prefetcher reset bit never cleared!\n");
> +

I take it there are no negative consequences for the reset bit not
clearing? Would it be useful to return an error?

> +	/* clear all status bits */
> +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> +		msgdma_pref_csroffs(status));
> +

This looks the same as below. Are they the same or did I miss
something?

> +	/* Reset the TX Prefetcher */
> +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> +		msgdma_pref_csroffs(status));
> +	csrwr32(MSGDMA_PREF_CTL_RESET, priv->tx_pref_csr,
> +		msgdma_pref_csroffs(control));
> +
> +	counter = 0;
> +	while (counter++ < ALTERA_TSE_SW_RESET_WATCHDOG_CNTR) {
> +		if (tse_bit_is_clear(priv->tx_pref_csr,
> +				     msgdma_pref_csroffs(control),
> +				     MSGDMA_PREF_CTL_RESET))
> +			break;
> +		udelay(1);
> +	}
> +
> +	if (counter >= ALTERA_TSE_SW_RESET_WATCHDOG_CNTR)
> +		netif_warn(priv, drv, priv->dev,
> +			   "TSE Tx Prefetcher reset bit never cleared!\n");
> +

Same as above.

> +	/* clear all status bits */
> +	csrwr32(MSGDMA_PREF_STAT_IRQ, priv->tx_pref_csr,
> +		msgdma_pref_csroffs(status));
> +
> +	/* Reset mSGDMA dispatchers*/
> +	msgdma_reset(priv);
> +}
> +
> +
> +/* Add MSGDMA Prefetcher Descriptor to descriptor list
> + *   -> This should never be called when a descriptor isn't available
> + */
> +void msgdma_pref_add_rx_desc(struct altera_tse_private *priv,
> +			     struct tse_buffer *rxbuffer)
> +{
> +	struct msgdma_pref_extended_desc *rx_descs = priv->pref_rxdesc;
> +	u32 desc_entry = priv->pref_rx_prod % (priv->rx_ring_size * 2);
> +
> +	/* write descriptor entries */
> +	rx_descs[desc_entry].len = priv->rx_dma_buf_sz;
> +	rx_descs[desc_entry].write_addr_lo = lower_32_bits(rxbuffer->dma_addr);
> +	rx_descs[desc_entry].write_addr_hi = upper_32_bits(rxbuffer->dma_addr);
> +
> +	/* set the control bits and set owned by hw */
> +	rx_descs[desc_entry].desc_control = (MSGDMA_DESC_CTL_END_ON_EOP
> +			| MSGDMA_DESC_CTL_END_ON_LEN
> +			| MSGDMA_DESC_CTL_TR_COMP_IRQ
> +			| MSGDMA_DESC_CTL_EARLY_IRQ
> +			| MSGDMA_DESC_CTL_TR_ERR_IRQ
> +			| MSGDMA_DESC_CTL_GO
> +			| MSGDMA_PREF_DESC_CTL_OWNED_BY_HW);
> +
> +	/* we need to keep a separate one for rx as RX_DESCRIPTORS are
> +	 * pre-configured at startup
> +	 */
> +	priv->pref_rx_prod++;

Can you explain more in the comment? What is "one"?

> +
> +	if (netif_msg_rx_status(priv)) {
> +		netdev_info(priv->dev, "%s: desc: %d buf: %d control 0x%x\n",
> +			    __func__, desc_entry,
> +			    priv->rx_prod % priv->rx_ring_size,
> +			    priv->pref_rxdesc[desc_entry].desc_control);
> +	}
> +}
> +

Thanks for the patches!