Date: Tue, 5 Feb 2019 11:35:13 +0200
From: Ilias Apalodimas
To: Ioana Ciocoi Radulescu
Cc: "netdev@vger.kernel.org", "davem@davemloft.net", Ioana Ciornei,
 "brouer@redhat.com"
Subject: Re: [PATCH net-next 1/4] dpaa2-eth: Use a single page per Rx buffer
Message-ID: <20190205093513.GA31466@apalos>
References: <1549299625-28399-1-git-send-email-ruxandra.radulescu@nxp.com>
 <1549299625-28399-2-git-send-email-ruxandra.radulescu@nxp.com>
In-Reply-To: <1549299625-28399-2-git-send-email-ruxandra.radulescu@nxp.com>

Hi Ioana,

Can you share any results on XDP? (XDP_DROP is usually a useful
indicator of what the hardware is capable of.)

> Instead of allocating page fragments via the network stack,
> use the page allocator directly. For now, we consume one page
> for each Rx buffer.
>
> With the new memory model we are free to consider adding more
> XDP support.
>
> Performance decreases slightly in some IP forwarding cases.
> No visible effect on termination traffic. The driver memory
> footprint increases as a result of this change, but it is
> still small enough to not really matter.
>
> Another side effect is that now Rx buffer alignment requirements
> are naturally satisfied without any additional actions needed.
> Remove alignment related code, except in the buffer layout
> information conveyed to MC, as hardware still needs to know the
> alignment value we guarantee.
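
If I'm reading the new model right, the alignment part comes for free
because page_address() of an order-0 page is PAGE_SIZE aligned by
construction, so any alignment requirement up to PAGE_SIZE is met
without padding. Very roughly (a sketch only, not the exact driver
code):

        void *buf;
        struct page *page;

        /* old: over-allocate a page frag, then align the pointer by hand */
        buf = napi_alloc_frag(DPAA2_ETH_SKB_SIZE + rx_buf_align);
        buf = PTR_ALIGN(buf, rx_buf_align);

        /* new: an order-0 page needs no manual alignment at all */
        page = dev_alloc_pages(0);
        buf = page_address(page);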
>
> Signed-off-by: Ioana Ciornei
> Signed-off-by: Ioana Radulescu
> ---
>  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 61 +++++++++++++-----------
>  drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h | 21 +++-----
>  2 files changed, 38 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> index 04925c7..6e58de6 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c
> @@ -86,16 +86,16 @@ static void free_rx_fd(struct dpaa2_eth_priv *priv,
>          for (i = 1; i < DPAA2_ETH_MAX_SG_ENTRIES; i++) {
>                  addr = dpaa2_sg_get_addr(&sgt[i]);
>                  sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
> -                dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> -                                 DMA_BIDIRECTIONAL);
> +                dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> +                               DMA_BIDIRECTIONAL);
>
> -                skb_free_frag(sg_vaddr);
> +                free_pages((unsigned long)sg_vaddr, 0);
>                  if (dpaa2_sg_is_final(&sgt[i]))
>                          break;
>          }
>
>  free_buf:
> -        skb_free_frag(vaddr);
> +        free_pages((unsigned long)vaddr, 0);
>  }
>
>  /* Build a linear skb based on a single-buffer frame descriptor */
> @@ -109,7 +109,7 @@ static struct sk_buff *build_linear_skb(struct dpaa2_eth_channel *ch,
>
>          ch->buf_count--;
>
> -        skb = build_skb(fd_vaddr, DPAA2_ETH_SKB_SIZE);
> +        skb = build_skb(fd_vaddr, DPAA2_ETH_RX_BUF_RAW_SIZE);
>          if (unlikely(!skb))
>                  return NULL;
>
> @@ -144,19 +144,19 @@ static struct sk_buff *build_frag_skb(struct dpaa2_eth_priv *priv,
>                  /* Get the address and length from the S/G entry */
>                  sg_addr = dpaa2_sg_get_addr(sge);
>                  sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, sg_addr);
> -                dma_unmap_single(dev, sg_addr, DPAA2_ETH_RX_BUF_SIZE,
> -                                 DMA_BIDIRECTIONAL);
> +                dma_unmap_page(dev, sg_addr, DPAA2_ETH_RX_BUF_SIZE,
> +                               DMA_BIDIRECTIONAL);
>
>                  sg_length = dpaa2_sg_get_len(sge);
>
>                  if (i == 0) {
>                          /* We build the skb around the first data buffer */
> -                        skb = build_skb(sg_vaddr, DPAA2_ETH_SKB_SIZE);
> +                        skb = build_skb(sg_vaddr, DPAA2_ETH_RX_BUF_RAW_SIZE);
>                          if (unlikely(!skb)) {
>                                  /* Free the first SG entry now, since we already
>                                   * unmapped it and obtained the virtual address
>                                   */
> -                                skb_free_frag(sg_vaddr);
> +                                free_pages((unsigned long)sg_vaddr, 0);
>
>                                  /* We still need to subtract the buffers used
>                                   * by this FD from our software counter
> @@ -211,9 +211,9 @@ static void free_bufs(struct dpaa2_eth_priv *priv, u64 *buf_array, int count)
>
>          for (i = 0; i < count; i++) {
>                  vaddr = dpaa2_iova_to_virt(priv->iommu_domain, buf_array[i]);
> -                dma_unmap_single(dev, buf_array[i], DPAA2_ETH_RX_BUF_SIZE,
> -                                 DMA_BIDIRECTIONAL);
> -                skb_free_frag(vaddr);
> +                dma_unmap_page(dev, buf_array[i], DPAA2_ETH_RX_BUF_SIZE,
> +                               DMA_BIDIRECTIONAL);
> +                free_pages((unsigned long)vaddr, 0);

I have some hardware/XDP related questions, since I have no idea how
this hardware works. From what I understand from the code, you hand
the buffers over to the hw buffer manager, and if that fails you unmap
and free the pages. Since XDP relies on recycling for speed (hint:
check xdp_return_buff()), is it possible to recycle the buffer if the
hw release fails?
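
To make the recycling question a bit more concrete, below is the kind
of path I have in mind when the release to hardware fails.
recycle_rx_page() and ch->recycle_list are hypothetical, just to
illustrate the idea:

/* Hypothetical sketch: instead of dma_unmap_page() + free_pages(),
 * keep the page mapped and park it on a per-channel list, so the next
 * buffer refill can reuse it without a fresh allocation and DMA
 * mapping. xdp_return_buff() achieves something similar through the
 * memory model registered for the rxq.
 */
static void recycle_rx_page(struct dpaa2_eth_channel *ch, void *vaddr)
{
        struct page *page = virt_to_page(vaddr);

        /* ch->recycle_list does not exist in the driver today */
        list_add(&page->lru, &ch->recycle_list);
}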
>          }
>  }
>
> @@ -378,16 +378,16 @@ static void dpaa2_eth_rx(struct dpaa2_eth_priv *priv,
>                          return;
>                  }
>
> -                dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> -                                 DMA_BIDIRECTIONAL);
> +                dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> +                               DMA_BIDIRECTIONAL);
>                  skb = build_linear_skb(ch, fd, vaddr);
>          } else if (fd_format == dpaa2_fd_sg) {
>                  WARN_ON(priv->xdp_prog);
>
> -                dma_unmap_single(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> -                                 DMA_BIDIRECTIONAL);
> +                dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
> +                               DMA_BIDIRECTIONAL);
>                  skb = build_frag_skb(priv, ch, buf_data);
> -                skb_free_frag(vaddr);
> +                free_pages((unsigned long)vaddr, 0);
>                  percpu_extras->rx_sg_frames++;
>                  percpu_extras->rx_sg_bytes += dpaa2_fd_get_len(fd);
>          } else {
> @@ -903,7 +903,7 @@ static int add_bufs(struct dpaa2_eth_priv *priv,
>  {
>          struct device *dev = priv->net_dev->dev.parent;
>          u64 buf_array[DPAA2_ETH_BUFS_PER_CMD];
> -        void *buf;
> +        struct page *page;
>          dma_addr_t addr;
>          int i, err;
>
> @@ -911,14 +911,16 @@ static int add_bufs(struct dpaa2_eth_priv *priv,
>                  /* Allocate buffer visible to WRIOP + skb shared info +
>                   * alignment padding
>                   */
> -                buf = napi_alloc_frag(dpaa2_eth_buf_raw_size(priv));
> -                if (unlikely(!buf))
> +                /* allocate one page for each Rx buffer. WRIOP sees
> +                 * the entire page except for a tailroom reserved for
> +                 * skb shared info
> +                 */
> +                page = dev_alloc_pages(0);
> +                if (!page)
>                          goto err_alloc;
>
> -                buf = PTR_ALIGN(buf, priv->rx_buf_align);
> -
> -                addr = dma_map_single(dev, buf, DPAA2_ETH_RX_BUF_SIZE,
> -                                      DMA_BIDIRECTIONAL);
> +                addr = dma_map_page(dev, page, 0, DPAA2_ETH_RX_BUF_SIZE,
> +                                    DMA_BIDIRECTIONAL);
>                  if (unlikely(dma_mapping_error(dev, addr)))
>                          goto err_map;
>
> @@ -926,7 +928,7 @@ static int add_bufs(struct dpaa2_eth_priv *priv,
>
>                  /* tracing point */
>                  trace_dpaa2_eth_buf_seed(priv->net_dev,
> -                                         buf, dpaa2_eth_buf_raw_size(priv),
> +                                         page, DPAA2_ETH_RX_BUF_RAW_SIZE,
>                                           addr, DPAA2_ETH_RX_BUF_SIZE,
>                                           bpid);
>          }
> @@ -948,7 +950,7 @@ static int add_bufs(struct dpaa2_eth_priv *priv,
>          return i;
>
>  err_map:
> -        skb_free_frag(buf);
> +        __free_pages(page, 0);
>  err_alloc:
>          /* If we managed to allocate at least some buffers,
>           * release them to hardware
> @@ -2134,6 +2136,7 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
>  {
>          struct device *dev = priv->net_dev->dev.parent;
>          struct dpni_buffer_layout buf_layout = {0};
> +        u16 rx_buf_align;
>          int err;
>
>          /* We need to check for WRIOP version 1.0.0, but depending on the MC
> @@ -2142,9 +2145,9 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
>           */
>          if (priv->dpni_attrs.wriop_version == DPAA2_WRIOP_VERSION(0, 0, 0) ||
>              priv->dpni_attrs.wriop_version == DPAA2_WRIOP_VERSION(1, 0, 0))
> -                priv->rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN_REV1;
> +                rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN_REV1;
>          else
> -                priv->rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN;
> +                rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN;
>
>          /* tx buffer */
>          buf_layout.private_data_size = DPAA2_ETH_SWA_SIZE;
> @@ -2184,7 +2187,7 @@ static int set_buffer_layout(struct dpaa2_eth_priv *priv)
>          /* rx buffer */
>          buf_layout.pass_frame_status = true;
>          buf_layout.pass_parser_result = true;
> -        buf_layout.data_align = priv->rx_buf_align;
> +        buf_layout.data_align = rx_buf_align;
>          buf_layout.data_head_room = dpaa2_eth_rx_head_room(priv);
>          buf_layout.private_data_size = 0;
>          buf_layout.options = DPNI_BUF_LAYOUT_OPT_PARSER_RESULT |
> diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
> index 31fe486..da3d039 100644
> --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
> +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.h
> @@ -63,9 +63,11 @@
>  /* Hardware requires alignment for ingress/egress buffer addresses */
>  #define DPAA2_ETH_TX_BUF_ALIGN          64
>
> -#define DPAA2_ETH_RX_BUF_SIZE           2048
> -#define DPAA2_ETH_SKB_SIZE \
> -        (DPAA2_ETH_RX_BUF_SIZE + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
> +#define DPAA2_ETH_RX_BUF_RAW_SIZE       PAGE_SIZE
> +#define DPAA2_ETH_RX_BUF_TAILROOM \
> +        SKB_DATA_ALIGN(sizeof(struct skb_shared_info))
> +#define DPAA2_ETH_RX_BUF_SIZE \
> +        (DPAA2_ETH_RX_BUF_RAW_SIZE - DPAA2_ETH_RX_BUF_TAILROOM)
>
>  /* Hardware annotation area in RX/TX buffers */
>  #define DPAA2_ETH_RX_HWA_SIZE           64
> @@ -343,7 +345,6 @@ struct dpaa2_eth_priv {
>          bool rx_tstamp; /* Rx timestamping enabled */
>
>          u16 tx_qdid;
> -        u16 rx_buf_align;
>          struct fsl_mc_io *mc_io;
>          /* Cores which have an affine DPIO/DPCON.
>           * This is the cpu set on which Rx and Tx conf frames are processed
> @@ -418,15 +419,6 @@ enum dpaa2_eth_rx_dist {
>          DPAA2_ETH_RX_DIST_CLS
>  };
>
> -/* Hardware only sees DPAA2_ETH_RX_BUF_SIZE, but the skb built around
> - * the buffer also needs space for its shared info struct, and we need
> - * to allocate enough to accommodate hardware alignment restrictions
> - */
> -static inline unsigned int dpaa2_eth_buf_raw_size(struct dpaa2_eth_priv *priv)
> -{
> -        return DPAA2_ETH_SKB_SIZE + priv->rx_buf_align;
> -}
> -
>  static inline
>  unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv,
>                                         struct sk_buff *skb)
> @@ -451,8 +443,7 @@ unsigned int dpaa2_eth_needed_headroom(struct dpaa2_eth_priv *priv,
>   */
>  static inline unsigned int dpaa2_eth_rx_head_room(struct dpaa2_eth_priv *priv)
>  {
> -        return priv->tx_data_offset + DPAA2_ETH_TX_BUF_ALIGN -
> -               DPAA2_ETH_RX_HWA_SIZE;
> +        return priv->tx_data_offset - DPAA2_ETH_RX_HWA_SIZE;
>  }
>
>  int dpaa2_eth_set_hash(struct net_device *net_dev, u64 flags);
> --
> 2.7.4
>
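
One more data point that might be worth folding into the commit
message, if my math is right. Assuming 4K pages and
sizeof(struct skb_shared_info) of roughly 320 bytes on a 64-bit build:

        DPAA2_ETH_RX_BUF_RAW_SIZE = PAGE_SIZE = 4096
        DPAA2_ETH_RX_BUF_TAILROOM = SKB_DATA_ALIGN(320) = 320
        DPAA2_ETH_RX_BUF_SIZE     = 4096 - 320 = 3776

So WRIOP now sees ~3776 usable bytes per buffer instead of the fixed
2048 it had before, at the cost of a full page per Rx buffer.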