From: Jesper Dangaard Brouer <brouer@redhat.com>
To: netdev@vger.kernel.org
Cc: kafai@fb.com, daniel@iogearbox.net, tom@herbertland.com,
bblanco@plumgrid.com, Jesper Dangaard Brouer <brouer@redhat.com>,
john.fastabend@gmail.com, gerlitz.or@gmail.com,
hannes@stressinduktion.org, rana.shahot@gmail.com, tgraf@suug.ch,
"David S. Miller" <davem@davemloft.net>,
as754m@att.com
Subject: [net-next PATCH RFC] mlx4: RX prefetch loop
Date: Fri, 08 Jul 2016 18:02:20 +0200
Message-ID: <20160708160135.27507.77711.stgit@firesoul>
In-Reply-To: <20160708172024.4a849b7a@redhat.com>

This patch is about prefetching without being opportunistic.
The idea is to only start prefetching on packets that are already
marked as ready/completed in the RX ring.
This is achieved by splitting the loop in the napi_poll call
mlx4_en_process_rx_cq() into two. The first loop extracts completed
CQEs and starts prefetching the packet data and RX descriptors. The
second loop processes the actual packets.
Details: The batching of CQEs is limited to 8 in order to avoid
stressing the LFB (Line Fill Buffer) and cache usage.
I've left some opportunities for prefetching CQE descriptors.
The performance improvements on my platform are huge, as I tested this
on a CPU without DDIO. The performance for XDP is the same as with
Brenden's prefetch hack.
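
To show the pattern in isolation, here is a stripped-down sketch of the
two-pass poll loop. The struct desc, rx_poll_sketch() and process_pkt()
names are made-up placeholders for illustration only; the real
implementation is the mlx4 diff below:

#define PREFETCH_BATCH 8

struct desc {
	void *data;	/* packet buffer */
	int ready;	/* completion flag, set by HW */
};

static int rx_poll_sketch(struct desc *ring, int ring_size, int head,
			  int budget, void (*process_pkt)(struct desc *))
{
	struct desc *batch[PREFETCH_BATCH];
	int done = 0;

	while (done < budget) {
		int i, n = 0;

		/* Pass 1: only touch descriptors the NIC has marked
		 * completed, and start prefetching their packet data.
		 */
		while (n < PREFETCH_BATCH && done + n < budget &&
		       ring[(head + n) % ring_size].ready) {
			batch[n] = &ring[(head + n) % ring_size];
			__builtin_prefetch(batch[n]->data);
			n++;
		}
		if (!n)
			break;

		/* Pass 2: by now the prefetched lines have had time to
		 * arrive, so packet processing mostly hits the cache.
		 */
		for (i = 0; i < n; i++)
			process_pkt(batch[i]);

		head = (head + n) % ring_size;
		done += n;
	}
	return done;
}

The point of the two passes is that the prefetches issued in the first
pass have time to complete before the second pass touches the data, and
the batch of 8 keeps the number of outstanding prefetches within what
the Line Fill Buffers can absorb.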
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 70 +++++++++++++++++++++++++---
1 file changed, 62 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 41c76fe00a7f..c5efe03e31ce 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -782,7 +782,7 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
int doorbell_pending;
struct sk_buff *skb;
int tx_index;
- int index;
+ int index, saved_index, i;
int nr;
unsigned int length;
int polled = 0;
@@ -790,6 +790,10 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
int factor = priv->cqe_factor;
u64 timestamp;
bool l2_tunnel;
+#define PREFETCH_BATCH 8
+ struct mlx4_cqe *cqe_array[PREFETCH_BATCH];
+ int cqe_idx;
+ bool cqe_more;
if (!priv->port_up)
return 0;
@@ -801,24 +805,75 @@ int mlx4_en_process_rx_cq(struct net_device *dev, struct mlx4_en_cq *cq, int bud
doorbell_pending = 0;
tx_index = (priv->tx_ring_num - priv->rsv_tx_rings) + cq->ring;
+next_prefetch_batch:
+ cqe_idx = 0;
+ cqe_more = false;
+
/* We assume a 1:1 mapping between CQEs and Rx descriptors, so Rx
* descriptor offset can be deduced from the CQE index instead of
* reading 'cqe->index' */
index = cq->mcq.cons_index & ring->size_mask;
+ saved_index = index;
cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
- /* Process all completed CQEs */
+ /* Extract and prefetch completed CQEs */
while (XNOR(cqe->owner_sr_opcode & MLX4_CQE_OWNER_MASK,
cq->mcq.cons_index & cq->size)) {
+ void *data;
frags = ring->rx_info + (index << priv->log_rx_info);
rx_desc = ring->buf + (index << ring->log_stride);
+ prefetch(rx_desc);
/*
* make sure we read the CQE after we read the ownership bit
*/
dma_rmb();
+ cqe_array[cqe_idx++] = cqe;
+
+ /* Base error handling here, free handled in next loop */
+ if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
+ MLX4_CQE_OPCODE_ERROR))
+ goto skip;
+
+ data = page_address(frags[0].page) + frags[0].page_offset;
+ prefetch(data);
+ skip:
+ ++cq->mcq.cons_index;
+ index = (cq->mcq.cons_index) & ring->size_mask;
+ cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
+ /* likely too slow prefetching CQE here ... do look-a-head ? */
+ //prefetch(cqe + priv->cqe_size * 3);
+
+ if (++polled == budget) {
+ cqe_more = false;
+ break;
+ }
+ if (cqe_idx == PREFETCH_BATCH) {
+ cqe_more = true;
+ // IDEA: Opportunistic prefetch CQEs for next_prefetch_batch?
+ //for (i = 0; i < PREFETCH_BATCH; i++) {
+ // prefetch(cqe + priv->cqe_size * i);
+ //}
+ break;
+ }
+ }
+ /* Hint: cqe_idx is the number of packets; it can be used
+ * for bulk allocating SKBs
+ */
+
+ /* From here on, index is used as the rx_desc index */
+ index = saved_index;
+
+ /* Process completed CQEs in cqe_array */
+ for (i = 0; i < cqe_idx; i++) {
+
+ cqe = cqe_array[i];
+
+ frags = ring->rx_info + (index << priv->log_rx_info);
+ rx_desc = ring->buf + (index << ring->log_stride);
+
/* Drop packet on bad receive or bad checksum */
if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
MLX4_CQE_OPCODE_ERROR)) {
@@ -1065,14 +1120,13 @@ next:
mlx4_en_free_frag(priv, frags, nr);
consumed:
- ++cq->mcq.cons_index;
- index = (cq->mcq.cons_index) & ring->size_mask;
- cqe = mlx4_en_get_cqe(cq->buf, index, priv->cqe_size) + factor;
- if (++polled == budget)
- goto out;
+ ++index;
+ index = index & ring->size_mask;
}
+ /* Check for more completed CQEs */
+ if (cqe_more)
+ goto next_prefetch_batch;
-out:
if (doorbell_pending)
mlx4_en_xmit_doorbell(priv->tx_ring[tx_index]);