From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
To: "Mário Kuka" <kuka@cesnet.cz>
Cc: dev@dpdk.org, orika@nvidia.com, bingz@nvidia.com, viktorin@cesnet.cz
Subject: Re: Hairpin Queues Throughput ConnectX-6
Date: Tue, 25 Jun 2024 03:22:24 +0300
Message-ID: <20240625032224.45b65339@sovereign>
In-Reply-To: <3d746dbc-330e-403f-b87f-bf495cac3437@cesnet.cz>

Hi Mário,

2024-06-19 08:45 (UTC+0200), Mário Kuka:
> Hello,
> 
> I want to use hairpin queues to forward high priority traffic (such as 
> LACP).
> My goal is to ensure that this traffic is not dropped in case the 
> software pipeline is overwhelmed.
> But during testing with dpdk-testpmd I can't achieve full throughput for 
> hairpin queues.

For maintainers: I'd like to express interest in this use case too.

> 
> The best result I have been able to achieve for 64B packets is 83 Gbps 
> in this configuration:
> $ sudo dpdk-testpmd -l 0-1 -n 4 -a 0000:17:00.0,hp_buf_log_sz=19 -- 
> --rxq=1 --txq=1 --rxd=4096 --txd=4096 --hairpinq=2
> testpmd> flow create 0 ingress pattern eth src is 00:10:94:00:00:03 /   
> end actions rss queues 1 2 end / end

Try enabling "Explicit Tx rule" mode if possible.
I was able to achieve 137 Mpps @ 64B with the following command:

dpdk-testpmd -a 21:00.0 -a c1:00.0 --in-memory -- \
    -i --rxq=1 --txq=1 --hairpinq=8 --hairpin-mode=0x10
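To compare a packet rate with a bit rate at a fixed frame size, a rough conversion can be sketched as below. It assumes the quoted frame size is the full Ethernet frame and that 20 bytes of preamble, start-of-frame delimiter and inter-frame gap are counted per frame on the wire; the function names are illustrative, and whether a given traffic generator reports rates with or without that overhead is not stated in this thread.

```python
# Assumption: per-frame wire overhead outside the frame itself is
# 7 B preamble + 1 B start-of-frame delimiter + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def mpps_to_gbps(mpps, frame_bytes, overhead=WIRE_OVERHEAD_BYTES):
    """Convert a packet rate in Mpps to a wire bit rate in Gbps."""
    bits_per_frame = (frame_bytes + overhead) * 8
    return mpps * 1e6 * bits_per_frame / 1e9


def gbps_to_mpps(gbps, frame_bytes, overhead=WIRE_OVERHEAD_BYTES):
    """Convert a wire bit rate in Gbps to a packet rate in Mpps."""
    bits_per_frame = (frame_bytes + overhead) * 8
    return gbps * 1e9 / bits_per_frame / 1e6


if __name__ == "__main__":
    # Figures mentioned in this thread, both for 64B frames.
    print(f"137 Mpps @ 64B ~ {mpps_to_gbps(137, 64):.1f} Gbps on the wire")
    print(f"83 Gbps @ 64B ~ {gbps_to_mpps(83, 64):.1f} Mpps")
```

Passing overhead=0 gives the rate counted over frame bytes only, which is the other common way such numbers are reported.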

You might get even better speed, because my flow rules were more complicated
(an RTE Flow-based "router-on-a-stick"):

flow create 0 ingress group 1 pattern eth / vlan vid is 721 / end actions of_set_vlan_vid vlan_vid 722 / rss queues 1 2 3 4 5 6 7 8 end / end
flow create 1 ingress group 1 pattern eth / vlan vid is 721 / end actions of_set_vlan_vid vlan_vid 722 / rss queues 1 2 3 4 5 6 7 8 end / end
flow create 0 ingress group 1 pattern eth / vlan vid is 722 / end actions of_set_vlan_vid vlan_vid 721 / rss queues 1 2 3 4 5 6 7 8 end / end
flow create 1 ingress group 1 pattern eth / vlan vid is 722 / end actions of_set_vlan_vid vlan_vid 721 / rss queues 1 2 3 4 5 6 7 8 end / end
flow create 0 ingress group 0 pattern end actions jump group 1 / end
flow create 1 ingress group 0 pattern end actions jump group 1 / end
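The six rules above follow a regular pattern (swap the two VLAN IDs on each port, then jump from group 0 to group 1), so they can be generated rather than typed. This is only a sketch: the helper name vlan_swap_rules is hypothetical, and it merely reproduces the rule text shown above for pasting into testpmd.

```python
def vlan_swap_rules(ports, vlan_a, vlan_b, queues):
    """Build testpmd flow commands that rewrite vlan_a <-> vlan_b on each port
    and spread matching traffic over the given (hairpin) queue indices."""
    queue_list = " ".join(str(q) for q in queues)
    rules = []
    # Group 1: one rewrite rule per direction and per port.
    for src, dst in ((vlan_a, vlan_b), (vlan_b, vlan_a)):
        for port in ports:
            rules.append(
                f"flow create {port} ingress group 1 "
                f"pattern eth / vlan vid is {src} / end "
                f"actions of_set_vlan_vid vlan_vid {dst} / "
                f"rss queues {queue_list} end / end"
            )
    # Group 0: send everything to group 1.
    for port in ports:
        rules.append(
            f"flow create {port} ingress group 0 "
            f"pattern end actions jump group 1 / end"
        )
    return rules


if __name__ == "__main__":
    # With --rxq=1 --hairpinq=8 the hairpin queues are indices 1..8.
    for rule in vlan_swap_rules(ports=(0, 1), vlan_a=721, vlan_b=722,
                                queues=range(1, 9)):
        print(rule)
```

The output can be saved to a file and loaded in testpmd instead of entering each rule interactively.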

> 
> For packets in the range 68-80B I measured even lower throughput.
> Full throughput I measured only from packets larger than 112B
> 
> For only one queue, I didn't get more than 55Gbps:
> $ sudo dpdk-testpmd -l 0-1 -n 4 -a 0000:17:00.0,hp_buf_log_sz=19 -- 
> --rxq=1 --txq=1 --rxd=4096 --txd=4096 --hairpinq=1 -i
> testpmd> flow create 0 ingress pattern eth src is 00:10:94:00:00:03 /   
> end actions queue index 1 / end
> 
> I tried to use locked device memory for TX and RX queues, but it seems 
> that this is not supported:
> "--hairpin-mode=0x011000" (bit 16 - hairpin TX queues will use locked 
> device memory, bit 12 - hairpin RX queues will use locked device memory)

RxQ pinned in device memory requires firmware configuration [1]:

mlxconfig -y -d $pci_addr set MEMIC_SIZE_LIMIT=0 HAIRPIN_DATA_BUFFER_LOCK=1
mlxfwreset -y -d $pci_addr reset

[1]: https://doc.dpdk.org/guides/platform/mlx5.html?highlight=hairpin_data_buffer_lock

However, pinned RxQ didn't improve anything for me.

TxQ pinned in device memory is not supported by net/mlx5.
TxQ pinned to DPDK memory made performance awful (predictably).
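
For reference, the --hairpin-mode values used in this thread can be decoded bit by bit. The sketch below covers only the bits named here (bit 4 for the explicit Tx rule mode, bits 12 and 16 for locked device memory); the table is deliberately partial, the helper name is hypothetical, and the testpmd documentation lists further bits.

```python
# Partial table: only the bits discussed in this thread.
HAIRPIN_MODE_BITS = {
    4: "explicit Tx flow rule",
    12: "hairpin Rx queues use locked device memory",
    16: "hairpin Tx queues use locked device memory",
}


def decode_hairpin_mode(mask):
    """Return a description for every bit set in a --hairpin-mode mask."""
    descriptions = []
    bit = 0
    while mask >> bit:
        if mask & (1 << bit):
            meaning = HAIRPIN_MODE_BITS.get(bit, "not described in this thread")
            descriptions.append(f"bit {bit}: {meaning}")
        bit += 1
    return descriptions


if __name__ == "__main__":
    for mask in (0x10, 0x011000):
        print(hex(mask))
        for line in decode_hairpin_mode(mask):
            print("  " + line)
```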

> I was expecting that achieving full throughput with hairpin queues would 
> not be a problem.
> Is my expectation too optimistic?
> 
> What other parameters besides 'hp_buf_log_sz' can I use to achieve full 
> throughput?

In my experiments, the default "hp_buf_log_sz" of 16 is optimal.
The most influential parameter appears to be the number of hairpin queues.
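
To put the two "hp_buf_log_sz" values in perspective, here is a small sketch. It assumes, as I read the mlx5 guide, that the parameter is the base-2 logarithm of the hairpin data buffer size in bytes; the function name is illustrative.

```python
def hairpin_buffer_bytes(hp_buf_log_sz):
    """Buffer size in bytes, assuming size = 2 ** hp_buf_log_sz."""
    return 1 << hp_buf_log_sz


if __name__ == "__main__":
    # 16 is the default mentioned above, 19 is the value from the original test.
    for log_sz in (16, 19):
        size = hairpin_buffer_bytes(log_sz)
        print(f"hp_buf_log_sz={log_sz}: {size} bytes ({size // 1024} KiB)")
```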

> I tried combining the following parameters: mprq_en=, rxqs_min_mprq=, 
> mprq_log_stride_num=, txq_inline_mpw=, rxq_pkt_pad_en=,
> but with no positive impact on throughput.
