From: "Daniel Xu" <dxu@dxuuu.xyz>
To: "Alexander Lobakin" <aleksander.lobakin@intel.com>
Cc: "Jakub Kicinski" <kuba@kernel.org>,
"Lorenzo Bianconi" <lorenzo.bianconi@redhat.com>,
"Lorenzo Bianconi" <lorenzo@kernel.org>,
"bpf@vger.kernel.org" <bpf@vger.kernel.org>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>,
"Andrii Nakryiko" <andrii@kernel.org>,
"John Fastabend" <john.fastabend@gmail.com>,
"Jesper Dangaard Brouer" <hawk@kernel.org>,
"Martin KaFai Lau" <martin.lau@linux.dev>,
"David Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Paolo Abeni" <pabeni@redhat.com>,
netdev@vger.kernel.org
Subject: Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase
Date: Fri, 06 Dec 2024 15:36:08 -0800
Message-ID: <7b3bbb6f-5533-40ee-a7c5-c68ea7718fbe@app.fastmail.com>
In-Reply-To: <012d8975-13a4-4056-a6bf-f9140878cbdb@intel.com>
On Fri, Dec 6, 2024, at 7:06 AM, Alexander Lobakin wrote:
> From: Daniel Xu <dxu@dxuuu.xyz>
> Date: Thu, 5 Dec 2024 17:41:27 -0700
>
>> On Thu, Dec 05, 2024 at 12:06:29PM GMT, Alexander Lobakin wrote:
>>> From: Alexander Lobakin <aleksander.lobakin@intel.com>
>>> Date: Thu, 5 Dec 2024 11:38:11 +0100
>>>
>>>> From: Daniel Xu <dxu@dxuuu.xyz>
>>>> Date: Wed, 04 Dec 2024 13:51:08 -0800
>>>>
>>>>>
>>>>>
>>>>> On Wed, Dec 4, 2024, at 8:42 AM, Alexander Lobakin wrote:
>>>>>> From: Jakub Kicinski <kuba@kernel.org>
>>>>>> Date: Tue, 3 Dec 2024 16:51:57 -0800
>>>>>>
>>>>>>> On Tue, 3 Dec 2024 12:01:16 +0100 Alexander Lobakin wrote:
>>>>>>>>>> @ Jakub,
>>>>>>>>>
>>>>>>>>> Context? What doesn't work and why?
>>>>>>>>
>>>>>>>> My tests show the same perf as on Lorenzo's series, but I test with UDP
>>>>>>>> trafficgen. Daniel tests TCP and the results are much worse than with
>>>>>>>> Lorenzo's implementation.
>>>>>>>> I suspect this is related to how NAPI performs flushes / decides
>>>>>>>> whether to repoll again or exit vs how kthread does that (even though I
>>>>>>>> also try to flush only every 64 frames or when the ring is empty). Or
>>>>>>>> maybe to the fact that part of the kthread runs in process context
>>>>>>>> outside any softirq, while with NAPI the whole loop is inside RX softirq.
>>>>>>>>
>>>>>>>> Jesper said that he'd like to see cpumap still using own kthread, so
>>>>>>>> that its priority can be boosted separately from the backlog. That's why
>>>>>>>> we asked you whether it would be fine to have cpumap as threaded NAPI in
>>>>>>>> regards to all this :D
>>>>>>>
>>>>>>> Certainly not without a clear understanding what the problem with
>>>>>>> a kthread is.
>>>>>>
>>>>>> Yes, sure thing.
>>>>>>
>>>>>> Bad thing's that I can't reproduce Daniel's problem >_< Previously, I
>>>>>> was testing with the UDP trafficgen and got up to 80% improvement over
>>>>>> the baseline. Now I tested TCP and got up to 70% improvement, no
>>>>>> regressions whatsoever =\
>>>>>>
>>>>>> I don't know where this regression on Daniel's setup comes from. Is it
>>>>>> multi-thread or single-thread test?
>>>>>
>>>>> 8 threads with 16 flows over them (-T8 -F16)
>>>>>
>>>>>> What app do you use: iperf, netperf,
>>>>>> neper, Microsoft's app (forgot the name)?
>>>>>
>>>>> neper, tcp_stream.
>>>>
>>>> Let me recheck with neper -T8 -F16, I'll post my results soon.
>>>
>>> kernel   direct T1   direct T8F16   cpumap   cpumap T8F16
>>> clean    28          51             13       9              Gbps
>>> GRO      28          51             26       18             Gbps
>>>
>>> 100% gain, no regressions =\
>>>
>>> My XDP prog is simple (upstream xdp-tools repo with no changes):
>>>
>>> numactl -N 0 xdp-tools/xdp-bench/xdp-bench redirect-cpu -c 23 -s -p
>>> no-touch ens802f0np0
>>>
>>> IOW it simply redirects everything to CPU 23 (same NUMA node) from any
>>> Rx queue without looking into headers or packet.
>>> Do you test with more sophisticated XDP prog?
>>
>> Great reminder... my prog is a bit more sophisticated. I forgot we were
>> doing latency tracking by inserting a timestamp into frame metadata. But
>> not clearing it after it was read on remote CPU, which disables GRO. So
>> previous test was paying the penalty of fixed GRO overhead without
>> getting any packet merges.
>>
>> Once I fixed up prog to reset metadata pointer I could see the wins.
>> Went from 21621.126 Mbps -> 25546.47 Mbps for a ~18% win in tput. No
>> latency changes.
>>
>> Sorry about the churn.
>
> No problem, crap happens sometimes :)
>
> Let me send my implementation on Monday-Wednesday. I'll include my UDP
> and TCP test results, as well as yours (+18%).
>
> BTW would be great if you could give me a Tested-by tag, as I assume the
> tests were fine and it works for you?

Yep, worked great for me.

Tested-by: Daniel Xu <dxu@dxuuu.xyz>
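
For anyone re-deriving the percentages quoted in this thread, here is a minimal sketch that recomputes them from the raw throughput figures. The helper name gain_pct and the variable names are illustrative only; the numbers are copied from the quoted messages above.

```python
def gain_pct(before, after):
    """Relative throughput change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Two neper tcp_stream figures quoted above (Mbps), reported as a ~18% win.
tcp_stream_gain = gain_pct(21621.126, 25546.47)

# Figures from the quoted table (Gbps): "clean" kernel row vs "GRO" kernel row.
cpumap_gain = gain_pct(13, 26)          # "cpumap" column
cpumap_t8f16_gain = gain_pct(9, 18)     # "cpumap T8F16" column
direct_t1_gain = gain_pct(28, 28)       # "direct T1" column, unchanged

print(f"tcp_stream:   {tcp_stream_gain:.1f}%")
print(f"cpumap:       {cpumap_gain:.0f}%")
print(f"cpumap T8F16: {cpumap_t8f16_gain:.0f}%")
print(f"direct T1:    {direct_t1_gain:.0f}%")
```

Both cpumap columns double, which matches the "100% gain" remark, and the direct path is unchanged, which matches "no regressions".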