* help with performance loss question
@ 2023-09-20 19:54 Bob Pearson
From: Bob Pearson @ 2023-09-20 19:54 UTC (permalink / raw)
To: Jason Gunthorpe, linux-rdma@vger.kernel.org
Jason,
I am trying to figure out what caused a big drop in performance in the rxe driver between
v6.5-rc5 and v6.5-rc6. The maximum performance for 'ib_send_bw -F -a' in local loopback mode
dropped from about 1.9GB/sec to 1.1GB/sec between these two tags. I have also measured the performance
of a 6.5 kernel with the 6.4 rxe driver and 6.4 infiniband/core drivers and that also shows the lower
performance so it is not something in the rdma subsystem. (In fact there were no changes in the rxe
driver from 6.5-rc5 to 6.5-rc6.)
If I type 'git log --oneline v6.5-rc6 ^v6.5-rc5' I get about 360 lines but many of them are merge sets
that can contain many patches. Is there a way to list all the patches contained between these two
tags?
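(Editor's note: git's two-dot range syntax answers this — `git log --oneline --no-merges v6.5-rc5..v6.5-rc6` lists the individual patches while skipping the merge commits themselves. A self-contained toy sketch, with a scratch repo and tags `old`/`new` standing in for the two kernel tags:)

```shell
# Build a tiny history containing one real patch plus a merge commit.
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name "You"
echo a > f && git add f && git commit -qm base && git tag old
main=$(git symbolic-ref --short HEAD)
git checkout -qb topic && echo b > g && git add g && git commit -qm patch1
git checkout -q "$main" && git merge -q --no-ff -m "merge topic" topic
git tag new

# Everything reachable from new but not old, merge commits included:
git rev-list --count old..new               # prints 2
# The individual patches only (what the question asks for):
git log --oneline --no-merges old..new      # shows just patch1
```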
Thanks,
Bob
* Re: help with performance loss question
From: Jason Gunthorpe @ 2023-09-20 20:52 UTC (permalink / raw)
To: Bob Pearson; +Cc: linux-rdma@vger.kernel.org
On Wed, Sep 20, 2023 at 02:54:42PM -0500, Bob Pearson wrote:
> Jason,
>
> I am trying to figure out what caused a big drop in performance in the rxe driver between
> v6.5-rc5 and v6.5-rc6. The maximum performance for 'ib_send_bw -F -a' in local loopback mode
> dropped from about 1.9GB/sec to 1.1GB/sec between these two tags. I have also measured the performance
> of a 6.5 kernel with the 6.4 rxe driver and 6.4 infiniband/core drivers and that also shows the lower
> performance so it is not something in the rdma subsystem. (In fact there were no changes in the rxe
> driver from 6.5-rc5 to 6.5-rc6.)
>
> If I type 'git log --oneline v6.5-rc6 ^v6.5-rc5' I get about 360 lines but many of them are merge sets
> that can contain many patches. Is there a way to list all the patches contained between these two
> tags?
I recommend you just do a git bisection, it will be more robust and
360 patches will not take many steps
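(Editor's note: the loop Jason describes can be sketched end-to-end on a toy repo. In the real tree each test step would be: build, boot, rerun `ib_send_bw -F -a`, then mark good or bad; ~360 commits takes only about nine steps since bisection halves the range each time.)

```shell
# Toy history: 8 commits, a "regression" lands at commit c5 (flag flips fast -> slow).
tmp=$(mktemp -d) && cd "$tmp" && git init -q
git config user.email you@example.com && git config user.name "You"
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 5 ]; then echo slow > flag; else echo fast > flag; fi
  git add flag && git commit -qm "c$i"
done

# Known-bad tip first, then known-good base; the real invocation would be
# 'git bisect start v6.5-rc6 v6.5-rc5'.
git bisect start HEAD HEAD~7
# Automate the good/bad decision; by hand you would run the benchmark and
# type 'git bisect good' or 'git bisect bad' at each step instead.
git bisect run sh -c 'grep -q fast flag'
bad=$(git rev-parse refs/bisect/bad)
git log -1 --format=%s "$bad"               # prints c5, the first bad commit
git bisect reset
```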
Jason
* Re: help with performance loss question
From: Bob Pearson @ 2023-09-20 23:43 UTC (permalink / raw)
To: Jason Gunthorpe; +Cc: linux-rdma@vger.kernel.org
On 9/20/23 15:52, Jason Gunthorpe wrote:
> On Wed, Sep 20, 2023 at 02:54:42PM -0500, Bob Pearson wrote:
>> Jason,
>>
>> I am trying to figure out what caused a big drop in performance in the rxe driver between
>> v6.5-rc5 and v6.5-rc6. The maximum performance for 'ib_send_bw -F -a' in local loopback mode
>> dropped from about 1.9GB/sec to 1.1GB/sec between these two tags. I have also measured the performance
>> of a 6.5 kernel with the 6.4 rxe driver and 6.4 infiniband/core drivers and that also shows the lower
>> performance so it is not something in the rdma subsystem. (In fact there were no changes in the rxe
>> driver from 6.5-rc5 to 6.5-rc6.)
>>
>> If I type 'git log --oneline v6.5-rc6 ^v6.5-rc5' I get about 360 lines but many of them are merge sets
>> that can contain many patches. Is there a way to list all the patches contained between these two
>> tags?
>
> I recommend you just do a git bisection, it will be more robust and
> 360 patches will not take many steps
>
> Jason
Thanks, I narrowed it down to the mitigation for the AMD/Inception vuln. that got added in v6.5-rc6.
It's a huge performance hit. I think there is a way to turn it off.
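(Editor's note: there is indeed a knob for this — the `spec_rstack_overflow=` boot parameter; verify the exact spelling and accepted values against your kernel's Documentation/admin-guide/kernel-parameters.txt.)

```shell
# Report the current SRSO mitigation state:
cat /sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow

# For benchmarking only (NOT production), boot with the mitigation disabled:
#     spec_rstack_overflow=off
# or disable all speculative-execution mitigations at once:
#     mitigations=off
```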
commit fb3bd914b3ec28f5fb697ac55c4846ac2d542855
Author: Borislav Petkov (AMD) <bp@alien8.de>
Date: Wed Jun 28 11:02:39 2023 +0200
x86/srso: Add a Speculative RAS Overflow mitigation
Add a mitigation for the speculative return address stack overflow
vulnerability found on AMD processors.
The mitigation works by ensuring all RET instructions speculate to
a controlled location, similar to how speculation is controlled in the
retpoline sequence. To accomplish this, the __x86_return_thunk forces
the CPU to mispredict every function return using a 'safe return'
sequence.
To ensure the safety of this mitigation, the kernel must ensure that the
safe return sequence is itself free from attacker interference. In Zen3
and Zen4, this is accomplished by creating a BTB alias between the
untraining function srso_untrain_ret_alias() and the safe return
function srso_safe_ret_alias() which results in evicting a potentially
poisoned BTB entry and using that safe one for all function returns.
In older Zen1 and Zen2, this is accomplished using a reinterpretation
technique similar to Retbleed one: srso_untrain_ret() and
srso_safe_ret().
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Apparently it requires a kernel fix for zen 1/2 but can be fixed with updated microcode
for zen 3/4. Since I am doing dev on a zen 2 (3900X) cpu, I'll replicate the perf testing
on my second system, which is a zen 3 box, to see if it is better.
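(Editor's note: for anyone replicating the loopback measurement — the device name `rxe0` and netdev `eth0` below are examples; substitute your own. The `rdma` tool is part of iproute2.)

```shell
# Create a soft-RoCE (rxe) device over an ethernet interface:
sudo rdma link add rxe0 type rxe netdev eth0

# Terminal 1: server side of the bandwidth test
ib_send_bw -F -a -d rxe0
# Terminal 2: client side, connecting over loopback
ib_send_bw -F -a -d rxe0 localhost
```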
Bob