[QUESTION] Performance deterioration caused by commit 85f726a35e504418

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

* [QUESTION] Performance deterioration caused by commit 85f726a35e504418
@ 2021-10-18  3:23 Yang Jihong
  2021-10-18 13:37 ` Steven Rostedt
  0 siblings, 1 reply; 8+ messages in thread
From: Yang Jihong @ 2021-10-18  3:23 UTC (permalink / raw)
  To: tom.zanussi, Steven Rostedt; +Cc: linux-kernel

Hi Tom and Steven,

commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
on ARM64 machine, this commit causes performance degradation.

I test the number of instructions executed by invoking the 
trace_sched_switch function once on an arm64 machine:
1. Use memcpy, the number of instructions executed is 850.
2. Use strncpy, the number of instructions executed 1100.
That is, use strncpy is almost 250 more instructions than memcpy.

Has the impact on performance been considered in this commit? :)
What is the impact of revert the patch?

Kind regards,
Jihong

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-18  3:23 [QUESTION] Performance deterioration caused by commit 85f726a35e504418 Yang Jihong
@ 2021-10-18 13:37 ` Steven Rostedt
  2021-10-19  2:39   ` Yang Jihong
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2021-10-18 13:37 UTC (permalink / raw)
  To: Yang Jihong; +Cc: tom.zanussi, linux-kernel

On Mon, 18 Oct 2021 11:23:14 +0800
Yang Jihong <yangjihong1@huawei.com> wrote:

> Hi Tom and Steven,
> 
> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> on ARM64 machine, this commit causes performance degradation.
> 
> I test the number of instructions executed by invoking the 
> trace_sched_switch function once on an arm64 machine:
> 1. Use memcpy, the number of instructions executed is 850.
> 2. Use strncpy, the number of instructions executed 1100.
> That is, use strncpy is almost 250 more instructions than memcpy.
> 
> Has the impact on performance been considered in this commit? :)
> What is the impact of revert the patch?
>

It's a security issue. And like everything security, there's always going
to be a performance impact. Look at the performance impact due to spectre
and meltdown!

That said, although memcpy() may not be used, we don't need strncpy.
strncpy() will pad the rest of the string with nul bytes. But since the
memory the string is being recorded into is already initialized (or can be
if it isn't), we could use the faster strlcpy().

Have you tried testing it by switching strncpy() with strlcpy()?

-- Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-18 13:37 ` Steven Rostedt
@ 2021-10-19  2:39   ` Yang Jihong
  2021-10-19  2:51     ` Steven Rostedt
  0 siblings, 1 reply; 8+ messages in thread
From: Yang Jihong @ 2021-10-19  2:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: tom.zanussi, linux-kernel

Hi Steve,

On 2021/10/18 21:37, Steven Rostedt wrote:
> On Mon, 18 Oct 2021 11:23:14 +0800
> Yang Jihong <yangjihong1@huawei.com> wrote:
> 
>> Hi Tom and Steven,
>>
>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>> on ARM64 machine, this commit causes performance degradation.
>>
>> I test the number of instructions executed by invoking the
>> trace_sched_switch function once on an arm64 machine:
>> 1. Use memcpy, the number of instructions executed is 850.
>> 2. Use strncpy, the number of instructions executed 1100.
>> That is, use strncpy is almost 250 more instructions than memcpy.
>>
>> Has the impact on performance been considered in this commit? :)
>> What is the impact of revert the patch?
>>
> 
> It's a security issue. And like everything security, there's always going
> to be a performance impact. Look at the performance impact due to spectre
> and meltdown!
> 
> That said, although memcpy() may not be used, we don't need strncpy.
> strncpy() will pad the rest of the string with nul bytes. But since the
> memory the string is being recorded into is already initialized (or can be
> if it isn't), we could use the faster strlcpy().
> 
> Have you tried testing it by switching strncpy() with strlcpy()?
> 
I have tried testing it by switching strncpy() with strlcpy(), there is 
no performance improvement, probably because the strlen function is 
called in strlpy and the string is traversed each time.

Kind regards,
Jihong
> -- Steve
> .
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-19  2:39   ` Yang Jihong
@ 2021-10-19  2:51     ` Steven Rostedt
  2021-10-19 17:30       ` Zanussi, Tom
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2021-10-19  2:51 UTC (permalink / raw)
  To: Yang Jihong; +Cc: tom.zanussi, linux-kernel

On Tue, 19 Oct 2021 10:39:47 +0800
Yang Jihong <yangjihong1@huawei.com> wrote:

> Hi Steve,
> 
> On 2021/10/18 21:37, Steven Rostedt wrote:
> > On Mon, 18 Oct 2021 11:23:14 +0800
> > Yang Jihong <yangjihong1@huawei.com> wrote:
> >   
> >> Hi Tom and Steven,
> >>
> >> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> >> on ARM64 machine, this commit causes performance degradation.
> >>
> >> I test the number of instructions executed by invoking the
> >> trace_sched_switch function once on an arm64 machine:
> >> 1. Use memcpy, the number of instructions executed is 850.
> >> 2. Use strncpy, the number of instructions executed 1100.
> >> That is, use strncpy is almost 250 more instructions than memcpy.
> >>
> >> Has the impact on performance been considered in this commit? :)
> >> What is the impact of revert the patch?
> >>  
> > 
> > It's a security issue. And like everything security, there's always going
> > to be a performance impact. Look at the performance impact due to spectre
> > and meltdown!
> > 
> > That said, although memcpy() may not be used, we don't need strncpy.
> > strncpy() will pad the rest of the string with nul bytes. But since the
> > memory the string is being recorded into is already initialized (or can be
> > if it isn't), we could use the faster strlcpy().
> > 
> > Have you tried testing it by switching strncpy() with strlcpy()?
> >   
> I have tried testing it by switching strncpy() with strlcpy(), there is 
> no performance improvement, probably because the strlen function is 
> called in strlpy and the string is traversed each time.

Then there's not much we can do. Security trumps performance. Not to
mention, the garbage in the comm after the '\0' causes the histograms to
produce strange results.

Now for the saved_cmdlines, since it isn't exported directly to user space,
that one may be put back to memcpy().

Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
leaked. It looks like it is printed with the normal seq_printf() in
saved_cmdlines_show().

And it doesn't even look like the saved_cmdlines() is even initialized to
zero, so it itself could leak memory if it was exposed.

-- Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-19  2:51     ` Steven Rostedt
@ 2021-10-19 17:30       ` Zanussi, Tom
  2021-10-19 18:10         ` Steven Rostedt
  0 siblings, 1 reply; 8+ messages in thread
From: Zanussi, Tom @ 2021-10-19 17:30 UTC (permalink / raw)
  To: Steven Rostedt, Yang Jihong; +Cc: linux-kernel

Hi Steve,

On 10/18/2021 9:51 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 10:39:47 +0800
> Yang Jihong <yangjihong1@huawei.com> wrote:
> 
>> Hi Steve,
>>
>> On 2021/10/18 21:37, Steven Rostedt wrote:
>>> On Mon, 18 Oct 2021 11:23:14 +0800
>>> Yang Jihong <yangjihong1@huawei.com> wrote:
>>>    
>>>> Hi Tom and Steven,
>>>>
>>>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>>>> on ARM64 machine, this commit causes performance degradation.
>>>>
>>>> I test the number of instructions executed by invoking the
>>>> trace_sched_switch function once on an arm64 machine:
>>>> 1. Use memcpy, the number of instructions executed is 850.
>>>> 2. Use strncpy, the number of instructions executed 1100.
>>>> That is, use strncpy is almost 250 more instructions than memcpy.
>>>>
>>>> Has the impact on performance been considered in this commit? :)
>>>> What is the impact of revert the patch?
>>>>   
>>>
>>> It's a security issue. And like everything security, there's always going
>>> to be a performance impact. Look at the performance impact due to spectre
>>> and meltdown!
>>>
>>> That said, although memcpy() may not be used, we don't need strncpy.
>>> strncpy() will pad the rest of the string with nul bytes. But since the
>>> memory the string is being recorded into is already initialized (or can be
>>> if it isn't), we could use the faster strlcpy().
>>>
>>> Have you tried testing it by switching strncpy() with strlcpy()?
>>>    
>> I have tried testing it by switching strncpy() with strlcpy(), there is
>> no performance improvement, probably because the strlen function is
>> called in strlpy and the string is traversed each time.
> 
> Then there's not much we can do. Security trumps performance. Not to
> mention, the garbage in the comm after the '\0' causes the histograms to
> produce strange results.
> 
> Now for the saved_cmdlines, since it isn't exported directly to user space,
> that one may be put back to memcpy().
> 
> Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
> leaked. It looks like it is printed with the normal seq_printf() in
> saved_cmdlines_show().
> 

I don't think either of the changes in commit 85f726a35e504418 are directly related to the original problem [1] and therefore changing them back to memcpy or whatever shouldn't affect the histograms since that data is never used in keys.

Commit 85f726a35e504418 was basically a follow-on to commit 9f0bbf3115ca (tracing: Use strncpy instead of memcpy for string keys in hist triggers) and was added for completeness after examining other uses of memcpy in the tracing code (there's even a comment in there from you about possible performance hits from changing it ;-)

So anyway, as far as the histograms go, I think optimizing the two changes in 85f726a35e504418 while ignoring trailing garbage can be done without affecting the histogram correctness.

Tom

[1] https://lore.kernel.org/all/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com/

> And it doesn't even look like the saved_cmdlines() is even initialized to
> zero, so it itself could leak memory if it was exposed.
> 
> -- Steve
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-19 17:30       ` Zanussi, Tom
@ 2021-10-19 18:10         ` Steven Rostedt
  2021-10-19 18:38           ` Zanussi, Tom
  0 siblings, 1 reply; 8+ messages in thread
From: Steven Rostedt @ 2021-10-19 18:10 UTC (permalink / raw)
  To: Zanussi, Tom; +Cc: Yang Jihong, linux-kernel

On Tue, 19 Oct 2021 12:30:28 -0500
"Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:

> So anyway, as far as the histograms go, I think optimizing the two
> changes in 85f726a35e504418 while ignoring trailing garbage can be done
> without affecting the histogram correctness.

So, none of that is exported to user space?

-- Steve

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-19 18:10         ` Steven Rostedt
@ 2021-10-19 18:38           ` Zanussi, Tom
  2021-10-20  2:00             ` Yang Jihong
  0 siblings, 1 reply; 8+ messages in thread
From: Zanussi, Tom @ 2021-10-19 18:38 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Yang Jihong, linux-kernel

On 10/19/2021 1:10 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 12:30:28 -0500
> "Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:
> 
>> So anyway, as far as the histograms go, I think optimizing the two
>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>> without affecting the histogram correctness.
> 
> So, none of that is exported to user space?

I meant just that any optimization of those two things that ignored the
trailing garbage wouldn't affect the histogram keys.

But yeah I think you're correct that ignoring it in the case of
saved_cmdlines wouldn't be a problem either but it would be in the case of
max_buffer since that's exported by the ring buffer.

Tom

> 
> -- Steve
> 

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
  2021-10-19 18:38           ` Zanussi, Tom
@ 2021-10-20  2:00             ` Yang Jihong
  0 siblings, 0 replies; 8+ messages in thread
From: Yang Jihong @ 2021-10-20  2:00 UTC (permalink / raw)
  To: Zanussi, Tom, Steven Rostedt; +Cc: linux-kernel

Hi, Steve and Tom

On 2021/10/20 2:38, Zanussi, Tom wrote:
> On 10/19/2021 1:10 PM, Steven Rostedt wrote:
>> On Tue, 19 Oct 2021 12:30:28 -0500
>> "Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:
>>
>>> So anyway, as far as the histograms go, I think optimizing the two
>>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>>> without affecting the histogram correctness.
>>
>> So, none of that is exported to user space?
> 
> I meant just that any optimization of those two things that ignored the
> trailing garbage wouldn't affect the histogram keys.
> 
> But yeah I think you're correct that ignoring it in the case of
> saved_cmdlines wouldn't be a problem either but it would be in the case of
> max_buffer since that's exported by the ring buffer.

OK. Thanks very much for your patience. :)

Kind regards,
Jihong
> 
> Tom
> 
>>
>> -- Steve
>>
> .

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-10-20  2:00 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2021-10-18  3:23 [QUESTION] Performance deterioration caused by commit 85f726a35e504418 Yang Jihong
2021-10-18 13:37 ` Steven Rostedt
2021-10-19  2:39   ` Yang Jihong
2021-10-19  2:51     ` Steven Rostedt
2021-10-19 17:30       ` Zanussi, Tom
2021-10-19 18:10         ` Steven Rostedt
2021-10-19 18:38           ` Zanussi, Tom
2021-10-20  2:00             ` Yang Jihong

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox