* [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Yang Jihong @ 2021-10-18 3:23 UTC
To: tom.zanussi, Steven Rostedt; +Cc: linux-kernel

Hi Tom and Steven,

Commit 85f726a35e504418 uses strncpy instead of memcpy when copying comm.
On an ARM64 machine, this commit causes a performance degradation.

I measured the number of instructions executed by a single invocation of
the trace_sched_switch function on an arm64 machine:
1. With memcpy, the number of instructions executed is 850.
2. With strncpy, the number of instructions executed is 1100.
That is, strncpy costs about 250 more instructions than memcpy.

Was the performance impact considered in this commit? :)
What would be the impact of reverting the patch?

Kind regards,
Jihong

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Steven Rostedt @ 2021-10-18 13:37 UTC
To: Yang Jihong; +Cc: tom.zanussi, linux-kernel

On Mon, 18 Oct 2021 11:23:14 +0800
Yang Jihong <yangjihong1@huawei.com> wrote:

> Hi Tom and Steven,
>
> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> on ARM64 machine, this commit causes performance degradation.
>
> I test the number of instructions executed by invoking the
> trace_sched_switch function once on an arm64 machine:
> 1. Use memcpy, the number of instructions executed is 850.
> 2. Use strncpy, the number of instructions executed 1100.
> That is, use strncpy is almost 250 more instructions than memcpy.
>
> Has the impact on performance been considered in this commit? :)
> What is the impact of revert the patch?
>

It's a security issue. And like everything security, there's always going
to be a performance impact. Look at the performance impact due to spectre
and meltdown!

That said, although memcpy() may not be used, we don't need strncpy.
strncpy() will pad the rest of the string with nul bytes. But since the
memory the string is being recorded into is already initialized (or can be
if it isn't), we could use the faster strlcpy().

Have you tried testing it by switching strncpy() with strlcpy()?

-- Steve

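For readers following the thread, here is a minimal userspace sketch of the difference Steven describes: memcpy() copies every byte of the source buffer, including whatever sits after the first '\0', while strncpy() stops reading at the terminator and pads the rest of the destination with nul bytes. The helper names and the 16-byte COMM_LEN are assumptions made for illustration and are not taken from the commit under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 16 mirrors the kernel's TASK_COMM_LEN; the thread does not state it. */
#define COMM_LEN 16

/* Hypothetical helper: memcpy copies all COMM_LEN bytes, stale tail included. */
void record_with_memcpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, COMM_LEN);
}

/* Hypothetical helper: strncpy stops at the first '\0' and zero-fills the rest. */
void record_with_strncpy(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, COMM_LEN);
}

/* Count nonzero bytes that follow the first '\0' within COMM_LEN bytes. */
size_t stale_bytes_after_nul(const char *buf)
{
    const char *nul = memchr(buf, '\0', COMM_LEN);
    size_t count = 0;

    if (nul == NULL)
        return 0; /* no terminator inside the buffer, so nothing follows it */

    for (const char *p = nul + 1; p < buf + COMM_LEN; p++)
        if (*p != '\0')
            count++;
    return count;
}
```

If a source buffer once held a longer name and was later overwritten with a shorter one, the memcpy variant carries the leftover bytes into the record, whereas the strncpy variant does not. The padding is also where the extra work comes from: strncpy always writes the full destination width.
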
* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Yang Jihong @ 2021-10-19 2:39 UTC
To: Steven Rostedt; +Cc: tom.zanussi, linux-kernel

Hi Steve,

On 2021/10/18 21:37, Steven Rostedt wrote:
> On Mon, 18 Oct 2021 11:23:14 +0800
> Yang Jihong <yangjihong1@huawei.com> wrote:
>
>> Hi Tom and Steven,
>>
>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>> on ARM64 machine, this commit causes performance degradation.
>>
>> I test the number of instructions executed by invoking the
>> trace_sched_switch function once on an arm64 machine:
>> 1. Use memcpy, the number of instructions executed is 850.
>> 2. Use strncpy, the number of instructions executed 1100.
>> That is, use strncpy is almost 250 more instructions than memcpy.
>>
>> Has the impact on performance been considered in this commit? :)
>> What is the impact of revert the patch?
>>
>
> It's a security issue. And like everything security, there's always going
> to be a performance impact. Look at the performance impact due to spectre
> and meltdown!
>
> That said, although memcpy() may not be used, we don't need strncpy.
> strncpy() will pad the rest of the string with nul bytes. But since the
> memory the string is being recorded into is already initialized (or can be
> if it isn't), we could use the faster strlcpy().
>
> Have you tried testing it by switching strncpy() with strlcpy()?
>

I tried replacing strncpy() with strlcpy(), but there is no performance
improvement, probably because strlcpy() calls strlen() and so traverses
the string on every call.

Kind regards,
Jihong

> -- Steve

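To illustrate Jihong's explanation, the sketch below shows the general shape of a strlcpy()-style copy: measure the source with strlen() first, then copy at most size - 1 bytes and terminate. The function name sketch_strlcpy is hypothetical; this is a userspace approximation under that assumption, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical strlcpy-style helper: returns the length of src, copies at
 * most size - 1 bytes, and always terminates dst when size is nonzero.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t src_len;

    assert(dst != NULL && src != NULL);
    src_len = strlen(src); /* first pass: walks src up to its terminator */

    if (size != 0) {
        size_t n = (src_len >= size) ? size - 1 : src_len;

        memcpy(dst, src, n); /* second pass over the bytes being copied */
        dst[n] = '\0';       /* terminated, but the tail of dst is not padded */
    }
    return src_len;
}
```

The copy skips the zero padding that strncpy() performs, but the source is still scanned for its terminator before anything is copied, which is consistent with Jihong seeing no improvement.
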
* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Steven Rostedt @ 2021-10-19 2:51 UTC
To: Yang Jihong; +Cc: tom.zanussi, linux-kernel

On Tue, 19 Oct 2021 10:39:47 +0800
Yang Jihong <yangjihong1@huawei.com> wrote:

> Hi Steve,
>
> On 2021/10/18 21:37, Steven Rostedt wrote:
> > On Mon, 18 Oct 2021 11:23:14 +0800
> > Yang Jihong <yangjihong1@huawei.com> wrote:
> >
> >> Hi Tom and Steven,
> >>
> >> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
> >> on ARM64 machine, this commit causes performance degradation.
> >>
> >> I test the number of instructions executed by invoking the
> >> trace_sched_switch function once on an arm64 machine:
> >> 1. Use memcpy, the number of instructions executed is 850.
> >> 2. Use strncpy, the number of instructions executed 1100.
> >> That is, use strncpy is almost 250 more instructions than memcpy.
> >>
> >> Has the impact on performance been considered in this commit? :)
> >> What is the impact of revert the patch?
> >>
> >
> > It's a security issue. And like everything security, there's always going
> > to be a performance impact. Look at the performance impact due to spectre
> > and meltdown!
> >
> > That said, although memcpy() may not be used, we don't need strncpy.
> > strncpy() will pad the rest of the string with nul bytes. But since the
> > memory the string is being recorded into is already initialized (or can be
> > if it isn't), we could use the faster strlcpy().
> >
> > Have you tried testing it by switching strncpy() with strlcpy()?
> >
> I have tried testing it by switching strncpy() with strlcpy(), there is
> no performance improvement, probably because the strlen function is
> called in strlpy and the string is traversed each time.

Then there's not much we can do. Security trumps performance. Not to
mention, the garbage in the comm after the '\0' causes the histograms to
produce strange results.

Now for the saved_cmdlines, since it isn't exported directly to user space,
that one may be put back to memcpy().

Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
leaked. It looks like it is printed with the normal seq_printf() in
saved_cmdlines_show().

And it doesn't even look like the saved_cmdlines() is even initialized to
zero, so it itself could leak memory if it was exposed.

-- Steve

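One way to picture Steven's remark about garbage after the '\0' upsetting histograms is the sketch below. It assumes, purely for illustration, that a string key is stored in a fixed-width slot and compared over its full width; the names KEY_LEN, make_key_memcpy, make_key_strncpy, and same_key are hypothetical and do not come from the tracing code.

```c
#include <assert.h>
#include <string.h>

/* Assumption: keys occupy a fixed 16-byte slot and are compared byte for byte. */
#define KEY_LEN 16

/* Build a key by copying the whole slot, stale tail included. */
void make_key_memcpy(char *key, const char *comm)
{
    assert(key != NULL && comm != NULL);
    memcpy(key, comm, KEY_LEN);
}

/* Build a key by copying up to the terminator and zero-filling the rest. */
void make_key_strncpy(char *key, const char *comm)
{
    assert(key != NULL && comm != NULL);
    strncpy(key, comm, KEY_LEN);
}

/* Two keys land in the same bucket only if all KEY_LEN bytes match. */
int same_key(const char *a, const char *b)
{
    return memcmp(a, b, KEY_LEN) == 0;
}
```

Under these assumptions, two tasks that both report the name "sh" can produce different keys when built with memcpy() if one buffer still holds leftovers from a longer, earlier name, so the same name would be counted in two buckets. Building the keys with strncpy() makes them identical.
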
* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Zanussi, Tom @ 2021-10-19 17:30 UTC
To: Steven Rostedt, Yang Jihong; +Cc: linux-kernel

Hi Steve,

On 10/18/2021 9:51 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 10:39:47 +0800
> Yang Jihong <yangjihong1@huawei.com> wrote:
>
>> Hi Steve,
>>
>> On 2021/10/18 21:37, Steven Rostedt wrote:
>>> On Mon, 18 Oct 2021 11:23:14 +0800
>>> Yang Jihong <yangjihong1@huawei.com> wrote:
>>>
>>>> Hi Tom and Steven,
>>>>
>>>> commit 85f726a35e504418 use strncpy instead of memcpy when copying comm,
>>>> on ARM64 machine, this commit causes performance degradation.
>>>>
>>>> I test the number of instructions executed by invoking the
>>>> trace_sched_switch function once on an arm64 machine:
>>>> 1. Use memcpy, the number of instructions executed is 850.
>>>> 2. Use strncpy, the number of instructions executed 1100.
>>>> That is, use strncpy is almost 250 more instructions than memcpy.
>>>>
>>>> Has the impact on performance been considered in this commit? :)
>>>> What is the impact of revert the patch?
>>>>
>>>
>>> It's a security issue. And like everything security, there's always going
>>> to be a performance impact. Look at the performance impact due to spectre
>>> and meltdown!
>>>
>>> That said, although memcpy() may not be used, we don't need strncpy.
>>> strncpy() will pad the rest of the string with nul bytes. But since the
>>> memory the string is being recorded into is already initialized (or can be
>>> if it isn't), we could use the faster strlcpy().
>>>
>>> Have you tried testing it by switching strncpy() with strlcpy()?
>>>
>> I have tried testing it by switching strncpy() with strlcpy(), there is
>> no performance improvement, probably because the strlen function is
>> called in strlpy and the string is traversed each time.
>
> Then there's not much we can do. Security trumps performance. Not to
> mention, the garbage in the comm after the '\0' causes the histograms to
> produce strange results.
>
> Now for the saved_cmdlines, since it isn't exported directly to user space,
> that one may be put back to memcpy().
>
> Tom, was there a reason to change saved_cmdlines(), as I'm not sure that is
> leaked. It looks like it is printed with the normal seq_printf() in
> saved_cmdlines_show().
>

I don't think either of the changes in commit 85f726a35e504418 are
directly related to the original problem [1] and therefore changing them
back to memcpy or whatever shouldn't affect the histograms since that
data is never used in keys.

Commit 85f726a35e504418 was basically a follow-on to commit 9f0bbf3115ca
(tracing: Use strncpy instead of memcpy for string keys in hist triggers)
and was added for completeness after examining other uses of memcpy in
the tracing code (there's even a comment in there from you about possible
performance hits from changing it ;-)

So anyway, as far as the histograms go, I think optimizing the two
changes in 85f726a35e504418 while ignoring trailing garbage can be done
without affecting the histogram correctness.

Tom

[1] https://lore.kernel.org/all/50c35ae1267d64eee975b8125e151e600071d4dc.1549309756.git.tom.zanussi@linux.intel.com/

> And it doesn't even look like the saved_cmdlines() is even initialized to
> zero, so it itself could leak memory if it was exposed.
>
> -- Steve

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Steven Rostedt @ 2021-10-19 18:10 UTC
To: Zanussi, Tom; +Cc: Yang Jihong, linux-kernel

On Tue, 19 Oct 2021 12:30:28 -0500
"Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:

> So anyway, as far as the histograms go, I think optimizing the two
> changes in 85f726a35e504418 while ignoring trailing garbage can be done
> without affecting the histogram correctness.

So, none of that is exported to user space?

-- Steve

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Zanussi, Tom @ 2021-10-19 18:38 UTC
To: Steven Rostedt; +Cc: Yang Jihong, linux-kernel

On 10/19/2021 1:10 PM, Steven Rostedt wrote:
> On Tue, 19 Oct 2021 12:30:28 -0500
> "Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:
>
>> So anyway, as far as the histograms go, I think optimizing the two
>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>> without affecting the histogram correctness.
>
> So, none of that is exported to user space?

I meant just that any optimization of those two things that ignored the
trailing garbage wouldn't affect the histogram keys.

But yeah I think you're correct that ignoring it in the case of
saved_cmdlines wouldn't be a problem either but it would be in the case of
max_buffer since that's exported by the ring buffer.

Tom

>
> -- Steve

* Re: [QUESTION] Performance deterioration caused by commit 85f726a35e504418
From: Yang Jihong @ 2021-10-20 2:00 UTC
To: Zanussi, Tom, Steven Rostedt; +Cc: linux-kernel

Hi, Steve and Tom

On 2021/10/20 2:38, Zanussi, Tom wrote:
> On 10/19/2021 1:10 PM, Steven Rostedt wrote:
>> On Tue, 19 Oct 2021 12:30:28 -0500
>> "Zanussi, Tom" <tom.zanussi@linux.intel.com> wrote:
>>
>>> So anyway, as far as the histograms go, I think optimizing the two
>>> changes in 85f726a35e504418 while ignoring trailing garbage can be done
>>> without affecting the histogram correctness.
>>
>> So, none of that is exported to user space?
>
> I meant just that any optimization of those two things that ignored the
> trailing garbage wouldn't affect the histogram keys.
>
> But yeah I think you're correct that ignoring it in the case of
> saved_cmdlines wouldn't be a problem either but it would be in the case of
> max_buffer since that's exported by the ring buffer.

OK. Thanks very much for your patience. :)

Kind regards,
Jihong

>
> Tom
>
>>
>> -- Steve
