linux-perf-users.vger.kernel.org archive mirror
* [PATCH] perf/x86/intel/pt: Fix sampling using single range output
@ 2022-11-12 15:15 Adrian Hunter
  2022-11-14 10:51 ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Hunter @ 2022-11-12 15:15 UTC
  To: Peter Zijlstra
  Cc: Alexander Shishkin, linux-kernel, Ingo Molnar, linux-perf-users

Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
Data When Configured With Single Range Output Larger Than 4KB" by
disabling single range output whenever larger than 4KB.

Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
---
 arch/x86/events/intel/pt.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index 82ef87e9a897..42a55794004a 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
 	if (1 << order != nr_pages)
 		goto out;
 
+	/*
+	 * Some processors cannot always support single range for more than
+	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
+	 * also be affected, so for now rather than trying to keep track of
+	 * which ones, just disable it for all.
+	 */
+	if (nr_pages > 1)
+		goto out;
+
 	buf->single = true;
 	buf->nr_pages = nr_pages;
 	ret = 0;
-- 
2.34.1
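
The effect of the hunk above can be sketched in userspace C. This is only a model under stated assumptions, not the kernel function: `struct pt_buffer_model`, `try_single_model()` and the `alloc_order` parameter are hypothetical names, the order is passed in because the hunk does not show how it is derived, and the model returns -1 where the kernel returns its own error code. Any other checks the real function makes before this point are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two struct pt_buffer fields the hunk sets. */
struct pt_buffer_model {
	bool single;
	int nr_pages;
};

/*
 * Model of the two checks visible in the hunk.
 * alloc_order: order of the contiguous allocation backing the buffer
 *              (taken as an input here, an assumption of this sketch).
 * Returns 0 when single range output would be selected, -1 otherwise.
 */
static int try_single_model(struct pt_buffer_model *buf, int alloc_order,
			    int nr_pages)
{
	assert(alloc_order >= 0 && nr_pages > 0);

	/* Existing check: the whole buffer must be one contiguous allocation. */
	if (1 << alloc_order != nr_pages)
		return -1;

	/* Check added by the patch: refuse anything larger than one 4KB page. */
	if (nr_pages > 1)
		return -1;

	buf->single = true;
	buf->nr_pages = nr_pages;
	return 0;
}
```

In this model a one-page buffer (`alloc_order` 0, `nr_pages` 1) still selects single range output, while an eight-page contiguous buffer that passed the old check is now rejected.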



* Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
  2022-11-12 15:15 [PATCH] perf/x86/intel/pt: Fix sampling using single range output Adrian Hunter
@ 2022-11-14 10:51 ` Peter Zijlstra
  2022-11-14 11:10   ` Adrian Hunter
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2022-11-14 10:51 UTC
  To: Adrian Hunter
  Cc: Alexander Shishkin, linux-kernel, Ingo Molnar, linux-perf-users

On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
> Data When Configured With Single Range Output Larger Than 4KB" by
> disabling single range output whenever larger than 4KB.
> 
> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
> Cc: stable@vger.kernel.org
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> ---
>  arch/x86/events/intel/pt.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> index 82ef87e9a897..42a55794004a 100644
> --- a/arch/x86/events/intel/pt.c
> +++ b/arch/x86/events/intel/pt.c
> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>  	if (1 << order != nr_pages)
>  		goto out;
>  
> +	/*
> +	 * Some processors cannot always support single range for more than
> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
> +	 * also be affected, so for now rather than trying to keep track of
> +	 * which ones, just disable it for all.
> +	 */
> +	if (nr_pages > 1)
> +		goto out;

This effectively declares single-output-mode dead? Because I don't think
anybody uses PT with a single 4K buffer.


* Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
  2022-11-14 10:51 ` Peter Zijlstra
@ 2022-11-14 11:10   ` Adrian Hunter
  2022-11-14 12:59     ` Peter Zijlstra
  0 siblings, 1 reply; 6+ messages in thread
From: Adrian Hunter @ 2022-11-14 11:10 UTC
  To: Peter Zijlstra
  Cc: Alexander Shishkin, linux-kernel, Ingo Molnar, linux-perf-users

On 14/11/22 12:51, Peter Zijlstra wrote:
> On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>> Data When Configured With Single Range Output Larger Than 4KB" by
>> disabling single range output whenever larger than 4KB.
>>
>> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> ---
>>  arch/x86/events/intel/pt.c | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>> index 82ef87e9a897..42a55794004a 100644
>> --- a/arch/x86/events/intel/pt.c
>> +++ b/arch/x86/events/intel/pt.c
>> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>>  	if (1 << order != nr_pages)
>>  		goto out;
>>  
>> +	/*
>> +	 * Some processors cannot always support single range for more than
>> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>> +	 * also be affected, so for now rather than trying to keep track of
>> +	 * which ones, just disable it for all.
>> +	 */
>> +	if (nr_pages > 1)
>> +		goto out;
> 
> This effectively declares single-output-mode dead? Because I don't think
> anybody uses PT with a single 4K buffer.

4K is the default size for "sample mode", i.e. stuffing 4KB of Intel PT trace
data into a PERF_RECORD_SAMPLE record whose sample_type has the
PERF_SAMPLE_AUX bit set.

e.g.

$ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
Linux
$ grep aux_sample_size err.txt
  aux_sample_size                  4096
$ 
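
To illustrate why the default sample size is unaffected by the patch, here is a rough C sketch. It assumes 4KB pages and assumes the AUX buffer for sample mode is sized to the smallest power-of-two number of pages that holds aux_sample_size; both are assumptions of this sketch rather than something stated in this thread, and `MODEL_PAGE_SIZE`, `aux_pages_for_sample()` and `single_range_still_allowed()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed 4KB pages */

/* Smallest power-of-two page count that can hold aux_sample_size bytes. */
static unsigned int aux_pages_for_sample(size_t aux_sample_size)
{
	size_t needed;
	unsigned int pages = 1;

	assert(aux_sample_size > 0);
	/* Round the byte count up to whole pages. */
	needed = (aux_sample_size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
	/* Round the page count up to a power of two. */
	while (pages < needed)
		pages <<= 1;
	return pages;
}

/* With the patch applied, only a one-page buffer may use single range output. */
static bool single_range_still_allowed(size_t aux_sample_size)
{
	return aux_pages_for_sample(aux_sample_size) == 1;
}
```

Under those assumptions the default aux_sample_size of 4096 maps to a single page and stays eligible, whereas a sample size of 8192 needs two pages and falls back to the non-single-range path.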



* Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
  2022-11-14 11:10   ` Adrian Hunter
@ 2022-11-14 12:59     ` Peter Zijlstra
  2022-11-15 19:46       ` Andi Kleen
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Zijlstra @ 2022-11-14 12:59 UTC
  To: Adrian Hunter
  Cc: Alexander Shishkin, linux-kernel, Ingo Molnar, linux-perf-users

On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
> On 14/11/22 12:51, Peter Zijlstra wrote:
> > On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
> >> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
> >> Data When Configured With Single Range Output Larger Than 4KB" by
> >> disabling single range output whenever larger than 4KB.
> >>
> >> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
> >> Cc: stable@vger.kernel.org
> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> >> ---
> >>  arch/x86/events/intel/pt.c | 9 +++++++++
> >>  1 file changed, 9 insertions(+)
> >>
> >> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> >> index 82ef87e9a897..42a55794004a 100644
> >> --- a/arch/x86/events/intel/pt.c
> >> +++ b/arch/x86/events/intel/pt.c
> >> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
> >>  	if (1 << order != nr_pages)
> >>  		goto out;
> >>  
> >> +	/*
> >> +	 * Some processors cannot always support single range for more than
> >> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
> >> +	 * also be affected, so for now rather than trying to keep track of
> >> +	 * which ones, just disable it for all.
> >> +	 */
> >> +	if (nr_pages > 1)
> >> +		goto out;
> > 
> > This effectively declares single-output-mode dead? Because I don't think
> > anybody uses PT with a single 4K buffer.
> 
> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
> 
> e.g.
> 
> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
> Linux
> $ grep aux_sample_size err.txt
>   aux_sample_size                  4096

Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
suppose.


* Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
  2022-11-14 12:59     ` Peter Zijlstra
@ 2022-11-15 19:46       ` Andi Kleen
  2022-11-16  6:26         ` Adrian Hunter
  0 siblings, 1 reply; 6+ messages in thread
From: Andi Kleen @ 2022-11-15 19:46 UTC
  To: Peter Zijlstra
  Cc: Adrian Hunter, Alexander Shishkin, linux-kernel, Ingo Molnar,
	linux-perf-users

Peter Zijlstra <peterz@infradead.org> writes:

> On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
>> On 14/11/22 12:51, Peter Zijlstra wrote:
>> > On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>> >> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>> >> Data When Configured With Single Range Output Larger Than 4KB" by
>> >> disabling single range output whenever larger than 4KB.
>> >>
>> >> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>> >> Cc: stable@vger.kernel.org
>> >> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>> >> ---
>> >>  arch/x86/events/intel/pt.c | 9 +++++++++
>> >>  1 file changed, 9 insertions(+)
>> >>
>> >> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>> >> index 82ef87e9a897..42a55794004a 100644
>> >> --- a/arch/x86/events/intel/pt.c
>> >> +++ b/arch/x86/events/intel/pt.c
>> >> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>> >>  	if (1 << order != nr_pages)
>> >>  		goto out;
>> >>  
>> >> +	/*
>> >> +	 * Some processors cannot always support single range for more than
>> >> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>> >> +	 * also be affected, so for now rather than trying to keep track of
>> >> +	 * which ones, just disable it for all.
>> >> +	 */
>> >> +	if (nr_pages > 1)
>> >> +		goto out;
>> > 
>> > This effectively declares single-output-mode dead? Because I don't think
>> > anybody uses PT with a single 4K buffer.
>> 
>> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
>> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
>> 
>> e.g.
>> 
>> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
>> Linux
>> $ grep aux_sample_size err.txt
>>   aux_sample_size                  4096
>
> Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
> suppose.

It would be better to apply the limit only on the CPUs with the bug, because
switching buffers causes some extra latency. So this patch may regress
PT overhead or tail latencies.

-Andi


* Re: [PATCH] perf/x86/intel/pt: Fix sampling using single range output
  2022-11-15 19:46       ` Andi Kleen
@ 2022-11-16  6:26         ` Adrian Hunter
  0 siblings, 0 replies; 6+ messages in thread
From: Adrian Hunter @ 2022-11-16  6:26 UTC
  To: Andi Kleen, Peter Zijlstra
  Cc: Alexander Shishkin, linux-kernel, Ingo Molnar, linux-perf-users

On 15/11/22 21:46, Andi Kleen wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
> 
>> On Mon, Nov 14, 2022 at 01:10:38PM +0200, Adrian Hunter wrote:
>>> On 14/11/22 12:51, Peter Zijlstra wrote:
>>>> On Sat, Nov 12, 2022 at 05:15:08PM +0200, Adrian Hunter wrote:
>>>>> Deal with errata TGL052, ADL037 and RPL017 "Trace May Contain Incorrect
>>>>> Data When Configured With Single Range Output Larger Than 4KB" by
>>>>> disabling single range output whenever larger than 4KB.
>>>>>
>>>>> Fixes: 670638477aed ("perf/x86/intel/pt: Opportunistically use single range output mode")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
>>>>> ---
>>>>>  arch/x86/events/intel/pt.c | 9 +++++++++
>>>>>  1 file changed, 9 insertions(+)
>>>>>
>>>>> diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
>>>>> index 82ef87e9a897..42a55794004a 100644
>>>>> --- a/arch/x86/events/intel/pt.c
>>>>> +++ b/arch/x86/events/intel/pt.c
>>>>> @@ -1263,6 +1263,15 @@ static int pt_buffer_try_single(struct pt_buffer *buf, int nr_pages)
>>>>>  	if (1 << order != nr_pages)
>>>>>  		goto out;
>>>>>  
>>>>> +	/*
>>>>> +	 * Some processors cannot always support single range for more than
>>>>> +	 * 4KB - refer errata TGL052, ADL037 and RPL017. Future processors might
>>>>> +	 * also be affected, so for now rather than trying to keep track of
>>>>> +	 * which ones, just disable it for all.
>>>>> +	 */
>>>>> +	if (nr_pages > 1)
>>>>> +		goto out;
>>>>
>>>> This effectively declares single-output-mode dead? Because I don't think
>>>> anybody uses PT with a single 4K buffer.
>>>
>>> 4K is the default size for "sample mode" i.e. stuffing 4KB of Intel PT trace
>>> data into a PERF_RECORD_SAMPLE record that has sample_type bit PERF_SAMPLE_AUX
>>>
>>> e.g.
>>>
>>> $ perf record -vv --aux-sample -e '{intel_pt//u,cycles:u}' uname 2>err.txt
>>> Linux
>>> $ grep aux_sample_size err.txt
>>>   aux_sample_size                  4096
>>
>> Ah, ok. Not as bad then. Anyway, I'll go queue it for perf/urgent I
>> suppose.
> 
> It would be better to only limit on the CPUs with the bug because
> switching buffers causes some extra latencies. So this patch may regress
> PT overhead or tail latencies.

I could whitelist CPUs that do not have the issue, because a blacklist
would keep expanding, which would be a bit of a pain to maintain.
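
A minimal sketch of what such a whitelist could look like, written as a userspace model rather than kernel code. Everything here is hypothetical: `struct cpu_id_model`, the table `single_range_ok[]` and both functions are illustrative names, and the family/model values are placeholders, not a list of CPUs known to be free of the erratum.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CPU identifier used only by this sketch. */
struct cpu_id_model {
	unsigned int family;
	unsigned int model;
};

/* Placeholder entries only: these are NOT real unaffected CPUs. */
static const struct cpu_id_model single_range_ok[] = {
	{ 6, 0x01 },
	{ 6, 0x02 },
};

/* True if the given CPU appears on the (placeholder) whitelist. */
static bool cpu_on_allowlist(unsigned int family, unsigned int model)
{
	size_t i;

	for (i = 0; i < sizeof(single_range_ok) / sizeof(single_range_ok[0]); i++) {
		if (single_range_ok[i].family == family &&
		    single_range_ok[i].model == model)
			return true;
	}
	return false;
}

/*
 * One page is always allowed, as in the patch; larger buffers are allowed
 * only on whitelisted CPUs. Unknown and future CPUs default to the safe path.
 */
static bool single_range_permitted(unsigned int family, unsigned int model,
				   int nr_pages)
{
	assert(nr_pages > 0);
	if (nr_pages == 1)
		return true;
	return cpu_on_allowlist(family, model);
}
```

The design point is the default: a CPU missing from the table is treated as affected, so the list only needs updating to re-enable the optimisation, never to fix incorrect trace data.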


