linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Li Huafei <lihuafei1@huawei.com>, peterz@infradead.org, mingo@redhat.com
Cc: acme@kernel.org, namhyung@kernel.org, mark.rutland@arm.com,
	alexander.shishkin@linux.intel.com, jolsa@kernel.org,
	irogers@google.com, adrian.hunter@intel.com,
	kan.liang@linux.intel.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	linux-perf-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] perf/x86/intel: Restrict period on Haswell
Date: Wed, 14 Aug 2024 16:52:19 +0200	[thread overview]
Message-ID: <87sev7nom4.ffs@tglx> (raw)
In-Reply-To: <fb87cc82-94b7-31aa-0374-a1d7fa49470e@huawei.com>

Li!

On Tue, Aug 13 2024 at 21:13, Li Huafei wrote:
> On 2024/8/1 3:20, Thomas Gleixner wrote:
>>> My machine has 32 physical cores, each with two logical cores. During
>>> testing, it executes the CVE-2015-3290 test case 100 times concurrently.
>>>
>>> This warning was already present in [1] and a patch was given there to
>>> limit period to 128 on Haswell, but that patch was not merged into the
>>> mainline.  In [2] the period on Nehalem was limited to 32. I tested 16
>>> and 32 period on my machine and found that the problem could be
>>> reproduced with a limit of 16, but the problem did not reproduce when
>>> set to 32. It looks like we can limit the cycles to 32 on Haswell as
>>> well.
>> 
>> It looks like? Either it works or not.
>
> It worked for my test scenario. I say "looks like" because I'm not sure
> how it circumvents the problem, and if the limit of 32 no longer works
> if I increase the number of test cases executed in parallel. Any
> suggestions?

If you read back through the email history of these limits, then you can
see that too short periods cause that problem on Broadwell due to a
erratum, which is explained on top of the BDW limit.

Now looking at the HSW specification update specifically erratum HSW11:

  Performance Monitor Precise Instruction Retired Event May Present
  Wrong Indications

  Problem:
         When the Precise Distribution for Instructions Retired (PDIR)
         mechanism is activated (INST_RETIRED.ALL (event C0H, umask
         value 00H) on Counter 1 programmed in PEBS mode), the processor
         may return wrong PEBS or Performance Monitoring Interrupt (PMI)
         interrupts and/or incorrect counter values if the counter is
         reset with a Sample- After-Value (SAV) below 100 (the SAV is
         the counter reset value software programs in the MSR
         IA32_PMC1[47:0] in order to control interrupt frequency).

  Implication:
         Due to this erratum, when using low SAV values, the program may
         get incorrect PEBS or PMI interrupts and/or an invalid counter
         state.

  Workaround:
         The sampling driver should avoid using SAV<100.

IOW, that's exactly the same issue as the BDM11 erratum.

Kan: Can you please go through the various specification updates and
identify which generations are affected by this and fix it once and
forever in a sane way instead of relying on 'tried until it works by
some definition of works' hacks. These errata are there for a reason.


But that does not explain the fallout with that cve test because that
does not use PEBS. It's using fixed counter 0.

Li, you added that huge useless backtrace but cut off the output of
perf_event_print_debug() after it. Can you please provide that
information so we can see what the counter states are?

>>> +static void hsw_limit_period(struct perf_event *event, s64 *left)
>>> +{
>>> +	*left = max(*left, 32LL);
>>> +}
>> 
>> And why do we need a copy of nhm_limit_period() ?
>> 
>
> Do you mean why the period is limited to 32 like nhm_limit_period()?

No. If 32 is the correct limit, then we don't need another function
which does exactly the same. So you can assign exactly that function for
HSW, no?

Thanks,

        tglx

  parent reply	other threads:[~2024-08-14 14:52 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-29 22:33 [PATCH] perf/x86/intel: Restrict period on Haswell Li Huafei
2024-07-31 19:20 ` Thomas Gleixner
2024-08-13 13:13   ` Li Huafei
2024-08-14 14:43     ` Thomas Gleixner
2024-08-14 14:52     ` Thomas Gleixner [this message]
2024-08-14 18:15       ` Liang, Kan
2024-08-14 19:01         ` Thomas Gleixner
2024-08-14 19:37           ` Liang, Kan
2024-08-14 22:47             ` Thomas Gleixner
2024-08-15 15:39               ` Liang, Kan
2024-08-15 18:26                 ` Thomas Gleixner
2024-08-15 20:15                   ` Liang, Kan
2024-08-15 23:43                     ` Thomas Gleixner
2024-08-16 19:27                       ` Liang, Kan
2024-08-17 12:22                         ` Liang, Kan
2024-08-17 12:23                         ` Thomas Gleixner
2024-08-15 19:01                 ` Vince Weaver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87sev7nom4.ffs@tglx \
    --to=tglx@linutronix.de \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=lihuafei1@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mingo@redhat.com \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).