public inbox for stable@vger.kernel.org
From: "Zhang, Rui" <rui.zhang@intel.com>
To: "ricardo.neri-calderon@linux.intel.com"
	<ricardo.neri-calderon@linux.intel.com>,
	"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Cc: "regressions@leemhuis.info" <regressions@leemhuis.info>,
	"Neri, Ricardo" <ricardo.neri@intel.com>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"Gupta, Pawan Kumar" <pawan.kumar.gupta@intel.com>,
	"regressions@lists.linux.dev" <regressions@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Luck, Tony" <tony.luck@intel.com>,
	"thomas.lindroth@gmail.com" <thomas.lindroth@gmail.com>,
	"stable@vger.kernel.org" <stable@vger.kernel.org>
Subject: Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
Date: Wed, 25 Sep 2024 05:20:41 +0000
Message-ID: <c20149f35be104c0aa8e995b0f3c7727e095323a.camel@intel.com>
In-Reply-To: <20240924024551.GA13538@ranerica-svr.sc.intel.com>

On Mon, 2024-09-23 at 19:45 -0700, Ricardo Neri wrote:
> On Thu, Sep 19, 2024 at 01:19:27PM +0200,
> gregkh@linuxfoundation.org wrote:
> > On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> > > On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > > > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis
> > > > wrote:
> > > > > [CCing the x86 folks, Greg, and the regressions list]
> > > > > 
> > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > > 
> > > > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines
> > > > > > > and noticed that the dmesg line "Incomplete global flushes,
> > > > > > > disabling PCID" had disappeared from the log.
> > > > > 
> > > > > > Thomas, thx for the report. FWIW, mainline developers like the
> > > > > > x86 folks or Tony are free to focus on mainline and leave
> > > > > > stable/longterm series to other people -- some nevertheless
> > > > > > help out regularly or occasionally. So with a bit of luck this
> > > > > > mail will make one of them care enough to provide a 6.1 version
> > > > > > of what you afaics called the "existing fix" in mainline
> > > > > > (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU model
> > > > > > defines") [v6.10-rc1]) that seems to be missing in 6.1.y. But
> > > > > > if not I suspect it might be up to you to prepare and submit a
> > > > > > 6.1.y variant of that fix, as you seem to care and are able to
> > > > > > test the patch.
> > > > 
> > > > > Needs to go to 6.6.y first, right?  But even then, it does not
> > > > > apply to 6.1.y cleanly, so someone needs to send a backported
> > > > > (and tested) series to us at stable@vger.kernel.org and we will
> > > > > be glad to queue them up then.
> > > > 
> > > > thanks,
> > > > 
> > > > greg k-h
> > > 
> > > There are three commits involved.
> > > 
> > > commit A:
> > > >    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
> > > >    This commit replaces
> > > >       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
> > > >    with
> > > >       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
> > > >    This is a functional change because the family info is replaced
> > > >    with 0. And this exposes an x86_match_cpu() problem: it breaks
> > > >    when the vendor/family/model/stepping/feature fields are all
> > > >    zeros.
> > > 
> > > commit B:
> > > >    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> > > >    X86_VENDOR_INTEL")
> > > >    It addresses the x86_match_cpu() problem by introducing a valid
> > > >    flag and setting the flag in the Intel CPU model defines.
> > > >    This fixes commit A, but it actually breaks the x86_cpu_id
> > > >    structures that are constructed without using the Intel CPU
> > > >    model defines, like arch/x86/mm/init.c.
> > > 
> > > commit C:
> > > >    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> > > >    arch/x86/mm/init.c: broken by commit B but fixed by using the
> > > >    new Intel CPU model defines
> > > 
> > > In 6.1.99,
> > > commit A is missing
> > > commit B is there
> > > commit C is missing
> > > 
> > > In 6.6.50,
> > > commit A is missing
> > > commit B is there
> > > commit C is missing
> > > 
> > > > Now we can fix the problem in the stable kernels by converting
> > > > arch/x86/mm/init.c to use the CPU model defines (even the
> > > > old-style ones). But before that, I'm wondering if we need to
> > > > backport commit B to the 6.1 and 6.6 stable kernels, because only
> > > > commit A can expose this problem.
> > 
> > > If so, can you submit the needed backports for us to apply?  That's
> > > the easiest way for us to take them, thanks.
> 
> I audited all the uses of x86_match_cpu(match). All callers that
> construct the `match` argument using the family of X86_MATCH_* macros
> from arch/x86/include/asm/cpu_device_id.h function correctly because
> commit B has been backported to v6.1.99 and to v6.6.50 -- 93022482b294
> ("x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL").
> 
> Only those callers that use their own thing to compose the `match`
> argument are buggy:
>     * arch/x86/mm/init.c
>     * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)

Thanks for auditing this. I overlooked the intel_rapl driver case.
> 
> Summarizing, v6.1.99 needs these two commits from mainline
>     * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
>       pl4_supported field")
>     * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> 
> v6.6.50 only needs the second commit.

Well, commit B 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match
just X86_VENDOR_INTEL") has been backported to all stable kernels, and
the two broken cases above are present there too.

So I suppose we need to backport both fixes to the 5.x stable kernels
as well.
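
To make the interaction between the commits concrete, here is a minimal
userspace sketch. It is not the kernel code: struct cpu_id, struct cpu,
the two match_*() helpers, FLAG_ENTRY_VALID and the model number 0x97
are hypothetical stand-ins, and the stepping/feature fields are left
out. It only assumes what is described above, namely that Intel's
vendor ID and the "any" family/model wildcards are all 0, that the old
table walk stops at an all-zero entry, and that commit B switches the
walk to stop at the first entry without the valid flag.

```c
#include <stddef.h>
#include <stdint.h>

#define VENDOR_INTEL     0       /* Intel's vendor ID is 0 */
#define VENDOR_ANY       0xffff
#define FAMILY_ANY       0
#define MODEL_ANY        0
#define FLAG_ENTRY_VALID 0x1     /* hypothetical name for commit B's valid flag */

struct cpu_id { uint16_t vendor, family, model, flags; };
struct cpu    { uint16_t vendor, family, model; };

/* Field-by-field comparison, honoring the wildcards. */
static int entry_matches(const struct cpu_id *m, const struct cpu *c)
{
	if (m->vendor != VENDOR_ANY && m->vendor != c->vendor)
		return 0;
	if (m->family != FAMILY_ANY && m->family != c->family)
		return 0;
	if (m->model != MODEL_ANY && m->model != c->model)
		return 0;
	return 1;
}

/* Old walk: an entry whose match fields are all zero ends the table. */
static const struct cpu_id *match_zero_terminated(const struct cpu_id *tbl,
						  const struct cpu *c)
{
	for (const struct cpu_id *m = tbl; m->vendor | m->family | m->model; m++)
		if (entry_matches(m, c))
			return m;
	return NULL;
}

/* Walk after commit B: the first entry without the valid flag ends the table. */
static const struct cpu_id *match_flag_terminated(const struct cpu_id *tbl,
						  const struct cpu *c)
{
	for (const struct cpu_id *m = tbl; m->flags & FLAG_ENTRY_VALID; m++)
		if (entry_matches(m, c))
			return m;
	return NULL;
}

/* "Any Intel CPU", as produced by the new-style defines (commit A). */
static const struct cpu_id any_intel[] = {
	{ .vendor = VENDOR_INTEL, .family = FAMILY_ANY, .model = MODEL_ANY,
	  .flags = FLAG_ENTRY_VALID },
	{ 0 }
};

/* Hand-rolled entry that never sets the flag, as in arch/x86/mm/init.c. */
static const struct cpu_id hand_rolled[] = {
	{ .vendor = VENDOR_INTEL, .family = 6, .model = 0x97 },
	{ 0 }
};
```

For a CPU described as { VENDOR_INTEL, 6, 0x97 }, the old walk finds
nothing in any_intel (its first entry looks like the terminator) but
does find the hand_rolled entry. The flag-based walk does the opposite:
it matches any_intel and returns NULL for hand_rolled. The second case
is the regression reported here.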

thanks,
rui
> 
> I will submit these backports.
> 
> Thanks and BR,
> Ricardo


Thread overview: 8+ messages
2024-07-30 16:41 [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96 Thomas Lindroth
2024-08-07  8:15 ` Thorsten Leemhuis
2024-08-12 12:11   ` Greg KH
2024-09-18  6:54     ` Zhang, Rui
2024-09-19 11:19       ` gregkh
2024-09-24  2:45         ` Ricardo Neri
2024-09-25  5:20           ` Zhang, Rui [this message]
2024-09-25 19:51             ` Ricardo Neri