* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
       [not found] <eb709d67-2a8d-412f-905d-f3777d897bfa@gmail.com>
From: Thorsten Leemhuis @ 2024-08-07  8:15 UTC
  To: Thomas Lindroth
  Cc: stable, tony.luck, Greg KH, Dave Hansen, Borislav Petkov (AMD),
      LKML, Linux kernel regressions list

[CCing the x86 folks, Greg, and the regressions list]

Hi, Thorsten here, the Linux kernel's regression tracker.

On 30.07.24 18:41, Thomas Lindroth wrote:
> I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and
> noticed that the dmesg line "Incomplete global flushes, disabling PCID"
> had disappeared from the log.

Thomas, thx for the report. FWIW, mainline developers like the x86 folks
or Tony are free to focus on mainline and leave stable/longterm series
to other people -- some nevertheless help out regularly or occasionally.
So with a bit of luck this mail will make one of them care enough to
provide a 6.1 version of what you afaics called the "existing fix" in
mainline (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU model
defines") [v6.10-rc1]) that seems to be missing in 6.1.y. But if not I
suspect it might be up to you to prepare and submit a 6.1.y variant of
that fix, as you seem to care and are able to test the patch.

Ciao, Thorsten

> That message comes from commit c26b9e193172f48cd0ccc64285337106fb8aa804,
> which disables PCID support on some broken hardware in
> arch/x86/mm/init.c:
>
> #define INTEL_MATCH(_model) { .vendor  = X86_VENDOR_INTEL,    \
>                               .family  = 6,                   \
>                               .model = _model,                \
>                             }
> /*
>  * INVLPG may not properly flush Global entries
>  * on these CPUs when PCIDs are enabled.
>  */
> static const struct x86_cpu_id invlpg_miss_ids[] = {
>        INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
>        INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
>        INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
>        INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
>        INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
>        INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
>        {}
>
> ...
>
>        if (x86_match_cpu(invlpg_miss_ids)) {
>                pr_info("Incomplete global flushes, disabling PCID");
>                setup_clear_cpu_cap(X86_FEATURE_PCID);
>                return;
>        }
>
> arch/x86/mm/init.c, which has that code, hasn't changed in 6.1.94 ->
> 6.1.99. However I found a commit changing how x86_match_cpu() behaves
> in 6.1.96:
>
> commit 8ab1361b2eae44077fef4adea16228d44ffb860c
> Author: Tony Luck <tony.luck@intel.com>
> Date:   Mon May 20 15:45:33 2024 -0700
>
>     x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL
>
> I suspect this broke the PCID disabling code in arch/x86/mm/init.c.
> The commit message says:
>
> "Add a new flags field to struct x86_cpu_id that has a bit set to
> indicate that this entry in the array is valid. Update X86_MATCH*()
> macros to set that bit. Change the end-marker check in x86_match_cpu()
> to just check the flags field for this bit."
>
> But the PCID disabling code in 6.1.99 does not make use of the
> X86_MATCH*() macros; instead, it defines a new INTEL_MATCH() macro
> without the X86_CPU_ID_FLAG_ENTRY_VALID flag.
>
> I looked in upstream git and found an existing fix:
> commit 2eda374e883ad297bd9fe575a16c1dc850346075
> Author: Tony Luck <tony.luck@intel.com>
> Date:   Wed Apr 24 11:15:18 2024 -0700
>
>     x86/mm: Switch to new Intel CPU model defines
>
>     New CPU #defines encode vendor and family as well as model.
>
>     [ dhansen: vertically align 0's in invlpg_miss_ids[] ]
>
>     Signed-off-by: Tony Luck <tony.luck@intel.com>
>     Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
>     Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Link: https://lore.kernel.org/all/20240424181518.41946-1-tony.luck%40intel.com
>
> diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
> index 679893ea5e68..6b43b6480354 100644
> --- a/arch/x86/mm/init.c
> +++ b/arch/x86/mm/init.c
> @@ -261,21 +261,17 @@ static void __init probe_page_size_mask(void)
>         }
>  }
>
> -#define INTEL_MATCH(_model) { .vendor  = X86_VENDOR_INTEL,    \
> -                             .family  = 6,                    \
> -                             .model = _model,                 \
> -                           }
>  /*
>   * INVLPG may not properly flush Global entries
>   * on these CPUs when PCIDs are enabled.
>   */
>  static const struct x86_cpu_id invlpg_miss_ids[] = {
> -       INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
> -       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
> -       INTEL_MATCH(INTEL_FAM6_ATOM_GRACEMONT ),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
> +       X86_MATCH_VFM(INTEL_ALDERLAKE,      0),
> +       X86_MATCH_VFM(INTEL_ALDERLAKE_L,    0),
> +       X86_MATCH_VFM(INTEL_ATOM_GRACEMONT, 0),
> +       X86_MATCH_VFM(INTEL_RAPTORLAKE,     0),
> +       X86_MATCH_VFM(INTEL_RAPTORLAKE_P,   0),
> +       X86_MATCH_VFM(INTEL_RAPTORLAKE_S,   0),
>         {}
>  };
>
> The fix removed the custom INTEL_MATCH macro and uses the X86_MATCH*()
> macros with X86_CPU_ID_FLAG_ENTRY_VALID. This fixed commit was never
> backported to 6.1, so it looks like a stable series regression due to
> a missing backport.
>
> If I apply the fix patch on 6.1.99, the PCID disabling code activates
> again. I had to change all the INTEL_* definitions to the old
> definitions to make it build:
>
>  static const struct x86_cpu_id invlpg_miss_ids[] = {
> -       INTEL_MATCH(INTEL_FAM6_ALDERLAKE   ),
> -       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_L ),
> -       INTEL_MATCH(INTEL_FAM6_ALDERLAKE_N ),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE  ),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_P),
> -       INTEL_MATCH(INTEL_FAM6_RAPTORLAKE_S),
> +       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE,    0),
> +       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE_L,  0),
> +       X86_MATCH_VFM(INTEL_FAM6_ALDERLAKE_N,  0),
> +       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE,   0),
> +       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE_P, 0),
> +       X86_MATCH_VFM(INTEL_FAM6_RAPTORLAKE_S, 0),
>         {}
>  };
>
> I only looked at the code in arch/x86/mm/init.c, so there may be other
> uses of x86_match_cpu() in the kernel that are also broken in 6.1.99.
> This email is meant as a bug report, not a pull request. Someone else
> should confirm the problem and submit the appropriate fix.

P.S.:

#regzbot ^introduced 8ab1361b2eae44
#regzbot title x86: Possible missing backport of x86_match_cpu() change
#regzbot ignore-activity
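To make the reported mechanism concrete, here is a minimal userspace
sketch of the two end-marker rules described in the quoted commit
message. It is an illustration under stated assumptions, not kernel
code: struct cpu_id, FLAG_ENTRY_VALID, MODEL_EXAMPLE, HAND_ROLLED(),
FLAGGED(), match_old() and match_new() are hypothetical stand-ins, and
the real struct x86_cpu_id has more fields (steppings, feature,
driver_data) than are modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct x86_cpu_id; not the kernel definition. */
struct cpu_id {
    uint16_t vendor;
    uint16_t family;
    uint16_t model;
    uint16_t flags;
};

#define VENDOR_INTEL     0    /* X86_VENDOR_INTEL is 0 in the kernel */
#define FLAG_ENTRY_VALID 1    /* stand-in for X86_CPU_ID_FLAG_ENTRY_VALID */
#define MODEL_EXAMPLE    0x97 /* illustrative model number only */

/* Hand-rolled entry in the style of INTEL_MATCH(): flags stays 0. */
#define HAND_ROLLED(m) { .vendor = VENDOR_INTEL, .family = 6, .model = (m) }

/* Entry in the style of the X86_MATCH*() macros: the valid bit is set. */
#define FLAGGED(m) { .vendor = VENDOR_INTEL, .family = 6, .model = (m), \
                     .flags = FLAG_ENTRY_VALID }

static const struct cpu_id hand_rolled_ids[] = { HAND_ROLLED(MODEL_EXAMPLE), {0} };
static const struct cpu_id flagged_ids[]     = { FLAGGED(MODEL_EXAMPLE), {0} };

static int entry_matches(const struct cpu_id *e, uint16_t family, uint16_t model)
{
    return e->vendor == VENDOR_INTEL && e->family == family && e->model == model;
}

/* Old rule: the table ends at the first entry whose match fields are all 0. */
const struct cpu_id *match_old(const struct cpu_id *e, uint16_t family, uint16_t model)
{
    for (; e->vendor | e->family | e->model; e++)
        if (entry_matches(e, family, model))
            return e;
    return NULL;
}

/* New rule: the table ends at the first entry without the valid bit. */
const struct cpu_id *match_new(const struct cpu_id *e, uint16_t family, uint16_t model)
{
    for (; e->flags & FLAG_ENTRY_VALID; e++)
        if (entry_matches(e, family, model))
            return e;
    return NULL;
}

/* Usage: the hand-rolled table matches under the old rule only. */
void demo_end_marker(void)
{
    assert(match_old(hand_rolled_ids, 6, MODEL_EXAMPLE) != NULL);
    assert(match_new(hand_rolled_ids, 6, MODEL_EXAMPLE) == NULL);
    assert(match_new(flagged_ids, 6, MODEL_EXAMPLE) != NULL);
}
```

Under the new rule the first hand-rolled entry already looks like the
end of the table, so the loop body never runs. That is consistent with
the "Incomplete global flushes, disabling PCID" line disappearing from
dmesg.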
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: Greg KH @ 2024-08-12 12:11 UTC
  To: Thorsten Leemhuis
  Cc: Thomas Lindroth, stable, tony.luck, Dave Hansen,
      Borislav Petkov (AMD), LKML, Linux kernel regressions list

On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis wrote:
> [CCing the x86 folks, Greg, and the regressions list]
>
> Hi, Thorsten here, the Linux kernel's regression tracker.
>
> On 30.07.24 18:41, Thomas Lindroth wrote:
> > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and
> > noticed that the dmesg line "Incomplete global flushes, disabling
> > PCID" had disappeared from the log.
>
> Thomas, thx for the report. FWIW, mainline developers like the x86 folks
> or Tony are free to focus on mainline and leave stable/longterm series
> to other people -- some nevertheless help out regularly or occasionally.
> So with a bit of luck this mail will make one of them care enough to
> provide a 6.1 version of what you afaics called the "existing fix" in
> mainline (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU model
> defines") [v6.10-rc1]) that seems to be missing in 6.1.y. But if not I
> suspect it might be up to you to prepare and submit a 6.1.y variant of
> that fix, as you seem to care and are able to test the patch.

Needs to go to 6.6.y first, right?  But even then, it does not apply to
6.1.y cleanly, so someone needs to send a backported (and tested) series
to us at stable@vger.kernel.org and we will be glad to queue them up
then.

thanks,

greg k-h
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: Zhang, Rui @ 2024-09-18  6:54 UTC
  To: regressions@leemhuis.info, gregkh@linuxfoundation.org
  Cc: Neri, Ricardo, dave.hansen@linux.intel.com, bp@alien8.de,
      Gupta, Pawan Kumar, regressions@lists.linux.dev,
      linux-kernel@vger.kernel.org, Luck, Tony,
      thomas.lindroth@gmail.com, stable@vger.kernel.org

On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis wrote:
> > [CCing the x86 folks, Greg, and the regressions list]
> >
> > Hi, Thorsten here, the Linux kernel's regression tracker.
> >
> > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines and
> > > noticed that the dmesg line "Incomplete global flushes, disabling
> > > PCID" had disappeared from the log.
> >
> > Thomas, thx for the report. FWIW, mainline developers like the x86
> > folks or Tony are free to focus on mainline and leave stable/longterm
> > series to other people -- some nevertheless help out regularly or
> > occasionally. So with a bit of luck this mail will make one of them
> > care enough to provide a 6.1 version of what you afaics called the
> > "existing fix" in mainline (2eda374e883ad2 ("x86/mm: Switch to new
> > Intel CPU model defines") [v6.10-rc1]) that seems to be missing in
> > 6.1.y. But if not I suspect it might be up to you to prepare and
> > submit a 6.1.y variant of that fix, as you seem to care and are able
> > to test the patch.
>
> Needs to go to 6.6.y first, right?  But even then, it does not apply
> to 6.1.y cleanly, so someone needs to send a backported (and tested)
> series to us at stable@vger.kernel.org and we will be glad to queue
> them up then.
>
> thanks,
>
> greg k-h

There are three commits involved.

commit A:
   4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
   This commit replaces
      X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
   with
      X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
   This is a functional change because the family info is replaced with
   0. And this exposes an x86_match_cpu() problem: it breaks when the
   vendor/family/model/stepping/feature fields are all zeros.

commit B:
   93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
   X86_VENDOR_INTEL")
   It addresses the x86_match_cpu() problem by introducing a valid flag
   and setting the flag in the Intel CPU model defines.
   This fixes commit A, but it actually breaks the x86_cpu_id
   structures that are constructed without using the Intel CPU model
   defines, like arch/x86/mm/init.c.

commit C:
   2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
   arch/x86/mm/init.c: broken by commit B but fixed by using the new
   Intel CPU model defines

In 6.1.99,
   commit A is missing
   commit B is there
   commit C is missing

In 6.6.50,
   commit A is missing
   commit B is there
   commit C is missing

Now we can fix the problem in stable kernel, by converting
arch/x86/mm/init.c to use the CPU model defines (even the old style
ones). But before that, I'm wondering if we need to backport commit B
in 6.1 and 6.6 stable kernel because only commit A can expose this
problem.

thanks,
rui
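The commit A problem described above can be sketched in the same
simplified userspace form. This is an illustration under stated
assumptions, not kernel code: struct cpu_id, FLAG_ENTRY_VALID,
visible_old() and visible_new() are hypothetical, and the steppings and
feature fields that the real check also looks at are left out. The
helpers count how many entries each end-marker rule sees before it
decides the table has ended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct x86_cpu_id; not the kernel definition. */
struct cpu_id {
    uint16_t vendor;
    uint16_t family;
    uint16_t model;
    uint16_t flags;
    unsigned long driver_data;
};

#define VENDOR_INTEL     0 /* X86_VENDOR_INTEL is 0 in the kernel */
#define FAMILY_ANY       0 /* X86_FAMILY_ANY is 0 in the kernel */
#define MODEL_ANY        0 /* X86_MODEL_ANY is 0 in the kernel */
#define FLAG_ENTRY_VALID 1 /* stand-in for X86_CPU_ID_FLAG_ENTRY_VALID */

/* Before commit A: family 6 keeps the "any model" entry non-zero. */
static const struct cpu_id fam6_any[] = {
    { .vendor = VENDOR_INTEL, .family = 6, .model = MODEL_ANY,
      .driver_data = 1 },
    {0}
};

/* After commits A and B: all match fields are 0, only the flag is set. */
static const struct cpu_id intel_any[] = {
    { .vendor = VENDOR_INTEL, .family = FAMILY_ANY, .model = MODEL_ANY,
      .flags = FLAG_ENTRY_VALID, .driver_data = 1 },
    {0}
};

/* Old rule: stop at the first entry whose match fields are all 0. */
size_t visible_old(const struct cpu_id *e)
{
    size_t n = 0;

    for (; e->vendor | e->family | e->model; e++)
        n++;
    return n;
}

/* New rule: stop at the first entry without the valid bit. */
size_t visible_new(const struct cpu_id *e)
{
    size_t n = 0;

    for (; e->flags & FLAG_ENTRY_VALID; e++)
        n++;
    return n;
}

/* Usage: the all-zero "any Intel" entry is only visible to the new rule. */
void demo_any_entry(void)
{
    assert(visible_old(fam6_any) == 1);
    assert(visible_old(intel_any) == 0);
    assert(visible_new(intel_any) == 1);
}
```

Note that driver_data takes no part in either check, so a non-zero
driver_data does not save the all-zero entry under the old rule.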
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: gregkh @ 2024-09-19 11:19 UTC
  To: Zhang, Rui
  Cc: regressions@leemhuis.info, Neri, Ricardo,
      dave.hansen@linux.intel.com, bp@alien8.de, Gupta, Pawan Kumar,
      regressions@lists.linux.dev, linux-kernel@vger.kernel.org,
      Luck, Tony, thomas.lindroth@gmail.com, stable@vger.kernel.org

On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis wrote:
> > > [CCing the x86 folks, Greg, and the regressions list]
> > >
> > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > >
> > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines
> > > > and noticed that the dmesg line "Incomplete global flushes,
> > > > disabling PCID" had disappeared from the log.
> > >
> > > Thomas, thx for the report. FWIW, mainline developers like the x86
> > > folks or Tony are free to focus on mainline and leave
> > > stable/longterm series to other people -- some nevertheless help
> > > out regularly or occasionally. So with a bit of luck this mail
> > > will make one of them care enough to provide a 6.1 version of what
> > > you afaics called the "existing fix" in mainline (2eda374e883ad2
> > > ("x86/mm: Switch to new Intel CPU model defines") [v6.10-rc1])
> > > that seems to be missing in 6.1.y. But if not I suspect it might
> > > be up to you to prepare and submit a 6.1.y variant of that fix, as
> > > you seem to care and are able to test the patch.
> >
> > Needs to go to 6.6.y first, right?  But even then, it does not apply
> > to 6.1.y cleanly, so someone needs to send a backported (and tested)
> > series to us at stable@vger.kernel.org and we will be glad to queue
> > them up then.
> >
> > thanks,
> >
> > greg k-h
>
> There are three commits involved.
>
> commit A:
>    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
>    This commit replaces
>       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
>    with
>       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
>    This is a functional change because the family info is replaced
>    with 0. And this exposes an x86_match_cpu() problem: it breaks when
>    the vendor/family/model/stepping/feature fields are all zeros.
>
> commit B:
>    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
>    X86_VENDOR_INTEL")
>    It addresses the x86_match_cpu() problem by introducing a valid
>    flag and setting the flag in the Intel CPU model defines.
>    This fixes commit A, but it actually breaks the x86_cpu_id
>    structures that are constructed without using the Intel CPU model
>    defines, like arch/x86/mm/init.c.
>
> commit C:
>    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
>    arch/x86/mm/init.c: broken by commit B but fixed by using the new
>    Intel CPU model defines
>
> In 6.1.99,
>    commit A is missing
>    commit B is there
>    commit C is missing
>
> In 6.6.50,
>    commit A is missing
>    commit B is there
>    commit C is missing
>
> Now we can fix the problem in stable kernel, by converting
> arch/x86/mm/init.c to use the CPU model defines (even the old style
> ones). But before that, I'm wondering if we need to backport commit B
> in 6.1 and 6.6 stable kernel because only commit A can expose this
> problem.

If so, can you submit the needed backports for us to apply?  That's the
easiest way for us to take them, thanks.

greg k-h
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: Ricardo Neri @ 2024-09-24  2:45 UTC
  To: gregkh@linuxfoundation.org
  Cc: Zhang, Rui, regressions@leemhuis.info, Neri, Ricardo,
      dave.hansen@linux.intel.com, bp@alien8.de, Gupta, Pawan Kumar,
      regressions@lists.linux.dev, linux-kernel@vger.kernel.org,
      Luck, Tony, thomas.lindroth@gmail.com, stable@vger.kernel.org

On Thu, Sep 19, 2024 at 01:19:27PM +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> > On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis wrote:
> > > > [CCing the x86 folks, Greg, and the regressions list]
> > > >
> > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > >
> > > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my machines
> > > > > and noticed that the dmesg line "Incomplete global flushes,
> > > > > disabling PCID" had disappeared from the log.
> > > >
> > > > Thomas, thx for the report. FWIW, mainline developers like the
> > > > x86 folks or Tony are free to focus on mainline and leave
> > > > stable/longterm series to other people -- some nevertheless help
> > > > out regularly or occasionally. So with a bit of luck this mail
> > > > will make one of them care enough to provide a 6.1 version of
> > > > what you afaics called the "existing fix" in mainline
> > > > (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU model
> > > > defines") [v6.10-rc1]) that seems to be missing in 6.1.y. But if
> > > > not I suspect it might be up to you to prepare and submit a
> > > > 6.1.y variant of that fix, as you seem to care and are able to
> > > > test the patch.
> > >
> > > Needs to go to 6.6.y first, right?  But even then, it does not
> > > apply to 6.1.y cleanly, so someone needs to send a backported (and
> > > tested) series to us at stable@vger.kernel.org and we will be glad
> > > to queue them up then.
> > >
> > > thanks,
> > >
> > > greg k-h
> >
> > There are three commits involved.
> >
> > commit A:
> >    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
> >    This commit replaces
> >       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
> >    with
> >       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
> >    This is a functional change because the family info is replaced
> >    with 0. And this exposes an x86_match_cpu() problem: it breaks
> >    when the vendor/family/model/stepping/feature fields are all
> >    zeros.
> >
> > commit B:
> >    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> >    X86_VENDOR_INTEL")
> >    It addresses the x86_match_cpu() problem by introducing a valid
> >    flag and setting the flag in the Intel CPU model defines.
> >    This fixes commit A, but it actually breaks the x86_cpu_id
> >    structures that are constructed without using the Intel CPU model
> >    defines, like arch/x86/mm/init.c.
> >
> > commit C:
> >    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> >    arch/x86/mm/init.c: broken by commit B but fixed by using the new
> >    Intel CPU model defines
> >
> > In 6.1.99,
> >    commit A is missing
> >    commit B is there
> >    commit C is missing
> >
> > In 6.6.50,
> >    commit A is missing
> >    commit B is there
> >    commit C is missing
> >
> > Now we can fix the problem in stable kernel, by converting
> > arch/x86/mm/init.c to use the CPU model defines (even the old style
> > ones). But before that, I'm wondering if we need to backport commit B
> > in 6.1 and 6.6 stable kernel because only commit A can expose this
> > problem.
>
> If so, can you submit the needed backports for us to apply?  That's the
> easiest way for us to take them, thanks.

I audited all the uses of x86_match_cpu(match). All callers that
construct the `match` argument using the family of X86_MATCH_* macros
from arch/x86/include/asm/cpu_device_id.h function correctly because
commit B has been backported to v6.1.99 and to v6.6.50 -- 93022482b294
("x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL").

Only those callers that use their own thing to compose the `match`
argument are buggy:
    * arch/x86/mm/init.c
    * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)

Summarizing, v6.1.99 needs these two commits from mainline
    * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
      pl4_supported field")
    * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")

v6.6.50 only needs the second commit.

I will submit these backports.

Thanks and BR,
Ricardo
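The audit criterion above (tables composed by hand instead of through
the X86_MATCH_* macros) can be expressed as a small check. This is only
a userspace sketch under stated assumptions: struct cpu_id,
FLAG_ENTRY_VALID, ARRAY_LEN() and count_unflagged() are hypothetical
names, not kernel APIs, and the example tables are made up. The helper
needs the array length because, once the valid flag defines the end
marker, a hand-rolled table has no detectable terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct x86_cpu_id; not the kernel definition. */
struct cpu_id {
    uint16_t vendor;
    uint16_t family;
    uint16_t model;
    uint16_t flags;
};

#define FLAG_ENTRY_VALID 1 /* stand-in for X86_CPU_ID_FLAG_ENTRY_VALID */
#define ARRAY_LEN(a)     (sizeof(a) / sizeof((a)[0]))

/* Hand-rolled table: match data present, valid bit never set. */
static const struct cpu_id hand_rolled_ids[] = {
    { .vendor = 0, .family = 6, .model = 0x97 },
    { .vendor = 0, .family = 6, .model = 0x9a },
    {0}
};

/* Table built the way the X86_MATCH*() macros would build it. */
static const struct cpu_id flagged_ids[] = {
    { .vendor = 0, .family = 6, .model = 0x97, .flags = FLAG_ENTRY_VALID },
    { .vendor = 0, .family = 6, .model = 0x9a, .flags = FLAG_ENTRY_VALID },
    {0}
};

/*
 * Count entries that carry match data but lack the valid bit. Such
 * entries would be treated as the end of the table by a flag-based
 * end-marker check. An unflagged entry whose match fields are all 0
 * cannot be told apart from a real terminator and is not counted.
 */
size_t count_unflagged(const struct cpu_id *t, size_t n)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        if ((t[i].vendor | t[i].family | t[i].model) &&
            !(t[i].flags & FLAG_ENTRY_VALID))
            bad++;
    return bad;
}

/* Usage: only the hand-rolled table is reported. */
void demo_audit(void)
{
    assert(count_unflagged(hand_rolled_ids, ARRAY_LEN(hand_rolled_ids)) == 2);
    assert(count_unflagged(flagged_ids, ARRAY_LEN(flagged_ids)) == 0);
}
```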
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: Zhang, Rui @ 2024-09-25  5:20 UTC
  To: ricardo.neri-calderon@linux.intel.com, gregkh@linuxfoundation.org
  Cc: regressions@leemhuis.info, Neri, Ricardo,
      dave.hansen@linux.intel.com, bp@alien8.de, Gupta, Pawan Kumar,
      regressions@lists.linux.dev, linux-kernel@vger.kernel.org,
      Luck, Tony, thomas.lindroth@gmail.com, stable@vger.kernel.org

On Mon, 2024-09-23 at 19:45 -0700, Ricardo Neri wrote:
> On Thu, Sep 19, 2024 at 01:19:27PM +0200, gregkh@linuxfoundation.org
> wrote:
> > On Wed, Sep 18, 2024 at 06:54:33AM +0000, Zhang, Rui wrote:
> > > On Mon, 2024-08-12 at 14:11 +0200, Greg KH wrote:
> > > > On Wed, Aug 07, 2024 at 10:15:23AM +0200, Thorsten Leemhuis
> > > > wrote:
> > > > > [CCing the x86 folks, Greg, and the regressions list]
> > > > >
> > > > > Hi, Thorsten here, the Linux kernel's regression tracker.
> > > > >
> > > > > On 30.07.24 18:41, Thomas Lindroth wrote:
> > > > > > I upgraded from kernel 6.1.94 to 6.1.99 on one of my
> > > > > > machines and noticed that the dmesg line "Incomplete global
> > > > > > flushes, disabling PCID" had disappeared from the log.
> > > > >
> > > > > Thomas, thx for the report. FWIW, mainline developers like the
> > > > > x86 folks or Tony are free to focus on mainline and leave
> > > > > stable/longterm series to other people -- some nevertheless
> > > > > help out regularly or occasionally. So with a bit of luck this
> > > > > mail will make one of them care enough to provide a 6.1
> > > > > version of what you afaics called the "existing fix" in
> > > > > mainline (2eda374e883ad2 ("x86/mm: Switch to new Intel CPU
> > > > > model defines") [v6.10-rc1]) that seems to be missing in
> > > > > 6.1.y. But if not I suspect it might be up to you to prepare
> > > > > and submit a 6.1.y variant of that fix, as you seem to care
> > > > > and are able to test the patch.
> > > >
> > > > Needs to go to 6.6.y first, right?  But even then, it does not
> > > > apply to 6.1.y cleanly, so someone needs to send a backported
> > > > (and tested) series to us at stable@vger.kernel.org and we will
> > > > be glad to queue them up then.
> > > >
> > > > thanks,
> > > >
> > > > greg k-h
> > >
> > > There are three commits involved.
> > >
> > > commit A:
> > >    4db64279bc2b ("x86/cpu: Switch to new Intel CPU model defines")
> > >    This commit replaces
> > >       X86_MATCH_INTEL_FAM6_MODEL(ANY, 1),             /* SNC */
> > >    with
> > >       X86_MATCH_VFM(INTEL_ANY,         1),    /* SNC */
> > >    This is a functional change because the family info is replaced
> > >    with 0. And this exposes an x86_match_cpu() problem: it breaks
> > >    when the vendor/family/model/stepping/feature fields are all
> > >    zeros.
> > >
> > > commit B:
> > >    93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> > >    X86_VENDOR_INTEL")
> > >    It addresses the x86_match_cpu() problem by introducing a valid
> > >    flag and setting the flag in the Intel CPU model defines.
> > >    This fixes commit A, but it actually breaks the x86_cpu_id
> > >    structures that are constructed without using the Intel CPU
> > >    model defines, like arch/x86/mm/init.c.
> > >
> > > commit C:
> > >    2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> > >    arch/x86/mm/init.c: broken by commit B but fixed by using the
> > >    new Intel CPU model defines
> > >
> > > In 6.1.99,
> > >    commit A is missing
> > >    commit B is there
> > >    commit C is missing
> > >
> > > In 6.6.50,
> > >    commit A is missing
> > >    commit B is there
> > >    commit C is missing
> > >
> > > Now we can fix the problem in stable kernel, by converting
> > > arch/x86/mm/init.c to use the CPU model defines (even the old
> > > style ones). But before that, I'm wondering if we need to backport
> > > commit B in 6.1 and 6.6 stable kernel because only commit A can
> > > expose this problem.
> >
> > If so, can you submit the needed backports for us to apply?  That's
> > the easiest way for us to take them, thanks.
>
> I audited all the uses of x86_match_cpu(match). All callers that
> construct the `match` argument using the family of X86_MATCH_* macros
> from arch/x86/include/asm/cpu_device_id.h function correctly because
> commit B has been backported to v6.1.99 and to v6.6.50 -- 93022482b294
> ("x86/cpu: Fix x86_match_cpu() to match just X86_VENDOR_INTEL").
>
> Only those callers that use their own thing to compose the `match`
> argument are buggy:
>     * arch/x86/mm/init.c
>     * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)

Thanks for auditing this. I overlooked the intel_rapl driver case.

>
> Summarizing, v6.1.99 needs these two commits from mainline
>     * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
>       pl4_supported field")
>     * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
>
> v6.6.50 only needs the second commit.

Well, commit B 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match
just X86_VENDOR_INTEL") is backported to all stable kernels. And the
above two broken cases are also there.

So I suppose we need to backport all of them to 5.x stable kernel as
well.

thanks,
rui

>
> I will submit these backports.
>
> Thanks and BR,
> Ricardo
* Re: [STABLE REGRESSION] Possible missing backport of x86_match_cpu() change in v6.1.96
From: Ricardo Neri @ 2024-09-25 19:51 UTC
  To: Zhang, Rui
  Cc: gregkh@linuxfoundation.org, regressions@leemhuis.info,
      Neri, Ricardo, dave.hansen@linux.intel.com, bp@alien8.de,
      Gupta, Pawan Kumar, regressions@lists.linux.dev,
      linux-kernel@vger.kernel.org, Luck, Tony,
      thomas.lindroth@gmail.com, stable@vger.kernel.org

On Wed, Sep 25, 2024 at 05:20:41AM +0000, Zhang, Rui wrote:
> > >
> > > If so, can you submit the needed backports for us to apply?
> > > That's the easiest way for us to take them, thanks.
> >
> > I audited all the uses of x86_match_cpu(match). All callers that
> > construct the `match` argument using the family of X86_MATCH_*
> > macros from arch/x86/include/asm/cpu_device_id.h function correctly
> > because commit B has been backported to v6.1.99 and to v6.6.50 --
> > 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match just
> > X86_VENDOR_INTEL").
> >
> > Only those callers that use their own thing to compose the `match`
> > argument are buggy:
> >     * arch/x86/mm/init.c
> >     * drivers/powercap/intel_rapl_msr.c (only in 6.1.99)
>
> Thanks for auditing this. I overlooked the intel_rapl driver case.
>
> >
> > Summarizing, v6.1.99 needs these two commits from mainline
> >     * d05b5e0baf42 ("powercap: RAPL: fix invalid initialization for
> >       pl4_supported field")
> >     * 2eda374e883a ("x86/mm: Switch to new Intel CPU model defines")
> >
> > v6.6.50 only needs the second commit.
>
> Well, commit B 93022482b294 ("x86/cpu: Fix x86_match_cpu() to match
> just X86_VENDOR_INTEL") is backported to all stable kernels. And the
> above two broken cases are also there.
>
> So I suppose we need to backport all of them to 5.x stable kernel as
> well.

Indeed, this is the case. Commit B has been backported to v5.15.y and
v5.10.y, but not to v5.4.y or v4.19.y. I found one more broken case in
those two v5.x versions.

I will post the backports.