* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
[not found] <20240415141953.365222063@linuxfoundation.org>
@ 2024-04-16 10:34 ` Mark Brown
2024-04-16 11:04 ` Marc Zyngier
2024-04-16 13:07 ` Naresh Kamboju
0 siblings, 2 replies; 13+ messages in thread
From: Mark Brown @ 2024-04-16 10:34 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, Yihuang Yu,
Marc Zyngier, Gavin Shan, Catalin Marinas, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel
On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 6.6.28 release.
> There are 122 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
The bisect of the boot issue that's affecting the FVP in v6.6 (only)
landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
the -rc for v6.8 but that seems fine. I've done no investigation beyond
the bisect and looking at the commit log to pull out people to CC and
note that the fix was explicitly targeted at v6.6.
Bisect log:
# bad: [a4e5ff3532873150dc32d20f5c214ec59f98bcd2] Linux 6.6.28-rc1
# good: [5e828009c8b380739e13da92be847f10461c38b1] Linux 6.6.27
git bisect start 'a4e5ff3532873150dc32d20f5c214ec59f98bcd2' '5e828009c8b380739e13da92be847f10461c38b1'
# bad: [a4e5ff3532873150dc32d20f5c214ec59f98bcd2] Linux 6.6.28-rc1
git bisect bad a4e5ff3532873150dc32d20f5c214ec59f98bcd2
# bad: [f95afc8867d1f2e18e0c6abd16ca76c99a2839be] net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
git bisect bad f95afc8867d1f2e18e0c6abd16ca76c99a2839be
# bad: [06e82fe83cc671df58a956cd0cf8ba64c15a6d0d] scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
git bisect bad 06e82fe83cc671df58a956cd0cf8ba64c15a6d0d
# bad: [d2b5692676e7a204487546699cd5511baad5e9b6] ARM: OMAP2+: fix bogus MMC GPIO labels on Nokia N8x0
git bisect bad d2b5692676e7a204487546699cd5511baad5e9b6
# bad: [a438d050bf7ba5e3462dd61d90897569e7892c80] raid1: fix use-after-free for original bio in raid1_write_request()
git bisect bad a438d050bf7ba5e3462dd61d90897569e7892c80
# good: [6e869ee886dead911b2411c7cba816be52dffb19] ata: libata-scsi: Fix ata_scsi_dev_rescan() error path
git bisect good 6e869ee886dead911b2411c7cba816be52dffb19
# bad: [c9ad150ed8dd988d1cefc1a8e19df53d46990e76] arm64: tlb: Fix TLBI RANGE operand
git bisect bad c9ad150ed8dd988d1cefc1a8e19df53d46990e76
# good: [56a6896c1f107d519c0045dd6575648745bcba21] batman-adv: Avoid infinite loop trying to resize local TT
git bisect good 56a6896c1f107d519c0045dd6575648745bcba21
# first bad commit: [c9ad150ed8dd988d1cefc1a8e19df53d46990e76] arm64: tlb: Fix TLBI RANGE operand
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 10:34 ` [PATCH 6.6 000/122] 6.6.28-rc1 review Mark Brown
@ 2024-04-16 11:04 ` Marc Zyngier
2024-04-16 11:14 ` Mark Brown
2024-04-16 13:07 ` Naresh Kamboju
1 sibling, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2024-04-16 11:04 UTC (permalink / raw)
To: Mark Brown
Cc: Greg Kroah-Hartman, stable, patches, linux-kernel, torvalds, akpm,
linux, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, Yihuang Yu,
Gavin Shan, Catalin Marinas, Ryan Roberts, Anshuman Khandual,
Shaoqin Huang, Will Deacon, linux-arm-kernel
On Tue, 16 Apr 2024 11:34:14 +0100,
Mark Brown <broonie@kernel.org> wrote:
>
> On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 6.6.28 release.
> > There are 122 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
>
> The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> the -rc for v6.8 but that seems fine. I've done no investigation beyond
> the bisect and looking at the commit log to pull out people to CC and
> note that the fix was explicitly targeted at v6.6.
What are the configurations of the kernel and the FVP?
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 11:04 ` Marc Zyngier
@ 2024-04-16 11:14 ` Mark Brown
0 siblings, 0 replies; 13+ messages in thread
From: Mark Brown @ 2024-04-16 11:14 UTC (permalink / raw)
To: Marc Zyngier
Cc: Greg Kroah-Hartman, stable, patches, linux-kernel, torvalds, akpm,
linux, shuah, patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, Yihuang Yu,
Gavin Shan, Catalin Marinas, Ryan Roberts, Anshuman Khandual,
Shaoqin Huang, Will Deacon, linux-arm-kernel
On Tue, Apr 16, 2024 at 12:04:29PM +0100, Marc Zyngier wrote:
> Mark Brown <broonie@kernel.org> wrote:
> > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > the bisect and looking at the commit log to pull out people to CC and
> > note that the fix was explicitly targeted at v6.6.
> What are the configurations of the kernel and the FVP?
The kernel is a defconfig, the FVP arguments can be seen in the log from
the job here:
https://lava.sirena.org.uk/scheduler/job/148281#L233
(sorry, should've included that in the earlier mail.)
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 10:34 ` [PATCH 6.6 000/122] 6.6.28-rc1 review Mark Brown
2024-04-16 11:04 ` Marc Zyngier
@ 2024-04-16 13:07 ` Naresh Kamboju
2024-04-16 13:22 ` Marc Zyngier
1 sibling, 1 reply; 13+ messages in thread
From: Naresh Kamboju @ 2024-04-16 13:07 UTC (permalink / raw)
To: Greg Kroah-Hartman, Mark Brown
Cc: stable, patches, linux-kernel, torvalds, akpm, linux, shuah,
patches, lkft-triage, pavel, jonathanh, f.fainelli,
sudipm.mukherjee, srw, rwarsow, conor, allen.lkml, Yihuang Yu,
Marc Zyngier, Gavin Shan, Catalin Marinas, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
>
> On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 6.6.28 release.
> > There are 122 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
>
> The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> the -rc for v6.8 but that seems fine. I've done no investigation beyond
> the bisect and looking at the commit log to pull out people to CC and
> note that the fix was explicitly targeted at v6.6.
Anders investigated the reported issue, bisected it, and found that the
commit missing from stable-rc 6.6 is
e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
--
Linaro LKFT
https://lkft.linaro.org
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 13:07 ` Naresh Kamboju
@ 2024-04-16 13:22 ` Marc Zyngier
2024-04-16 17:28 ` Catalin Marinas
0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2024-04-16 13:22 UTC (permalink / raw)
To: Naresh Kamboju
Cc: Greg Kroah-Hartman, Mark Brown, stable, patches, linux-kernel,
torvalds, akpm, linux, shuah, patches, lkft-triage, pavel,
jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow, conor,
allen.lkml, Yihuang Yu, Gavin Shan, Catalin Marinas, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Tue, 16 Apr 2024 14:07:30 +0100,
Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
>
> On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> >
> > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > This is the start of the stable review cycle for the 6.6.28 release.
> > > There are 122 patches in this series, all will be posted as a response
> > > to this one. If anyone has any issues with these being applied, please
> > > let me know.
> >
> > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > the bisect and looking at the commit log to pull out people to CC and
> > note that the fix was explicitly targeted at v6.6.
>
> Anders investigated this reported issues and bisected and also found
> the missing commit for stable-rc 6.6 is
> e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
Which is definitely *not* a stable candidate. We need to understand why
the invalidation goes south when the scale goes up instead of down.
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 13:22 ` Marc Zyngier
@ 2024-04-16 17:28 ` Catalin Marinas
2024-04-17 7:05 ` Greg Kroah-Hartman
2024-04-18 11:07 ` Marc Zyngier
0 siblings, 2 replies; 13+ messages in thread
From: Catalin Marinas @ 2024-04-16 17:28 UTC (permalink / raw)
To: Marc Zyngier
Cc: Naresh Kamboju, Greg Kroah-Hartman, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> On Tue, 16 Apr 2024 14:07:30 +0100,
> Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > There are 122 patches in this series, all will be posted as a response
> > > > to this one. If anyone has any issues with these being applied, please
> > > > let me know.
> > >
> > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > the bisect and looking at the commit log to pull out people to CC and
> > > note that the fix was explicitly targeted at v6.6.
> >
> > Anders investigated this reported issues and bisected and also found
> > the missing commit for stable-rc 6.6 is
> > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
>
> Which is definitely *not* stable candidate. We need to understand why
> the invalidation goes south when the scale go up instead of down.
If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
which fixes 117940aa6e5f ("KVM: arm64: Define
kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
"scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
CBMC model, not on the actual kernel. It may be worth adding some
WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
num greater than 31.
I haven't investigated properly (and I'm off tomorrow, back on Thu) but
it's likely the original code was not very friendly to the maximum
range, never tested. Anyway, if one figures out why it goes out of
range, I think the solution is to also backport e2768b798a19 to stable.
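As a rough illustration of the bounds check suggested above, here is a
minimal Python sketch (not kernel code). The helper name tlbi_range_pages
is hypothetical; the formula mirrors the kernel's __TLBI_RANGE_PAGES(num,
scale) macro, and the ValueError merely stands in for a WARN_ON.

```python
def tlbi_range_pages(num: int, scale: int) -> int:
    """Pages covered by one range TLBI: (num + 1) * 2^(5 * scale + 1)."""
    # Hypothetical stand-in for the suggested WARN_ONs: in the range TLBI
    # operand, scale is a 2-bit field and num is a 5-bit field.
    if not 0 <= scale <= 3:
        raise ValueError(f"scale {scale} outside 0..3")
    if not 0 <= num <= 31:
        raise ValueError(f"num {num} outside 0..31")
    return (num + 1) << (5 * scale + 1)


if __name__ == "__main__":
    # Largest range a single operation can describe.
    print(tlbi_range_pages(31, 3))
    try:
        tlbi_range_pages(0, 4)
    except ValueError as err:
        print("rejected:", err)
```

With such a check in place, a scale of 4 would be reported at the point
where it is first used rather than being silently encoded.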
--
Catalin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 17:28 ` Catalin Marinas
@ 2024-04-17 7:05 ` Greg Kroah-Hartman
2024-04-17 20:06 ` Catalin Marinas
2024-04-18 11:07 ` Marc Zyngier
1 sibling, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2024-04-17 7:05 UTC (permalink / raw)
To: Catalin Marinas
Cc: Marc Zyngier, Naresh Kamboju, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Tue, Apr 16, 2024 at 06:28:10PM +0100, Catalin Marinas wrote:
> On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > On Tue, 16 Apr 2024 14:07:30 +0100,
> > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > There are 122 patches in this series, all will be posted as a response
> > > > > to this one. If anyone has any issues with these being applied, please
> > > > > let me know.
> > > >
> > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > the bisect and looking at the commit log to pull out people to CC and
> > > > note that the fix was explicitly targeted at v6.6.
> > >
> > > Anders investigated this reported issues and bisected and also found
> > > the missing commit for stable-rc 6.6 is
> > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> >
> > Which is definitely *not* stable candidate. We need to understand why
> > the invalidation goes south when the scale go up instead of down.
>
> If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> which fixes 117940aa6e5f ("KVM: arm64: Define
> kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> CBMC model, not on the actual kernel. It may be worth adding some
> WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> num greater than 31.
>
> I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> it's likely the original code was not very friendly to the maximum
> range, never tested. Anyway, if one figures out why it goes out of
> range, I think the solution is to also backport e2768b798a19 to stable.
How about I drop the offending commit from stable and let you all figure
out what needs to be added before applying anything else :)
thanks,
greg k-h
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-17 7:05 ` Greg Kroah-Hartman
@ 2024-04-17 20:06 ` Catalin Marinas
0 siblings, 0 replies; 13+ messages in thread
From: Catalin Marinas @ 2024-04-17 20:06 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Marc Zyngier, Naresh Kamboju, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Wed, Apr 17, 2024 at 09:05:12AM +0200, Greg Kroah-Hartman wrote:
> On Tue, Apr 16, 2024 at 06:28:10PM +0100, Catalin Marinas wrote:
> > On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > > On Tue, 16 Apr 2024 14:07:30 +0100,
> > > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > > There are 122 patches in this series, all will be posted as a response
> > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > let me know.
> > > > >
> > > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > > the bisect and looking at the commit log to pull out people to CC and
> > > > > note that the fix was explicitly targeted at v6.6.
> > > >
> > > > Anders investigated this reported issues and bisected and also found
> > > > the missing commit for stable-rc 6.6 is
> > > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> > >
> > > Which is definitely *not* stable candidate. We need to understand why
> > > the invalidation goes south when the scale go up instead of down.
> >
> > If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> > which fixes 117940aa6e5f ("KVM: arm64: Define
> > kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> > ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> > "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> > CBMC model, not on the actual kernel. It may be worth adding some
> > WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> > num greater than 31.
> >
> > I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> > it's likely the original code was not very friendly to the maximum
> > range, never tested. Anyway, if one figures out why it goes out of
> > range, I think the solution is to also backport e2768b798a19 to stable.
>
> How about I drop the offending commit from stable and let you all figure
> out what needs to be added before applying anything else :)
It makes sense ;). We'll send them to stable once sorted.
--
Catalin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-16 17:28 ` Catalin Marinas
2024-04-17 7:05 ` Greg Kroah-Hartman
@ 2024-04-18 11:07 ` Marc Zyngier
2024-04-18 11:21 ` Catalin Marinas
1 sibling, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2024-04-18 11:07 UTC (permalink / raw)
To: Catalin Marinas
Cc: Naresh Kamboju, Greg Kroah-Hartman, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Tue, 16 Apr 2024 18:28:10 +0100,
Catalin Marinas <catalin.marinas@arm.com> wrote:
>
> On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > On Tue, 16 Apr 2024 14:07:30 +0100,
> > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > There are 122 patches in this series, all will be posted as a response
> > > > > to this one. If anyone has any issues with these being applied, please
> > > > > let me know.
> > > >
> > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > the bisect and looking at the commit log to pull out people to CC and
> > > > note that the fix was explicitly targeted at v6.6.
> > >
> > > Anders investigated this reported issues and bisected and also found
> > > the missing commit for stable-rc 6.6 is
> > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> >
> > Which is definitely *not* stable candidate. We need to understand why
> > the invalidation goes south when the scale go up instead of down.
>
> If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> which fixes 117940aa6e5f ("KVM: arm64: Define
> kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> CBMC model, not on the actual kernel. It may be worth adding some
> WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> num greater than 31.
>
> I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> it's likely the original code was not very friendly to the maximum
> range, never tested. Anyway, if one figures out why it goes out of
> range, I think the solution is to also backport e2768b798a19 to stable.
I looked into this, and I came to the conclusion that this patch is
pretty much incompatible with the increasing scale (even if you cap
num to 30).
The number of pages to invalidate is a 20 bit quantity, a 5 bit slice
per scale. With the 6.6 approach (limit of num=30 and increasing
scale), we invalidate each 5 bit slice independently. After each
scale round, the corresponding slice is guaranteed to be 0.
With the 6.9 method, we invalidate the maximum possible for a given
scale. With a decreasing scale, we converge towards 0 or 1 on each
round. With an increasing scale, this breaks spectacularly, because
the strong guarantee that the remaining page count is "aligned" to
2^(5*scale+1) is not valid anymore (the low bits may not be 0).
As a result, we don't converge because we never consider these low
bits anymore, the page count doesn't decrease, scale goes past 3, and
everything catches fire.
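The argument above can be checked with a small model. The following
Python sketch is not the kernel code: it mimics the loop in
__flush_tlb_range_op() under the assumptions that the stride is one page
and that range TLBIs are available. The names num_masked, num_capped and
walk are made up for illustration. num_masked models the 6.6 computation
of num, num_capped models the computation after e3ba51ab24fd as read
here, and walk returns None as soon as a scale outside 0..3 is needed.

```python
def range_pages(num, scale):
    # Pages covered by one range operation: (num + 1) * 2^(5 * scale + 1).
    return (num + 1) << (5 * scale + 1)


def num_masked(pages, scale):
    # 6.6 behaviour: take only the 5-bit slice that belongs to this scale.
    return ((pages >> (5 * scale + 1)) & 0x1F) - 1


def num_capped(pages, scale):
    # After e3ba51ab24fd (as modelled here): take as much as this scale allows.
    return (min(pages, range_pages(31, scale)) >> (5 * scale + 1)) - 1


def walk(pages, num_fn, increasing):
    """Return the (scale, num, pages_covered) steps, or None if the loop
    needs a scale outside 0..3. Single-page steps are (None, None, 1)."""
    scale = 0 if increasing else 3
    steps = []
    while pages > 0:
        # Increasing scale peels off an odd page first; decreasing scale
        # only falls back to a single-page operation for the last page.
        single = (pages % 2 == 1) if increasing else (pages == 1)
        if single:
            steps.append((None, None, 1))
            pages -= 1
            continue
        if not 0 <= scale <= 3:
            return None
        num = num_fn(pages, scale)
        if num >= 0:
            covered = range_pages(num, scale)
            steps.append((scale, num, covered))
            pages -= covered
        scale += 1 if increasing else -1
    return steps


if __name__ == "__main__":
    pages = 66  # 64 + 2: bits set in two different 5-bit slices
    print("masked, increasing:", walk(pages, num_masked, True))
    print("capped, increasing:", walk(pages, num_capped, True))
    print("capped, decreasing:", walk(pages, num_capped, False))
```

For 66 pages, the masked computation with increasing scale finishes in
two range operations, and so does the capped computation with decreasing
scale. The capped computation with increasing scale takes 64 pages at
scale 0, is left with 2 pages that no higher scale can describe, and the
model returns None when scale reaches 4.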
So despite my earlier comment, it looks like picking e2768b798a19 is
the right thing to do *if* we're taking e3ba51ab24fd into 6.6-stable.
Otherwise, we need a separate fix, which Ryan was initially advocating
for.
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-18 11:07 ` Marc Zyngier
@ 2024-04-18 11:21 ` Catalin Marinas
2024-04-19 10:40 ` Greg Kroah-Hartman
0 siblings, 1 reply; 13+ messages in thread
From: Catalin Marinas @ 2024-04-18 11:21 UTC (permalink / raw)
To: Marc Zyngier
Cc: Naresh Kamboju, Greg Kroah-Hartman, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Thu, Apr 18, 2024 at 12:07:35PM +0100, Marc Zyngier wrote:
> On Tue, 16 Apr 2024 18:28:10 +0100,
> Catalin Marinas <catalin.marinas@arm.com> wrote:
> > On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > > On Tue, 16 Apr 2024 14:07:30 +0100,
> > > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > > There are 122 patches in this series, all will be posted as a response
> > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > let me know.
> > > > >
> > > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > > the bisect and looking at the commit log to pull out people to CC and
> > > > > note that the fix was explicitly targeted at v6.6.
> > > >
> > > > Anders investigated this reported issues and bisected and also found
> > > > the missing commit for stable-rc 6.6 is
> > > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> > >
> > > Which is definitely *not* stable candidate. We need to understand why
> > > the invalidation goes south when the scale go up instead of down.
> >
> > If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> > which fixes 117940aa6e5f ("KVM: arm64: Define
> > kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> > ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> > "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> > CBMC model, not on the actual kernel. It may be worth adding some
> > WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> > num greater than 31.
> >
> > I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> > it's likely the original code was not very friendly to the maximum
> > range, never tested. Anyway, if one figures out why it goes out of
> > range, I think the solution is to also backport e2768b798a19 to stable.
>
> I looked into this, and I came to the conclusion that this patch is
> pretty much incompatible with the increasing scale (even if you cap
> num to 30).
Thanks Marc for digging into this.
> So despite my earlier comment, it looks like picking e2768b798a19 is
> the right thing to do *if* we're taking e3ba51ab24fd into 6.6-stable.
>
> Otherwise, we need a separate fix, which Ryan initially advocating for
> initially.
My preference would be to cherry-pick the two upstream commits rather
than come up with an alternative fix for 6.6.
--
Catalin
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-18 11:21 ` Catalin Marinas
@ 2024-04-19 10:40 ` Greg Kroah-Hartman
2024-04-19 10:50 ` Marc Zyngier
0 siblings, 1 reply; 13+ messages in thread
From: Greg Kroah-Hartman @ 2024-04-19 10:40 UTC (permalink / raw)
To: Catalin Marinas
Cc: Marc Zyngier, Naresh Kamboju, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Thu, Apr 18, 2024 at 12:21:17PM +0100, Catalin Marinas wrote:
> On Thu, Apr 18, 2024 at 12:07:35PM +0100, Marc Zyngier wrote:
> > On Tue, 16 Apr 2024 18:28:10 +0100,
> > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > > > On Tue, 16 Apr 2024 14:07:30 +0100,
> > > > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > > > There are 122 patches in this series, all will be posted as a response
> > > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > > let me know.
> > > > > >
> > > > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > > > the bisect and looking at the commit log to pull out people to CC and
> > > > > > note that the fix was explicitly targeted at v6.6.
> > > > >
> > > > > Anders investigated this reported issues and bisected and also found
> > > > > the missing commit for stable-rc 6.6 is
> > > > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> > > >
> > > > Which is definitely *not* stable candidate. We need to understand why
> > > > the invalidation goes south when the scale go up instead of down.
> > >
> > > If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> > > which fixes 117940aa6e5f ("KVM: arm64: Define
> > > kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> > > ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> > > "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> > > CBMC model, not on the actual kernel. It may be worth adding some
> > > WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> > > num greater than 31.
> > >
> > > I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> > > it's likely the original code was not very friendly to the maximum
> > > range, never tested. Anyway, if one figures out why it goes out of
> > > range, I think the solution is to also backport e2768b798a19 to stable.
> >
> > I looked into this, and I came to the conclusion that this patch is
> > pretty much incompatible with the increasing scale (even if you cap
> > num to 30).
>
> Thanks Marc for digging into this.
>
> > So despite my earlier comment, it looks like picking e2768b798a19 is
> > the right thing to do *if* we're taking e3ba51ab24fd into 6.6-stable.
> >
> > Otherwise, we need a separate fix, which Ryan was initially advocating
> > for.
>
> My preference would be to cherry-pick the two upstream commits rather
> than coming up with an alternative fix for 6.6.
To be specific, which 2 commits, and what order?
thanks,
greg k-h
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-19 10:40 ` Greg Kroah-Hartman
@ 2024-04-19 10:50 ` Marc Zyngier
2024-04-19 11:05 ` Greg Kroah-Hartman
0 siblings, 1 reply; 13+ messages in thread
From: Marc Zyngier @ 2024-04-19 10:50 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Catalin Marinas, Naresh Kamboju, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Fri, 19 Apr 2024 11:40:33 +0100,
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Apr 18, 2024 at 12:21:17PM +0100, Catalin Marinas wrote:
> > On Thu, Apr 18, 2024 at 12:07:35PM +0100, Marc Zyngier wrote:
> > > On Tue, 16 Apr 2024 18:28:10 +0100,
> > > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > > > > On Tue, 16 Apr 2024 14:07:30 +0100,
> > > > > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > > > > There are 122 patches in this series, all will be posted as a response
> > > > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > > > let me know.
> > > > > > >
> > > > > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > > > > the bisect and looking at the commit log to pull out people to CC and
> > > > > > > note that the fix was explicitly targeted at v6.6.
> > > > > >
> > > > > > Anders investigated this reported issue, bisected it, and also found
> > > > > > that the missing commit for stable-rc 6.6 is
> > > > > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> > > > >
> > > > > Which is definitely *not* a stable candidate. We need to understand why
> > > > > the invalidation goes south when the scale goes up instead of down.
> > > >
> > > > If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> > > > which fixes 117940aa6e5f ("KVM: arm64: Define
> > > > kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> > > > ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> > > > "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> > > > CBMC model, not on the actual kernel. It may be worth adding some
> > > > WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> > > > num greater than 31.
> > > >
> > > > I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> > > > it's likely the original code was not very friendly to the maximum
> > > > range, never tested. Anyway, if one figures out why it goes out of
> > > > range, I think the solution is to also backport e2768b798a19 to stable.
> > >
> > > I looked into this, and I came to the conclusion that this patch is
> > > pretty much incompatible with the increasing scale (even if you cap
> > > num to 30).
> >
> > Thanks Marc for digging into this.
> >
> > > So despite my earlier comment, it looks like picking e2768b798a19 is
> > > the right thing to do *if* we're taking e3ba51ab24fd into 6.6-stable.
> > >
> > > Otherwise, we need a separate fix, which Ryan was initially advocating
> > > for.
> >
> > My preference would be to cherry-pick the two upstream commits rather
> > than coming up with an alternative fix for 6.6.
>
> To be specific, which 2 commits, and what order?
That'd be:
e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
followed by:
e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
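For anyone replaying this on a local linux-6.6.y checkout, the order above could be sketched roughly as follows. This is only an illustration: `pick_in_order` and `DRY_RUN` are made-up names, not part of any stable tooling, and `git cherry-pick -x` is simply one way to record the upstream hash. With the default `DRY_RUN=1` it only prints the commands.

```shell
# The two upstream commits named above, prerequisite first.
COMMITS="e2768b798a19 e3ba51ab24fd"

# Hypothetical helper: print the cherry-picks (DRY_RUN=1, the default)
# or actually run them, stopping at the first failure or conflict.
pick_in_order() {
    for c in $COMMITS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "git cherry-pick -x $c"
        else
            git cherry-pick -x "$c" || return 1
        fi
    done
}

pick_in_order
```

Run with `DRY_RUN=0` in a tree that already has both upstream objects, this would apply e2768b798a19 first, so that e3ba51ab24fd lands on top of the decrementing-scale loop it was found to depend on in this thread.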
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* Re: [PATCH 6.6 000/122] 6.6.28-rc1 review
2024-04-19 10:50 ` Marc Zyngier
@ 2024-04-19 11:05 ` Greg Kroah-Hartman
0 siblings, 0 replies; 13+ messages in thread
From: Greg Kroah-Hartman @ 2024-04-19 11:05 UTC (permalink / raw)
To: Marc Zyngier
Cc: Catalin Marinas, Naresh Kamboju, Mark Brown, stable, patches,
linux-kernel, torvalds, akpm, linux, shuah, patches, lkft-triage,
pavel, jonathanh, f.fainelli, sudipm.mukherjee, srw, rwarsow,
conor, allen.lkml, Yihuang Yu, Gavin Shan, Ryan Roberts,
Anshuman Khandual, Shaoqin Huang, Will Deacon, linux-arm-kernel,
Anders Roxell
On Fri, Apr 19, 2024 at 11:50:14AM +0100, Marc Zyngier wrote:
> On Fri, 19 Apr 2024 11:40:33 +0100,
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Apr 18, 2024 at 12:21:17PM +0100, Catalin Marinas wrote:
> > > On Thu, Apr 18, 2024 at 12:07:35PM +0100, Marc Zyngier wrote:
> > > > On Tue, 16 Apr 2024 18:28:10 +0100,
> > > > Catalin Marinas <catalin.marinas@arm.com> wrote:
> > > > > On Tue, Apr 16, 2024 at 02:22:07PM +0100, Marc Zyngier wrote:
> > > > > > On Tue, 16 Apr 2024 14:07:30 +0100,
> > > > > > Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
> > > > > > > On Tue, 16 Apr 2024 at 16:04, Mark Brown <broonie@kernel.org> wrote:
> > > > > > > > On Mon, Apr 15, 2024 at 04:19:25PM +0200, Greg Kroah-Hartman wrote:
> > > > > > > > > This is the start of the stable review cycle for the 6.6.28 release.
> > > > > > > > > There are 122 patches in this series, all will be posted as a response
> > > > > > > > > to this one. If anyone has any issues with these being applied, please
> > > > > > > > > let me know.
> > > > > > > >
> > > > > > > > The bisect of the boot issue that's affecting the FVP in v6.6 (only)
> > > > > > > > landed on c9ad150ed8dd988 (arm64: tlb: Fix TLBI RANGE operand),
> > > > > > > > e3ba51ab24fdd in mainline, as being the first bad commit - it's also in
> > > > > > > > the -rc for v6.8 but that seems fine. I've done no investigation beyond
> > > > > > > > the bisect and looking at the commit log to pull out people to CC and
> > > > > > > > note that the fix was explicitly targeted at v6.6.
> > > > > > >
> > > > > > > Anders investigated this reported issue, bisected it, and also found
> > > > > > > that the missing commit for stable-rc 6.6 is
> > > > > > > e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
> > > > > >
> > > > > > Which is definitely *not* a stable candidate. We need to understand why
> > > > > > the invalidation goes south when the scale goes up instead of down.
> > > > >
> > > > > If you backport e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
> > > > > which fixes 117940aa6e5f ("KVM: arm64: Define
> > > > > kvm_tlb_flush_vmid_range()") but without the newer e2768b798a19
> > > > > ("arm64/mm: Modify range-based tlbi to decrement scale"), it looks like
> > > > > "scale" in __flush_tlb_range_op() goes out of range to 4. Tested on my
> > > > > CBMC model, not on the actual kernel. It may be worth adding some
> > > > > WARN_ONs in __flush_tlb_range_op() if scale is outside the 0..3 range or
> > > > > num greater than 31.
> > > > >
> > > > > I haven't investigated properly (and I'm off tomorrow, back on Thu) but
> > > > > it's likely the original code was not very friendly to the maximum
> > > > > range, never tested. Anyway, if one figures out why it goes out of
> > > > > range, I think the solution is to also backport e2768b798a19 to stable.
> > > >
> > > > I looked into this, and I came to the conclusion that this patch is
> > > > pretty much incompatible with the increasing scale (even if you cap
> > > > num to 30).
> > >
> > > Thanks Marc for digging into this.
> > >
> > > > So despite my earlier comment, it looks like picking e2768b798a19 is
> > > > the right thing to do *if* we're taking e3ba51ab24fd into 6.6-stable.
> > > >
> > > > Otherwise, we need a separate fix, which Ryan was initially advocating
> > > > for.
> > >
> > > My preference would be to cherry-pick the two upstream commits rather
> > > than coming up with an alternative fix for 6.6.
> >
> > To be specific, which 2 commits, and what order?
>
> That'd be:
>
> e2768b798a19 ("arm64/mm: Modify range-based tlbi to decrement scale")
>
> followed by:
>
> e3ba51ab24fd ("arm64: tlb: Fix TLBI RANGE operand")
Thanks, now queued up.
greg k-h
end of thread
Thread overview: 13+ messages
[not found] <20240415141953.365222063@linuxfoundation.org>
2024-04-16 10:34 ` [PATCH 6.6 000/122] 6.6.28-rc1 review Mark Brown
2024-04-16 11:04 ` Marc Zyngier
2024-04-16 11:14 ` Mark Brown
2024-04-16 13:07 ` Naresh Kamboju
2024-04-16 13:22 ` Marc Zyngier
2024-04-16 17:28 ` Catalin Marinas
2024-04-17 7:05 ` Greg Kroah-Hartman
2024-04-17 20:06 ` Catalin Marinas
2024-04-18 11:07 ` Marc Zyngier
2024-04-18 11:21 ` Catalin Marinas
2024-04-19 10:40 ` Greg Kroah-Hartman
2024-04-19 10:50 ` Marc Zyngier
2024-04-19 11:05 ` Greg Kroah-Hartman