* Re: [git pull] d_revalidate pile
[not found] ` <Z5fAOpnFoXMgpCWb@lappy>
@ 2025-01-27 19:12 ` Linus Torvalds
2025-01-27 20:38 ` Mark Brown
0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2025-01-27 19:12 UTC (permalink / raw)
To: Sasha Levin, kernelci; +Cc: Al Viro, linux-fsdevel, linux-kernel
On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:
>
> With this pulled on top of Linus's tree, LKFT is managing to trigger
> kfence warnings:
>
> <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
> <3>[ 62.180289]
> <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
> <4>[ 62.184178] d_same_name+0x4c/0xd0
Bah. I've said this before, but I really wish LKFT would use debug
builds and run the warnings through scripts/decode_stacktrace.sh.
Getting filenames and line numbers (and inlining information!) for
stack traces can be really really useful.
I think you are using KernelCI builds (at least that was the case last
time), and apparently they are non-debug builds. And that's possibly
due to just resource issues (the debug info does take a lot more disk
space and makes link times much longer too). So it might not be easily
fixable.
But let's see if it might be an option to get this capability. So I'm
adding the kernelci list to see if somebody goes "Oh, that was just an
oversight" and might easily be made to happen. Fingers crossed.
Linus
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [git pull] d_revalidate pile
2025-01-27 19:12 ` [git pull] d_revalidate pile Linus Torvalds
@ 2025-01-27 20:38 ` Mark Brown
2025-01-27 22:32 ` Sasha Levin
2025-01-28 9:19 ` Guillaume Tucker
0 siblings, 2 replies; 8+ messages in thread
From: Mark Brown @ 2025-01-27 20:38 UTC (permalink / raw)
To: Linus Torvalds
Cc: Sasha Levin, kernelci, Al Viro, linux-fsdevel, linux-kernel
On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote:
> On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:
> > With this pulled on top of Linus's tree, LKFT is managing to trigger
> > kfence warnings:
> > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
> > <3>[ 62.180289]
> > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
> > <4>[ 62.184178] d_same_name+0x4c/0xd0
> Bah. I've said this before, but I really wish LKFT would use debug
> builds and run the warnings through scripts/decode_stacktrace.sh.
> Getting filenames and line numbers (and inlining information!) for
> stack traces can be really really useful.
> I think you are using KernelCI builds (at least that was the case last
> time), and apparently they are non-debug builds. And that's possibly
> due to just resource issues (the debug info does take a lot more disk
> space and makes link times much longer too). So it might not be easily
> fixable.
They're not. They're using their own builds, done with their tuxsuite
service (a cloud front end for their tuxmake tool), which does have the
ability to save the vmlinux. Poking around the LKFT output it does look
like they're doing that for the LKFT builds:
https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/
https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/
so hopefully the information is all there and it's just a question of
people doing the decode when reporting issues from LKFT.
> But let's see if it might be an option to get this capability. So I'm
> adding the kernelci list to see if somebody goes "Oh, that was just an
> oversight" and might easily be made to happen. Fingers crossed.
The issue with KernelCI has been that it's not storing the vmlinux; this
was indeed done due to space issues like you suggest. With the new
infrastructure that's been rolled out as part of the KernelCI 2.0 revamp
the storage should be a lot more scalable, so this should hopefully now
be a question of cost rather than a hard space limit like it used to be,
and therefore more tractable. AFAICT we haven't actually revisited making
the required changes to include the vmlinux in the stored output though;
I filed a ticket:
https://github.com/kernelci/kernelci-project/issues/509
The builds themselves are generally using standard defconfigs and
derivatives of that so will normally have enough debug info for
decode_stacktrace.sh. Where they don't we should probably just change
that upstream.
* Re: [git pull] d_revalidate pile
2025-01-27 20:38 ` Mark Brown
@ 2025-01-27 22:32 ` Sasha Levin
2025-01-28 12:14 ` Mark Brown
2025-01-28 12:33 ` Dan Carpenter
2025-01-28 9:19 ` Guillaume Tucker
1 sibling, 2 replies; 8+ messages in thread
From: Sasha Levin @ 2025-01-27 22:32 UTC (permalink / raw)
To: Mark Brown
Cc: Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel,
lkft
[ Adding in the LKFT folks ]
On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote:
>On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote:
>> On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:
>
>> > With this pulled on top of Linus's tree, LKFT is managing to trigger
>> > kfence warnings:
>
>> > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
>> > <3>[ 62.180289]
>> > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
>> > <4>[ 62.184178] d_same_name+0x4c/0xd0
>
>> Bah. I've said this before, but I really wish LKFT would use debug
>> builds and run the warnings through scripts/decode_stacktrace.sh.
>
>> Getting filenames and line numbers (and inlining information!) for
>> stack traces can be really really useful.
>
>> I think you are using KernelCI builds (at least that was the case last
>> time), and apparently they are non-debug builds. And that's possibly
>> due to just resource issues (the debug info does take a lot more disk
>> space and makes link times much longer too). So it might not be easily
>> fixable.
>
>They're not, they're using their own builds done with their tuxsuite
>service which is a cloud front end for their tuxmake tool, that does
>have the ability to save the vmlinux. Poking around the LKFT output it
>does look like they're doing that for the LKFT builds:
>
> https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/
> https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/
>
>so hopefully the information is all there and it's just a question of
>people doing the decode when reporting issues from LKFT.
My understanding was that because CONFIG_DEBUG_INFO_NONE=y is set, we
actually don't have enough info to resolve line numbers.
I've tried running decode_stacktrace.sh on the vmlinux image linked
above, and indeed we can't get line numbers.
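As a rough illustration of the check described above (not something
posted in the thread), a build's .config can be scanned for the debug
info options before attempting a decode. This is a minimal sketch: the
function name debug_info_status and its return labels are hypothetical,
and it only looks at CONFIG_DEBUG_INFO_NONE, CONFIG_DEBUG_INFO and
CONFIG_DEBUG_INFO_REDUCED.

```python
import re

# Matches enabled options such as "CONFIG_DEBUG_INFO=y".
# Lines like "# CONFIG_DEBUG_INFO is not set" do not match.
ENABLED_RE = re.compile(r"^(CONFIG_[A-Za-z0-9_]+)=(?:y|m)\s*$")


def debug_info_status(config_text: str) -> str:
    """Classify a kernel .config by its debug info settings.

    Hypothetical helper: returns "none", "reduced", "enabled" or
    "unknown" (when no relevant option is switched on).
    """
    enabled = set()
    for line in config_text.splitlines():
        match = ENABLED_RE.match(line.strip())
        if match:
            enabled.add(match.group(1))

    if "CONFIG_DEBUG_INFO_NONE" in enabled:
        return "none"
    if "CONFIG_DEBUG_INFO" in enabled:
        if "CONFIG_DEBUG_INFO_REDUCED" in enabled:
            return "reduced"
        return "enabled"
    return "unknown"


if __name__ == "__main__":
    sample = "CONFIG_DEBUG_INFO_NONE=y\n# CONFIG_DEBUG_INFO is not set\n"
    print(debug_info_status(sample))
```

A result of "none" or "unknown" suggests that decode_stacktrace.sh
will not be able to resolve line numbers from that build's vmlinux.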
--
Thanks,
Sasha
* Re: [git pull] d_revalidate pile
2025-01-27 20:38 ` Mark Brown
2025-01-27 22:32 ` Sasha Levin
@ 2025-01-28 9:19 ` Guillaume Tucker
1 sibling, 0 replies; 8+ messages in thread
From: Guillaume Tucker @ 2025-01-28 9:19 UTC (permalink / raw)
To: Mark Brown, Linus Torvalds
Cc: Sasha Levin, kernelci, Al Viro, linux-fsdevel, linux-kernel
On 27/01/2025 9:38 pm, Mark Brown wrote:
>> But let's see if it might be an option to get this capability. So I'm
>> adding the kernelci list to see if somebody goes "Oh, that was just an
>> oversight" and might easily be made to happen. Fingers crossed.
> The issue with KernelCI has been that it's not storing the vmlinux, this
> was indeed done due to space issues like you suggest. With the new
> infrastructure that's been rolled out as part of the KernelCI 2.0 revamp
> the storage should be a lot more scaleable and so this should hopefully
> be a cost issue rather than actual space limits like it used to be so
> more tractable. AFAICT we haven't actually revisited making the
> required changes to include the vmlinux in the stored output though, I
> filed a ticket:
>
> https://github.com/kernelci/kernelci-project/issues/509
>
> The builds themselves are generally using standard defconfigs and
> derivatives of that so will normally have enough debug info for
> decode_stacktrace.sh. Where they don't we should probably just change
> that upstream.
One approach that was suggested a while ago was to do extra debug
builds in automated post-processing jobs whenever a failure is
detected. This came as an evolution of the automated bisection
which had checks for the good and bad revisions: if a stacktrace
was found while testing the "bad" kernel then it could easily be
decoded since bisections do incremental builds and keep the
vmlinux at hand.
As Sasha mentioned in his email, some particular configs are
required in order to decode the stacktrace (IIRC this is enabled
with arm64_defconfig but not x86). Debug builds also make larger
binaries and affect runtime behaviour, as we all know. So one
post-processing check would be to do a special debug build with
the right configs for decoding stacktraces as well as maybe some
sanitizers and extra useful things to add more information.
Builds from bisections or any extra jobs should still be uploaded
to public storage so they would be available for manual
investigation too. That way, the impact on storage costs and
compute resources would be minimal without any real drawback - it
might take 30min to get the post-processing job to complete but
even that could be optimized and it seems a lot more efficient
than doing debug builds and uploading large vmlinux images all
the time.
Hope this helps!
Cheers,
Guillaume
* Re: [git pull] d_revalidate pile
2025-01-27 22:32 ` Sasha Levin
@ 2025-01-28 12:14 ` Mark Brown
2025-01-28 12:43 ` Dan Carpenter
2025-01-28 12:33 ` Dan Carpenter
1 sibling, 1 reply; 8+ messages in thread
From: Mark Brown @ 2025-01-28 12:14 UTC (permalink / raw)
To: Sasha Levin
Cc: Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel,
lkft
On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
> [ Adding in the LKFT folks ]
Oops, sorry - didn't realise they weren't already on the report since it
was on LKFT or I'd have done the same.
> On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote:
> > have the ability to save the vmlinux. Poking around the LKFT output it
> > does look like they're doing that for the LKFT builds:
> My understanding was that because CONFIG_DEBUG_INFO_NONE=y is set, we
> actually don't have enough info to resolve line numbers.
The arm64 and arm defconfigs which are the main ones I'd end up looking
at both set CONFIG_DEBUG_INFO (_REDUCED in the case of arm64), the trace
you posted was from arm64 so unless it was some config that overrode
things there ought to be info. x86_64 which I guess you might use more
indeed doesn't have it.
> I've tried running decode_stacktrace.sh on the vmlinux image linked
> above, and indeed we can't get line numbers.
That was a random build I pulled out which turns out to be a tinyconfig
rather than the specific build that was used - if we look at an arm64
defconfig (your trace looked to be from arm64):
https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028517/suite/build/test/clang-nightly-defconfig-40bc7ee5/details/
https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1oYDQrsEOOs4L6yoysbu9aS/
I'm able to decode (just feeding a random log message in, no idea what
specific build generated the log message so the line number is almost
certainly wrong):
$ echo '[ 62.184178] d_same_name+0x4c/0xd0' | ./scripts/decode_stacktrace.sh /tmp/vmlinux
[ 62.184178] d_same_name (fs/dcache.c:2127)
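For anyone scripting this step, the raw frames that get fed to
decode_stacktrace.sh have the form symbol+offset/size. The sketch below
is only an illustration of pulling those fields out of a log line before
handing the line to the script; the names FRAME_RE and parse_frame are
hypothetical and the regular expression is an assumption based on the
frames quoted in this thread, not the pattern the kernel script uses.

```python
import re

# symbol+0xOFFSET/0xSIZE, e.g. "d_same_name+0x4c/0xd0".
# Symbols may contain dots (e.g. "el0_svc_common.constprop.0").
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)"
    r"\+(?P<offset>0x[0-9a-fA-F]+)"
    r"/(?P<size>0x[0-9a-fA-F]+)"
)


def parse_frame(line: str):
    """Return (symbol, offset, size) for the first frame in a log line.

    Returns None when the line holds no symbol+offset/size token.
    """
    match = FRAME_RE.search(line)
    if match is None:
        return None
    return (
        match.group("symbol"),
        int(match.group("offset"), 16),
        int(match.group("size"), 16),
    )


if __name__ == "__main__":
    print(parse_frame("<4>[ 62.184178] d_same_name+0x4c/0xd0"))
```

Lines that return None (banner lines, blank KFENCE separators) carry no
frame to resolve.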
* Re: [git pull] d_revalidate pile
2025-01-27 22:32 ` Sasha Levin
2025-01-28 12:14 ` Mark Brown
@ 2025-01-28 12:33 ` Dan Carpenter
2025-01-28 19:24 ` Sasha Levin
1 sibling, 1 reply; 8+ messages in thread
From: Dan Carpenter @ 2025-01-28 12:33 UTC (permalink / raw)
To: Sasha Levin
Cc: Mark Brown, Linus Torvalds, kernelci, Al Viro, linux-fsdevel,
linux-kernel, lkft
On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
> [ Adding in the LKFT folks ]
Ugh... The website is pretty difficult to navigate. I've filed a
ticket to hopefully avoid this going forward. It's a bit late for
the line numbers to be any use but here they are:
Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
<3>[ 62.179009] ==================================================================
<3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<3>[ 62.180289]
<3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
<4>[ 62.184178] d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<4>[ 62.184717] d_lookup (fs/dcache.c:2292)
<4>[ 62.185378] lookup_dcache (fs/namei.c:1654)
<4>[ 62.185980] lookup_one_qstr_excl (fs/namei.c:1678)
<4>[ 62.186523] do_renameat2 (fs/namei.c:5167)
<4>[ 62.186948] __arm64_sys_renameat (fs/namei.c:5264)
<4>[ 62.187484] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.188220] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.189031] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.189635] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.190018] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.190537] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.190946]
<4>[ 62.191399] kfence-#174: 0x0000000012d508d5-0x0000000023355f7e, size=64, cache=kmalloc-rcl-64
<4>[ 62.191399]
<4>[ 62.192260] allocated by task 1 on cpu 0 at 62.177313s (0.014839s ago):
<4>[ 62.193504] __d_alloc (fs/dcache.c:1678)
<4>[ 62.193925] d_alloc (fs/dcache.c:1737)
<4>[ 62.194204] lookup_one_qstr_excl (fs/namei.c:1689)
<4>[ 62.194741] filename_create (fs/namei.c:4083)
<4>[ 62.195129] do_symlinkat (fs/namei.c:4690)
<4>[ 62.195657] __arm64_sys_symlinkat (fs/namei.c:4710)
<4>[ 62.195954] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.196461] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.197053] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.197411] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.197849] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.198422] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.198857]
<3>[ 62.199577] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0 #1
<3>[ 62.200435] Hardware name: linux,dummy-virt (DT)
<3>[ 62.201130] ==================================================================
regards,
dan carpenter
* Re: [git pull] d_revalidate pile
2025-01-28 12:14 ` Mark Brown
@ 2025-01-28 12:43 ` Dan Carpenter
0 siblings, 0 replies; 8+ messages in thread
From: Dan Carpenter @ 2025-01-28 12:43 UTC (permalink / raw)
To: Mark Brown
Cc: Sasha Levin, Linus Torvalds, kernelci, Al Viro, linux-fsdevel,
linux-kernel, lkft
On Tue, Jan 28, 2025 at 12:14:07PM +0000, 'Mark Brown' via lkft wrote:
> I'm able to decode (just feeding a random log message in, no idea what
> specific build generated the log message so the line number is almost
> certainly wrong):
>
All the kernels that we're planning to boot have DEBUG_INFO enabled.
Here is the config and the vmlinux.xz for this one.
https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1u8fB268uU3L32J8FqAxYYR/
I don't want to explain how to find this URL because I've filed a ticket
to make it prominent so the instructions will change soon. And ideally
we would just have the line numbers on the webpage dmesg itself.
regards,
dan carpenter
* Re: [git pull] d_revalidate pile
2025-01-28 12:33 ` Dan Carpenter
@ 2025-01-28 19:24 ` Sasha Levin
0 siblings, 0 replies; 8+ messages in thread
From: Sasha Levin @ 2025-01-28 19:24 UTC (permalink / raw)
To: Dan Carpenter
Cc: Mark Brown, Linus Torvalds, kernelci, Al Viro, linux-fsdevel,
linux-kernel, lkft
On Tue, Jan 28, 2025 at 03:33:14PM +0300, Dan Carpenter wrote:
>On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
>> [ Adding in the LKFT folks ]
>
>Ugh... The website is pretty difficult to navigate. I've filed a
>ticket to hopefully avoid this going forward. It's a bit late for
>the line numbers to be any use but here they are:
Thanks Dan & Mark! I think I've figured it out (and scripted it) for
next time :)
--
Thanks,
Sasha
end of thread, other threads:[~2025-01-28 19:24 UTC | newest]
Thread overview: 8+ messages
[not found] <20250127044721.GD1977892@ZenIV>
[not found] ` <Z5fAOpnFoXMgpCWb@lappy>
2025-01-27 19:12 ` [git pull] d_revalidate pile Linus Torvalds
2025-01-27 20:38 ` Mark Brown
2025-01-27 22:32 ` Sasha Levin
2025-01-28 12:14 ` Mark Brown
2025-01-28 12:43 ` Dan Carpenter
2025-01-28 12:33 ` Dan Carpenter
2025-01-28 19:24 ` Sasha Levin
2025-01-28 9:19 ` Guillaume Tucker