* Re: [git pull] d_revalidate pile
From: Linus Torvalds @ 2025-01-27 19:12 UTC
To: Sasha Levin, kernelci
Cc: Al Viro, linux-fsdevel, linux-kernel

On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:
>
> With this pulled on top of Linus's tree, LKFT is managing to trigger
> kfence warnings:
>
> <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
> <3>[ 62.180289]
> <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
> <4>[ 62.184178] d_same_name+0x4c/0xd0

Bah. I've said this before, but I really wish LKFT would use debug
builds and run the warnings through scripts/decode_stacktrace.sh.

Getting filenames and line numbers (and inlining information!) for
stack traces can be really really useful.

I think you are using KernelCI builds (at least that was the case last
time), and apparently they are non-debug builds. And that's possibly
due to just resource issues (the debug info does take a lot more disk
space and makes link times much longer too). So it might not be easily
fixable.

But let's see if it might be an option to get this capability. So I'm
adding the kernelci list to see if somebody goes "Oh, that was just an
oversight" and might easily be made to happen. Fingers crossed.

              Linus

* Re: [git pull] d_revalidate pile
From: Mark Brown @ 2025-01-27 20:38 UTC
To: Linus Torvalds
Cc: Sasha Levin, kernelci, Al Viro, linux-fsdevel, linux-kernel

On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote:
> On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:

> > With this pulled on top of Linus's tree, LKFT is managing to trigger
> > kfence warnings:

> > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
> > <3>[ 62.180289]
> > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
> > <4>[ 62.184178] d_same_name+0x4c/0xd0

> Bah. I've said this before, but I really wish LKFT would use debug
> builds and run the warnings through scripts/decode_stacktrace.sh.

> Getting filenames and line numbers (and inlining information!) for
> stack traces can be really really useful.

> I think you are using KernelCI builds (at least that was the case last
> time), and apparently they are non-debug builds. And that's possibly
> due to just resource issues (the debug info does take a lot more disk
> space and makes link times much longer too). So it might not be easily
> fixable.

They're not, they're using their own builds done with their tuxsuite
service which is a cloud front end for their tuxmake tool, that does
have the ability to save the vmlinux. Poking around the LKFT output it
does look like they're doing that for the LKFT builds:

   https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/
   https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/

so hopefully the information is all there and it's just a question of
people doing the decode when reporting issues from LKFT.

> But let's see if it might be an option to get this capability. So I'm
> adding the kernelci list to see if somebody goes "Oh, that was just an
> oversight" and might easily be made to happen. Fingers crossed.

The issue with KernelCI has been that it's not storing the vmlinux, this
was indeed done due to space issues like you suggest. With the new
infrastructure that's been rolled out as part of the KernelCI 2.0 revamp
the storage should be a lot more scalable and so this should hopefully
be a cost issue rather than actual space limits like it used to be so
more tractable. AFAICT we haven't actually revisited making the
required changes to include the vmlinux in the stored output though, I
filed a ticket:

   https://github.com/kernelci/kernelci-project/issues/509

The builds themselves are generally using standard defconfigs and
derivatives of that so will normally have enough debug info for
decode_stacktrace.sh. Where they don't we should probably just change
that upstream.

* Re: [git pull] d_revalidate pile
From: Sasha Levin @ 2025-01-27 22:32 UTC
To: Mark Brown
Cc: Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel, lkft

[ Adding in the LKFT folks ]

On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote:
>On Mon, Jan 27, 2025 at 11:12:11AM -0800, Linus Torvalds wrote:
>> On Mon, 27 Jan 2025 at 09:19, Sasha Levin <sashal@kernel.org> wrote:
>
>> > With this pulled on top of Linus's tree, LKFT is managing to trigger
>> > kfence warnings:
>
>> > <3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name+0x4c/0xd0
>> > <3>[ 62.180289]
>> > <3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
>> > <4>[ 62.184178] d_same_name+0x4c/0xd0
>
>> Bah. I've said this before, but I really wish LKFT would use debug
>> builds and run the warnings through scripts/decode_stacktrace.sh.
>
>> Getting filenames and line numbers (and inlining information!) for
>> stack traces can be really really useful.
>
>> I think you are using KernelCI builds (at least that was the case last
>> time), and apparently they are non-debug builds. And that's possibly
>> due to just resource issues (the debug info does take a lot more disk
>> space and makes link times much longer too). So it might not be easily
>> fixable.
>
>They're not, they're using their own builds done with their tuxsuite
>service which is a cloud front end for their tuxmake tool, that does
>have the ability to save the vmlinux. Poking around the LKFT output it
>does look like they're doing that for the LKFT builds:
>
>   https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27027254/suite/build/test/gcc-13-tinyconfig/details/
>   https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1jDhjHPNl1XNezFhsjSlvpI/
>
>so hopefully the information is all there and it's just a question of
>people doing the decode when reporting issues from LKFT.

My understanding was that because CONFIG_DEBUG_INFO_NONE=y is set, we
actually don't have enough info to resolve line numbers.

I've tried running decode_stacktrace.sh on the vmlinux image linked
above, and indeed we can't get line numbers.

-- 
Thanks,
Sasha
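
Whether a vmlinux can yield line numbers depends on the build configuration, so checking the build's .config first can save a failed decode attempt. The following is a minimal Python sketch under stated assumptions: the .config text has already been fetched, CONFIG_DEBUG_INFO=y is taken as the signal that DWARF info is present, and the helper name `debug_info_enabled` is made up for illustration (it is not part of LKFT, tuxsuite, or the kernel tree).

```python
def debug_info_enabled(config_text: str) -> bool:
    """Return True if the given .config text enables debug info.

    Assumption: CONFIG_DEBUG_INFO=y means line numbers can be resolved,
    and CONFIG_DEBUG_INFO_NONE=y means they cannot.
    """
    options = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        options[name] = value
    if options.get("CONFIG_DEBUG_INFO_NONE") == "y":
        return False
    return options.get("CONFIG_DEBUG_INFO") == "y"
```

A tinyconfig build like the one linked above would be expected to return False here, which matches the failed decode attempt described in the message.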

* Re: [git pull] d_revalidate pile
From: Mark Brown @ 2025-01-28 12:14 UTC
To: Sasha Levin
Cc: Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel, lkft

On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
> [ Adding in the LKFT folks ]

Oops, sorry - didn't realise they weren't already on the report since it
was on LKFT or I'd have done the same.

> On Mon, Jan 27, 2025 at 08:38:50PM +0000, Mark Brown wrote:

> > have the ability to save the vmlinux. Poking around the LKFT output it
> > does look like they're doing that for the LKFT builds:

> My understanding was that because CONFIG_DEBUG_INFO_NONE=y is set, we
> actually don't have enough info to resolve line numbers.

The arm64 and arm defconfigs which are the main ones I'd end up looking
at both set CONFIG_DEBUG_INFO (_REDUCED in the case of arm64), the trace
you posted was from arm64 so unless it was some config that overrode
things there ought to be info. x86_64 which I guess you might use more
indeed doesn't have it.

> I've tried running decode_stacktrace.sh on the vmlinux image linked
> above, and indeed we can't get line numbers.

That was a random build I pulled out which turns out to be a tinyconfig
rather than the specific build that was used - if we look at an arm64
defconfig (your trace looked to be from arm64):

   https://qa-reports.linaro.org/lkft/sashal-linus-next/build/v6.13-rc7-8584-gd4639f3659ae/testrun/27028517/suite/build/test/clang-nightly-defconfig-40bc7ee5/details/
   https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1oYDQrsEOOs4L6yoysbu9aS/

I'm able to decode (just feeding a random log message in, no idea what
specific build generated the log message so the line number is almost
certainly wrong):

$ echo [ 62.184178] d_same_name+0x4c/0xd0 | ./scripts/decode_stacktrace.sh /tmp/vmlinux
[ 62.184178] d_same_name (fs/dcache.c:2127)
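
The raw frames fed to decode_stacktrace.sh have the form symbol+0xoffset/0xsize. Purely as an illustration of that format (this is not how the script itself is implemented), here is a small Python sketch that splits such a frame into its parts; `parse_raw_frame` is a hypothetical name chosen for this example.

```python
import re

# symbol+0xOFFSET/0xSIZE, e.g. "d_same_name+0x4c/0xd0". Symbols may
# contain dots, as in "el0_svc_common.constprop.0".
_FRAME_RE = re.compile(r"([A-Za-z_][\w.]*)\+0x([0-9a-fA-F]+)/0x([0-9a-fA-F]+)")


def parse_raw_frame(line: str):
    """Return (symbol, offset, size) for the first raw frame in line, or None."""
    match = _FRAME_RE.search(line)
    if match is None:
        return None
    symbol, offset, size = match.groups()
    # Offset and size are hexadecimal in kernel stack traces.
    return symbol, int(offset, 16), int(size, 16)
```

The offset is the distance into the function and the size is the function's total length, which is why the same frame text can map to different source lines on different builds.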

* Re: [git pull] d_revalidate pile
From: Dan Carpenter @ 2025-01-28 12:43 UTC
To: Mark Brown
Cc: Sasha Levin, Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel, lkft

On Tue, Jan 28, 2025 at 12:14:07PM +0000, 'Mark Brown' via lkft wrote:
> I'm able to decode (just feeding a random log message in, no idea what
> specific build generated the log message so the line number is almost
> certainly wrong):
>

All the kernels that we're planning to boot have the DEBUG_INFO enabled.
Here is the config and the vmlinux.xz for this one.

https://storage.tuxsuite.com/public/linaro/lkft/builds/2sDW1u8fB268uU3L32J8FqAxYYR/

I don't want to explain how to find this URL because I've filed a ticket
to make it prominent so the instructions will change soon. And ideally
we would just have the line numbers on the webpage dmesg itself.

regards,
dan carpenter

* Re: [git pull] d_revalidate pile
From: Dan Carpenter @ 2025-01-28 12:33 UTC
To: Sasha Levin
Cc: Mark Brown, Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel, lkft

On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
> [ Adding in the LKFT folks ]

Ugh... The website is pretty difficult to navigate. I've filed a
ticket to hopefully avoid this going forward. It's a bit late for
the line numbers to be any use but here they are:

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
<3>[ 62.179009] ==================================================================
<3>[ 62.180289] BUG: KFENCE: out-of-bounds read in d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<3>[ 62.180289]
<3>[ 62.182647] Out-of-bounds read at 0x00000000eedd4b55 (64B right of kfence-#174):
<4>[ 62.184178] d_same_name (include/asm-generic/rwonce.h:86 fs/dcache.c:243 fs/dcache.c:295 fs/dcache.c:2129)
<4>[ 62.184717] d_lookup (fs/dcache.c:2292)
<4>[ 62.185378] lookup_dcache (fs/namei.c:1654)
<4>[ 62.185980] lookup_one_qstr_excl (fs/namei.c:1678)
<4>[ 62.186523] do_renameat2 (fs/namei.c:5167)
<4>[ 62.186948] __arm64_sys_renameat (fs/namei.c:5264)
<4>[ 62.187484] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.188220] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.189031] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.189635] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.190018] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.190537] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.190946]
<4>[ 62.191399] kfence-#174: 0x0000000012d508d5-0x0000000023355f7e, size=64, cache=kmalloc-rcl-64
<4>[ 62.191399]
<4>[ 62.192260] allocated by task 1 on cpu 0 at 62.177313s (0.014839s ago):
<4>[ 62.193504] __d_alloc (fs/dcache.c:1678)
<4>[ 62.193925] d_alloc (fs/dcache.c:1737)
<4>[ 62.194204] lookup_one_qstr_excl (fs/namei.c:1689)
<4>[ 62.194741] filename_create (fs/namei.c:4083)
<4>[ 62.195129] do_symlinkat (fs/namei.c:4690)
<4>[ 62.195657] __arm64_sys_symlinkat (fs/namei.c:4710)
<4>[ 62.195954] invoke_syscall (arch/arm64/include/asm/current.h:19 arch/arm64/kernel/syscall.c:54)
<4>[ 62.196461] el0_svc_common.constprop.0 (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2))
<4>[ 62.197053] do_el0_svc_compat (arch/arm64/kernel/syscall.c:159)
<4>[ 62.197411] el0_svc_compat (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator 1) arch/arm64/include/asm/irqflags.h:136 (discriminator 1) arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:888 (discriminator 1))
<4>[ 62.197849] el0t_32_sync_handler (arch/arm64/kernel/entry-common.c:933)
<4>[ 62.198422] el0t_32_sync (arch/arm64/kernel/entry.S:605)
<3>[ 62.198857]
<3>[ 62.199577] CPU: 0 UID: 0 PID: 1 Comm: systemd Not tainted 6.13.0 #1
<3>[ 62.200435] Hardware name: linux,dummy-virt (DT)
<3>[ 62.201130] ==================================================================
root@runner-vwmj3eza-project-40964107-concurrent-3:~#

regards,
dan carpenter
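
In the decoded trace, a single frame can carry several file:line locations when the compiler inlined calls, optionally followed by "(discriminator N)" notes. A rough Python sketch for pulling those locations out of one decoded frame follows. It assumes the layout seen in this message (function name, then a parenthesised list), and it assumes that the locations appear innermost inlined code first with the named function's own line last, as they seem to here. `parse_decoded_frame` is a hypothetical helper, not part of decode_stacktrace.sh.

```python
import re

# One "path:line" location inside the parentheses of a decoded frame.
_LOCATION_RE = re.compile(r"([\w./-]+):(\d+)")
# "function (locations...)" at the end of a log line.
_DECODED_RE = re.compile(r"([A-Za-z_][\w.]*) \((.*)\)\s*$")


def parse_decoded_frame(line: str):
    """Return (function, [(path, line_number), ...]) or None."""
    match = _DECODED_RE.search(line)
    if match is None:
        return None
    function, inner = match.groups()
    # "(discriminator N)" notes contain no colon, so they are skipped.
    locations = [(path, int(num)) for path, num in _LOCATION_RE.findall(inner)]
    return function, locations
```

For the faulting frame in this report, the first location would be the read in include/asm-generic/rwonce.h and the last one the line in fs/dcache.c inside d_same_name itself.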

* Re: [git pull] d_revalidate pile
From: Sasha Levin @ 2025-01-28 19:24 UTC
To: Dan Carpenter
Cc: Mark Brown, Linus Torvalds, kernelci, Al Viro, linux-fsdevel, linux-kernel, lkft

On Tue, Jan 28, 2025 at 03:33:14PM +0300, Dan Carpenter wrote:
>On Mon, Jan 27, 2025 at 05:32:18PM -0500, Sasha Levin wrote:
>> [ Adding in the LKFT folks ]
>
>Ugh... The website is pretty difficult to navigate. I've filed a
>ticket to hopefully avoid this going forward. It's a bit late for
>the line numbers to be any use but here they are:

Thanks Dan & Mark! I think I've figured it out (and scripted it) for
next time :)

-- 
Thanks,
Sasha

* Re: [git pull] d_revalidate pile
From: Guillaume Tucker @ 2025-01-28 9:19 UTC
To: Mark Brown, Linus Torvalds
Cc: Sasha Levin, kernelci, Al Viro, linux-fsdevel, linux-kernel

On 27/01/2025 9:38 pm, Mark Brown wrote:
>> But let's see if it might be an option to get this capability. So I'm
>> adding the kernelci list to see if somebody goes "Oh, that was just an
>> oversight" and might easily be made to happen. Fingers crossed.
> The issue with KernelCI has been that it's not storing the vmlinux, this
> was indeed done due to space issues like you suggest. With the new
> infrastructure that's been rolled out as part of the KernelCI 2.0 revamp
> the storage should be a lot more scalable and so this should hopefully
> be a cost issue rather than actual space limits like it used to be so
> more tractable. AFAICT we haven't actually revisited making the
> required changes to include the vmlinux in the stored output though, I
> filed a ticket:
>
> https://github.com/kernelci/kernelci-project/issues/509
>
> The builds themselves are generally using standard defconfigs and
> derivatives of that so will normally have enough debug info for
> decode_stacktrace.sh. Where they don't we should probably just change
> that upstream.

One approach that was suggested a while ago was to do extra debug
builds in automated post-processing jobs whenever a failure is
detected. This came as an evolution of the automated bisection which
had checks for the good and bad revisions: if a stacktrace was found
while testing the "bad" kernel then it could easily be decoded since
bisections do incremental builds and keep the vmlinux at hand.

As Sasha mentioned in his email, some particular configs are required
in order to decode the stacktrace (IIRC this is enabled with
arm64_defconfig but not x86). Debug builds also make larger binaries
and affect runtime behaviour, as we all know. So one post-processing
check would be to do a special debug build with the right configs for
decoding stacktraces as well as maybe some sanitizers and extra useful
things to add more information. Builds from bisections or any extra
jobs should still be uploaded to public storage so they would be
available for manual investigation too.

That way, the impact on storage costs and compute resources would be
minimal without any real drawback - it might take 30min to get the
post-processing job to complete but even that could be optimized and it
seems a lot more efficient than doing debug builds and uploading large
vmlinux images all the time.

Hope this helps!

Cheers,
Guillaume
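
The post-processing idea described in this message (only do a debug build once a failing run has produced a stack trace) could be sketched roughly as below. Everything here is an assumption made for illustration: the function name `plan_debug_rebuild`, the job dictionary layout, and the choice of config fragment are all hypothetical and do not describe how KernelCI actually schedules jobs.

```python
import re

# Matches raw frames such as "d_same_name+0x4c/0xd0" in a test log.
_RAW_FRAME_RE = re.compile(r"[A-Za-z_][\w.]*\+0x[0-9a-fA-F]+/0x[0-9a-fA-F]+")

# Hypothetical config fragment for the follow-up build. Which extra
# options (sanitizers etc.) a real CI would add is left open here.
DEBUG_FRAGMENT = ["CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y"]


def plan_debug_rebuild(test_passed: bool, log_text: str):
    """Return a job description for a debug rebuild, or None if not needed."""
    if test_passed:
        return None
    # Only spend build time when there is actually a trace to decode.
    if not _RAW_FRAME_RE.search(log_text):
        return None
    return {
        "job": "debug-rebuild",
        "config_fragment": list(DEBUG_FRAGMENT),
        "keep_vmlinux": True,  # needed later by decode_stacktrace.sh
    }
```

The point of the design is that regular builds stay small, and the cost of debug info and vmlinux storage is paid only for the runs that need it.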

Thread overview: 8+ messages
     [not found] <20250127044721.GD1977892@ZenIV>
     [not found] ` <Z5fAOpnFoXMgpCWb@lappy>
2025-01-27 19:12   ` [git pull] d_revalidate pile Linus Torvalds
2025-01-27 20:38     ` Mark Brown
2025-01-27 22:32       ` Sasha Levin
2025-01-28 12:14         ` Mark Brown
2025-01-28 12:43           ` Dan Carpenter
2025-01-28 12:33         ` Dan Carpenter
2025-01-28 19:24           ` Sasha Levin
2025-01-28  9:19       ` Guillaume Tucker