public inbox for linux-trace-kernel@vger.kernel.org
From: Nathan Chancellor <nathan@kernel.org>
To: Jinghao Jia <jinghao7@illinois.edu>
Cc: Matt.Kelly2@boeing.com, akpm@linux-foundation.org,
	andrew.j.oppelt@boeing.com, anton.ivanov@cambridgegreys.com,
	ardb@kernel.org, arnd@arndb.de, bhelgaas@google.com,
	bp@alien8.de, chuck.wolber@boeing.com,
	dave.hansen@linux.intel.com, dvyukov@google.com, hpa@zytor.com,
	johannes@sipsolutions.net, jpoimboe@kernel.org,
	justinstitt@google.com, kees@kernel.org,
	kent.overstreet@linux.dev, linux-arch@vger.kernel.org,
	linux-efi@vger.kernel.org, Wentao Zhang <wentaoz5@illinois.edu>,
	linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-trace-kernel@vger.kernel.org, linux-um@lists.infradead.org,
	llvm@lists.linux.dev, luto@kernel.org, marinov@illinois.edu,
	masahiroy@kernel.org, maskray@google.com,
	mathieu.desnoyers@efficios.com, matthew.l.weber3@boeing.com,
	mhiramat@kernel.org, mingo@redhat.com, morbo@google.com,
	ndesaulniers@google.com, oberpar@linux.ibm.com,
	paulmck@kernel.org, peterz@infradead.org, richard@nod.at,
	rostedt@goodmis.org, samitolvanen@google.com,
	samuel.sarkisian@boeing.com, steven.h.vanderleest@boeing.com,
	tglx@linutronix.de, tingxur@illinois.edu, tyxu@illinois.edu,
	x86@kernel.org
Subject: Re: [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang
Date: Fri, 22 Nov 2024 21:39:22 -0700
Message-ID: <20241123043922.GA584876@thelio-3990X>
In-Reply-To: <284fe8fa-c094-49b7-8e16-3318676d38e3@illinois.edu>

Hi Jinghao,

On Thu, Nov 21, 2024 at 11:05:14PM -0600, Jinghao Jia wrote:
> Wentao and I have been looking into this issue over the past few weeks. The
> high-level conclusion is that it appears to be a problem with lld; I will go
> over the details here.

Thanks a lot for looking into this!

> On 10/3/24 6:29 PM, Nathan Chancellor wrote:
> > I seem to have narrowed it down to a few different configurations on top
> > of x86_64_defconfig but I will include the full bad configuration as an
> > attachment just in case anything else is relevant.
> > 
> > $ echo 'CONFIG_LLVM_COV_KERNEL=y
> > CONFIG_LLVM_COV_PROFILE_ALL=y' >kernel/configs/llvm_cov.config
> > 
> > $ echo CONFIG_FORTIFY_SOURCE=y >kernel/configs/fortify_source.config
> > 
> > $ echo CONFIG_AMD_MEM_ENCRYPT=y >arch/x86/configs/amd_mem_encrypt.config
> > 
> > $ /usr/bin/time -v make -skj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper {def,amd_mem_encrypt.,fortify_source.,llvm_cov.}config bzImage
> > ...
> > vmlinux.o: warning: objtool: __sev_es_nmi_complete+0x6e: call to kasan_check_write() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: do_syscall_64+0x141: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: do_int80_emulation+0x138: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: handle_bug+0x5: call to kmsan_unpoison_entry_regs() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: syscall_enter_from_user_mode_prepare+0x105: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: syscall_exit_to_user_mode+0x73: call to user_enter_irqoff() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_enter_from_user_mode+0x105: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_exit_to_user_mode+0x62: call to user_enter_irqoff() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_enter+0x45: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_exit+0x4a: call to lockdep_hardirqs_on() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_nmi_enter+0x4: call to lockdep_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: irqentry_nmi_exit+0x67: call to lockdep_on() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: enter_s2idle_proper+0xb5: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: cpuidle_enter_state+0x113: call to lockdep_hardirqs_off() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: default_idle_call+0xad: call to lockdep_hardirqs_on() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: cpu_idle_poll+0x29: call to lockdep_hardirqs_on() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: acpi_idle_enter_bm+0x118: call to lockdep_hardirqs_on() leaves .noinstr.text section
> > vmlinux.o: warning: objtool: acpi_idle_do_entry+0x4: call to perf_lopwr_cb() leaves .noinstr.text section
> > ...
> >         User time (seconds): 670.86
> >         System time (seconds): 459.05
> >         Percent of CPU this job got: 169%
> >         Elapsed (wall clock) time (h:mm:ss or m:ss): 11:06.15
> >         Average shared text size (kbytes): 0
> >         Average unshared data size (kbytes): 0
> >         Average stack size (kbytes): 0
> >         Average total size (kbytes): 0
> >         Maximum resident set size (kbytes): 38644844
> >         Average resident set size (kbytes): 0
> >         Major (requiring I/O) page faults: 18694
> >         Minor (reclaiming a frame) page faults: 23068856
> >         Voluntary context switches: 32215431
> >         Involuntary context switches: 46422
> >         Swaps: 0
> >         File system inputs: 0
> >         File system outputs: 40127696
> >         Socket messages sent: 0
> >         Socket messages received: 0
> >         Signals delivered: 0
> >         Page size (bytes): 4096
> >         Exit status: 0
> > 
> > $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/x86_64-rootfs.cpio.zst | zstd -d >rootfs.cpio
> > 
> > $ qemu-system-x86_64 \
> >     -display none \
> >     -nodefaults \
> >     -M q35 \
> >     -d unimp,guest_errors \
> >     -append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \
> >     -kernel arch/x86/boot/bzImage \
> >     -initrd rootfs.cpio \
> >     -cpu host \
> >     -enable-kvm \
> >     -m 8G \
> >     -smp 8 \
> >     -serial mon:stdio
> > <hangs with no output>
> 
> This hang is caused by an early boot exception -- gdb shows the execution
> reaches the halt loop in early_fixup_exception().  Dumping regs->ip associated
> with this exception points us to the following instruction:
> 
> ffffffff89b58074:       48 ff 05 85 7f 4a 76    incq   0x764a7f85(%rip)        # 0 <fixed_percpu_data>
> 
> This is apparently an incorrect access to the per-cpu variable (the cpu offset
> in %gs is needed) and triggers a null-ptr-deref. Without CONFIG_AMD_MEM_ENCRYPT
> (one of the bad configs), it turns out the instruction is actually accessing
> the llvm prof-counter of strscpy():
> 
> ffffffff89b85a04:       48 ff 05 6d 94 7d fa    incq   -0x5826b93(%rip)        # ffffffff8435ee78 <__profc__Z13sized_strscpyPcU25pass_dynamic_object_size1PKcU25pass_dynamic_object_size1m>
> 
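As a sanity check on the two disassembly lines above: an x86-64 RIP-relative
operand resolves to the address of the next instruction plus the sign-extended
32-bit displacement, so the targets shown in the disassembly comments can be
recomputed by hand. A minimal bash sketch follows; rip_target is a hypothetical
helper (not part of any tool), and it assumes bash's wrapping 64-bit integer
arithmetic.

```shell
#!/usr/bin/env bash
# rip_target ADDR LEN DISP32
# Hypothetical helper: prints ADDR + LEN + sign_extend(DISP32) modulo 2^64.
rip_target() {
    local addr=$(( $1 )) len=$(( $2 )) disp=$(( $3 ))
    # Sign-extend the 32-bit displacement to 64 bits.
    if (( disp >= 0x80000000 )); then
        disp=$(( disp - 0x100000000 ))
    fi
    # bash arithmetic is 64-bit and wraps, so this sum is taken modulo 2^64.
    printf '0x%x\n' $(( addr + len + disp ))
}

# Faulting instruction: 48 ff 05 85 7f 4a 76 (disp32 = 0x764a7f85)
rip_target 0xffffffff89b58074 7 0x764a7f85
# Same site without CONFIG_AMD_MEM_ENCRYPT: 48 ff 05 6d 94 7d fa (disp32 = 0xfa7d946d)
rip_target 0xffffffff89b85a04 7 0xfa7d946d
```

The first call yields 0x0 and the second 0xffffffff8435ee78, matching the
addresses in the disassembly comments: in the bad build the displacement
resolves exactly to address 0, which is consistent with a symbol value of 0.
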
> This symbol is left undefined in the bad vmlinux, which explains why the
> faulting instruction is accessing address 0.  Tracing through the kernel
> linking process shows that the symbol is still defined (as a weak symbol) in
> vmlinux.a and vmlinux.o, but becomes undefined after the first round of linking
> of the kernel image (.tmp_vmlinux1).
> 
> After playing with it a little bit, we found the creation of vmlinux.o to be
> the problem. Specifically, if we use mold[1] instead of lld to create the
> object and pass it to the later stages of kernel linking, the symbol will be
> properly defined as a data symbol (and the kernel can boot).
> 
> It seems that the issue does not reproduce with LLVM-20.

I just ran my original reproducer with a version of ld.lld from LLVM
main (132de3a71f581dcb008a124d52c83ccca8158d98) and I still see the boot
hang, so it seems like it might still be relevant there? Or am I
misunderstanding your comment here?
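
For comparing linkers or LLVM versions without booting every build, it may be
quicker to check how the counter symbol appears at each link stage. A rough
sketch, assuming nm-style output in which the type letters U, w and v mark
undefined symbols; sym_state is a hypothetical helper, not an existing tool:

```shell
# sym_state SYMBOL
# Hypothetical helper: reads nm-style output on stdin and prints each distinct
# way SYMBOL appears ("defined (X)" or "undefined (X)"), or "absent".
sym_state() {
    awk -v sym="$1" '
        $NF == sym {
            t = $(NF - 1)   # symbol type letter precedes the name
            state = (t == "U" || t == "w" || t == "v") ? "undefined" : "defined"
            key = state " (" t ")"
            if (!(key in seen)) { seen[key] = 1; print key }
            found = 1
        }
        END { if (!found) print "absent" }
    '
}
```

It could be used as `llvm-nm .tmp_vmlinux1 | sym_state "$sym"`, repeated for
vmlinux.a and vmlinux.o, with $sym set to the __profc_ symbol quoted above. If
the symbol is defined in vmlinux.o but undefined in .tmp_vmlinux1, that build
shows the same behavior Jinghao described.
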

> Nevertheless we have reported[2] this to upstream llvm.

Thank you for reporting this upstream. Hopefully Fangrui or someone more
familiar with LLD internals can take a look. I am guessing it is not too
easy to get a concise reproducer for this behavior.

> [1]: https://github.com/rui314/mold
> [2]: https://github.com/llvm/llvm-project/issues/116575
> 
> P.S.: We used mold because GNU ld is simply too slow with all these llvm-cov
> sections -- the vmlinux.o step ran for 10+ hours and still had not finished. At the
> same time, the fact that the creation of vmlinux.o does not use a linker script
> allows us to directly plug mold in.

Hmmm, that seems like it might be worth reporting to binutils upstream
to see if that is a bug or expected.

Cheers,
Nathan
