public inbox for llvm@lists.linux.dev
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Wentao Zhang <wentaoz5@illinois.edu>
Cc: Matt.Kelly2@boeing.com, akpm@linux-foundation.org,
	andrew.j.oppelt@boeing.com, anton.ivanov@cambridgegreys.com,
	ardb@kernel.org, arnd@arndb.de, bhelgaas@google.com,
	bp@alien8.de, chuck.wolber@boeing.com,
	dave.hansen@linux.intel.com, dvyukov@google.com, hpa@zytor.com,
	jinghao7@illinois.edu, johannes@sipsolutions.net,
	jpoimboe@kernel.org, justinstitt@google.com, kees@kernel.org,
	kent.overstreet@linux.dev, linux-arch@vger.kernel.org,
	linux-efi@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	linux-um@lists.infradead.org, llvm@lists.linux.dev,
	luto@kernel.org, marinov@illinois.edu, masahiroy@kernel.org,
	maskray@google.com, mathieu.desnoyers@efficios.com,
	matthew.l.weber3@boeing.com, mhiramat@kernel.org,
	mingo@redhat.com, morbo@google.com, nathan@kernel.org,
	ndesaulniers@google.com, oberpar@linux.ibm.com,
	paulmck@kernel.org, richard@nod.at, rostedt@goodmis.org,
	samitolvanen@google.com, samuel.sarkisian@boeing.com,
	steven.h.vanderleest@boeing.com, tglx@linutronix.de,
	tingxur@illinois.edu, tyxu@illinois.edu, x86@kernel.org
Subject: Re: [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang
Date: Thu, 5 Sep 2024 13:41:40 +0200	[thread overview]
Message-ID: <20240905114140.GN4723@noisy.programming.kicks-ass.net> (raw)
In-Reply-To: <20240905043245.1389509-1-wentaoz5@illinois.edu>

On Wed, Sep 04, 2024 at 11:32:41PM -0500, Wentao Zhang wrote:
> From: Wentao Zhang <zhangwt1997@gmail.com>
> 
> This series adds support for building x86-64 kernels with Clang's Source-
> based Code Coverage[1] in order to facilitate Modified Condition/Decision
> Coverage (MC/DC)[2] that provably correlates to source code for all levels
> of compiler optimization.
> 
> The newly added kernel/llvm-cov/ directory complements the existing gcov
> implementation. Gcov works at the object code level which may better
> reflect actual execution. However, Gcov lacks the necessary information to
> correlate coverage measurement with source code location when compiler
> optimization level is non-zero (which is the default when building the
> kernel). In addition, gcov reports are occasionally ambiguous when
> attempting to compare with source code level developer intent.
> 
> In the following gcov example from drivers/firmware/dmi_scan.c, an
> expression with four conditions is reported to have six branch outcomes,
> which is not ideally informative in many safety related use cases, such as
> automotive, medical, and aerospace.
> 
>         5: 1068:	if (s == e || *e != '/' || !month || month > 12) {
> branch  0 taken 5 (fallthrough)
> branch  1 taken 0
> branch  2 taken 5 (fallthrough)
> branch  3 taken 0
> branch  4 taken 0 (fallthrough)
> branch  5 taken 5
> 
> On the other hand, Clang's Source-based Code Coverage instruments at the
> compiler frontend which maintains an accurate mapping from coverage
> measurement to source code location. Coverage reports reflect exactly how
> the code is written regardless of optimization and can present advanced
> metrics like branch coverage and MC/DC in a clearer way. Coverage report
> for the same snippet by llvm-cov would look as follows:
> 
>  1068|      5|	if (s == e || *e != '/' || !month || month > 12) {
>   ------------------
>   |  Branch (1068:6): [True: 0, False: 5]
>   |  Branch (1068:16): [True: 0, False: 5]
>   |  Branch (1068:29): [True: 0, False: 5]
>   |  Branch (1068:39): [True: 0, False: 5]
>   ------------------
> 
> Clang has added MC/DC support as of its 18.1.0 release. MC/DC is a fine-
> grained coverage metric required by many automotive and aviation industrial
> standards for certifying mission-critical software [3].
> 
> In the following example from arch/x86/events/probe.c, llvm-cov gives the
> MC/DC measurement for the compound logic decision at line 43.
> 
>    43|     12|			if (msr[bit].test && !msr[bit].test(bit, data))
>   ------------------
>   |---> MC/DC Decision Region (43:8) to (43:50)
>   |
>   |  Number of Conditions: 2
>   |     Condition C1 --> (43:8)
>   |     Condition C2 --> (43:25)
>   |
>   |  Executed MC/DC Test Vectors:
>   |
>   |     C1, C2    Result
>   |  1 { T,  F  = F      }
>   |  2 { T,  T  = T      }
>   |
>   |  C1-Pair: not covered
>   |  C2-Pair: covered: (1,2)
>   |  MC/DC Coverage for Decision: 50.00%
>   |
>   ------------------
>    44|      5|				continue;
> 
> As the results suggest, during the span of measurement, only condition C2
> (!msr[bit].test(bit, data)) is covered. That means C2 was evaluated to both
> true and false, and in those test vectors C2 affected the decision outcome
> independently. Therefore MC/DC for this decision is 1 out of 2 (50.00%).
> 
> To do a full kernel measurement, instrument the kernel with
> LLVM_COV_KERNEL_MCDC enabled, and optionally set a
> LLVM_COV_KERNEL_MCDC_MAX_CONDITIONS value (the default is six). Run the
> testsuites, and collect the raw profile data under
> /sys/kernel/debug/llvm-cov/profraw. Such raw profile data can be merged and
> indexed, and used for generating coverage reports in various formats.
> 
>   $ cp /sys/kernel/debug/llvm-cov/profraw vmlinux.profraw
>   $ llvm-profdata merge vmlinux.profraw -o vmlinux.profdata
>   $ llvm-cov show --show-mcdc --show-mcdc-summary                         \
>              --format=text --use-color=false -output-dir=coverage_reports \
>              -instr-profile vmlinux.profdata vmlinux
> 
> The first two patches enable the llvm-cov infrastructure, where the first
> enables source based code coverage and the second adds MC/DC support. The
> next patch disables instrumentation for curve25519-x86_64.c for the same
> reason as gcov. The final patch enables coverage for x86-64.
> 
> The choice to use a new Makefile variable LLVM_COV_PROFILE, instead of
> reusing GCOV_PROFILE, was deliberate. More work needs to be done to
> determine if it is appropriate to reuse the same flag. In addition, given
> the fundamentally different approaches to instrumentation and the resulting
> variation in coverage reports, there is a strong likelihood that coverage
> type will need to be managed separately.
> 
> This work reuses code from a previous effort by Sami Tolvanen et al. [4].
> Our aim is for source-based *code coverage* required for high assurance
> (MC/DC) while [4] focused more on performance optimization.
> 
> This initial submission is restricted to x86-64. Support for other
> architectures would need a bit more Makefile & linker script modification.
> Informally we've confirmed that arm64 works and more are being tested.
> 
> Note that Source-based Code Coverage is Clang-specific and isn't compatible
> with Clang's gcov support in kernel/gcov/. Currently, kernel/gcov/ is not
> able to measure MC/DC without modifying CFLAGS_GCOV and it would face the
> same issues in terms of source correlation as gcov in general does.
> 
> Some demo and results can be found in [5]. We will talk about this patch
> series in the Refereed Track at LPC 2024 [6].
> 
> Known Limitations:
> 
> Kernel code with logical expressions exceeding
> LVM_COV_KERNEL_MCDC_MAX_CONDITIONS will produce a compiler warning.
> Expressions with up to 47 conditions are found in the Linux kernel source
> tree (as of v6.11), but 46 seems to be the max value before the build fails
> due to kernel size. As of LLVM 19 the max number of conditions possible is
> 32767.
> 
> As of LLVM 19, certain expressions are still not covered, and will produce
> build warnings when they are encountered:
> 
> "[...] if a boolean expression is embedded in the nest of another boolean
>  expression but separated by a non-logical operator, this is also not
>  supported. For example, in x = (a && b && c && func(d && f)), the d && f
>  case starts a new boolean expression that is separated from the other
>  conditions by the operator func(). When this is encountered, a warning
>  will be generated and the boolean expression will not be
>  instrumented." [7]
> 

What does this actually look like in the generated code?

Where is the modification to noinstr ?

What is the impact on certification of not covering the noinstr code.



  parent reply	other threads:[~2024-09-05 11:42 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-24 23:06 [RFC PATCH 0/3] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang Wentao Zhang
2024-08-24 23:06 ` [RFC PATCH 1/3] llvm-cov: add Clang's Source-based Code Coverage support Wentao Zhang
2024-08-25 11:52   ` Thomas Gleixner
2024-08-24 23:06 ` [RFC PATCH 2/3] kbuild, llvm-cov: disable instrumentation in odd or sensitive code Wentao Zhang
2024-08-25 12:12   ` Thomas Gleixner
2024-08-24 23:06 ` [RFC PATCH 3/3] llvm-cov: add Clang's MC/DC support Wentao Zhang
2024-09-05  4:32 ` [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang Wentao Zhang
2024-09-05  4:32   ` [PATCH v2 1/4] llvm-cov: add Clang's Source-based Code Coverage support Wentao Zhang
2024-10-02  0:30     ` Nathan Chancellor
2024-09-05  4:32   ` [PATCH v2 2/4] llvm-cov: add Clang's MC/DC support Wentao Zhang
2024-10-02  1:10     ` Nathan Chancellor
2024-10-03  3:14       ` Wentao Zhang
2024-09-05  4:32   ` [PATCH v2 3/4] x86: disable llvm-cov instrumentation Wentao Zhang
2024-10-02  1:17     ` Nathan Chancellor
2024-09-05  4:32   ` [PATCH v2 4/4] x86: enable llvm-cov support Wentao Zhang
2024-10-02  1:18     ` Nathan Chancellor
2024-09-05 11:41   ` Peter Zijlstra [this message]
     [not found]     ` <BN0P110MB1785427A8771BD53DADB2E4DAB9DA@BN0P110MB1785.NAMP110.PROD.OUTLOOK.COM>
     [not found]       ` <BN0P110MB1785CA856C1898EEC22ACD7EAB9DA@BN0P110MB1785.NAMP110.PROD.OUTLOOK.COM>
2024-09-05 12:24         ` FW: [EXTERNAL] Re: [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang Steve VanderLeest
2024-09-05 18:07     ` Wentao Zhang
2024-10-02  4:53   ` Nathan Chancellor
2024-10-02  6:42     ` Wentao Zhang
2024-10-03 23:29       ` Nathan Chancellor
2024-10-09  3:17         ` Wentao Zhang
2024-11-22  5:05         ` Jinghao Jia
2024-11-23  4:39           ` Nathan Chancellor
2025-08-29 18:10           ` Nathan Chancellor
2025-10-14 23:26             ` [RFC PATCH 0/4] Enable Clang's Source-based Code Coverage and MC/DC for x86-64 Sasha Levin
2025-10-14 23:26               ` [RFC PATCH 1/4] llvm-cov: add Clang's Source-based Code Coverage support Sasha Levin
2025-10-14 23:26               ` [RFC PATCH 2/4] llvm-cov: add Clang's MC/DC support Sasha Levin
2025-10-14 23:26               ` [RFC PATCH 3/4] x86: disable llvm-cov instrumentation Sasha Levin
2025-10-14 23:26               ` [RFC PATCH 4/4] x86: enable llvm-cov support Sasha Levin
2025-10-15  7:37               ` [RFC PATCH 0/4] Enable Clang's Source-based Code Coverage and MC/DC for x86-64 Peter Zijlstra
2025-10-15  8:26                 ` Chuck Wolber
2025-10-15  9:21                   ` Peter Zijlstra
2026-03-15 14:15                     ` Sasha Levin
2024-11-22 12:27         ` [PATCH v2 0/4] Enable measuring the kernel's Source-based Code Coverage and MC/DC with Clang Peter Zijlstra
2024-11-22 19:28           ` [EXTERNAL] " Wolber (US), Chuck
2024-11-23  3:09           ` Nathan Chancellor

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240905114140.GN4723@noisy.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=Matt.Kelly2@boeing.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.j.oppelt@boeing.com \
    --cc=anton.ivanov@cambridgegreys.com \
    --cc=ardb@kernel.org \
    --cc=arnd@arndb.de \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=chuck.wolber@boeing.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=dvyukov@google.com \
    --cc=hpa@zytor.com \
    --cc=jinghao7@illinois.edu \
    --cc=johannes@sipsolutions.net \
    --cc=jpoimboe@kernel.org \
    --cc=justinstitt@google.com \
    --cc=kees@kernel.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-efi@vger.kernel.org \
    --cc=linux-kbuild@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=linux-um@lists.infradead.org \
    --cc=llvm@lists.linux.dev \
    --cc=luto@kernel.org \
    --cc=marinov@illinois.edu \
    --cc=masahiroy@kernel.org \
    --cc=maskray@google.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=matthew.l.weber3@boeing.com \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=morbo@google.com \
    --cc=nathan@kernel.org \
    --cc=ndesaulniers@google.com \
    --cc=oberpar@linux.ibm.com \
    --cc=paulmck@kernel.org \
    --cc=richard@nod.at \
    --cc=rostedt@goodmis.org \
    --cc=samitolvanen@google.com \
    --cc=samuel.sarkisian@boeing.com \
    --cc=steven.h.vanderleest@boeing.com \
    --cc=tglx@linutronix.de \
    --cc=tingxur@illinois.edu \
    --cc=tyxu@illinois.edu \
    --cc=wentaoz5@illinois.edu \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox