Re: [syzbot] [kernel?] upstream test error: KMSAN: uninit-value in irqentry_exit_to_kernel_mode_preempt

The Linux Kernel Mailing List
 help / color / mirror / Atom feed

From: Mark Rutland <mark.rutland@arm.com>
To: Alexander Potapenko <glider@google.com>
Cc: Thomas Gleixner <tglx@kernel.org>,
	Dmitry Vyukov <dvyukov@google.com>,
	syzbot <syzbot+cdcfd55737fe43eeb3a3@syzkaller.appspotmail.com>,
	kasan-dev <kasan-dev@googlegroups.com>,
	linux-kernel@vger.kernel.org, luto@kernel.org,
	peterz@infradead.org, syzkaller-bugs@googlegroups.com,
	ruanjinjie@huawei.com
Subject: Re: [syzbot] [kernel?] upstream test error: KMSAN: uninit-value in irqentry_exit_to_kernel_mode_preempt
Date: Tue, 12 May 2026 18:46:31 +0100	[thread overview]
Message-ID: <agNnafV0ArGXyM9D@J2N7QTR9R3> (raw)
In-Reply-To: <CAG_fn=VGfkvsTJTq_TFkg_=FiCFmbY5uj=2bwSzMR+8Cmdxi+g@mail.gmail.com>

Hi,

Thanks for this; the explanation of the KMSAN ABI is *very* helpful.

I have a few questions and comments inline below.

On Tue, May 12, 2026 at 06:24:03PM +0200, Alexander Potapenko wrote:
> On Tue, May 12, 2026 at 1:15 PM Mark Rutland <mark.rutland@arm.com> wrote:
> > For context, can you explain how this is expected to work across
> > compilation units when the caller and callee *are* instrumented, when an
> > argument is passed in a register?
> 
> Per KMSAN ABI, the metadata for the function arguments is passed via
> buffers in the so-called context state (see
> include/linux/kmsan_types.h)
> In the userspace, these buffers are thread-local variables referenced
> by inline loads and stores.
> In the kernel space, the compiler inserts a call
> __msan_get_context_state() at the beginning of every function, and
> then the instrumentation code uses whatever that function returned.
> 
> Assume we have a function:
> 
> int sum(int a, int b) {
>     result = a + b;
>     return result;
> }
> 
> Its instrumented version looks roughly as follows (we'll omit origin
> tracking for simplicity):
> 
> int sum(int a, int b) {
>   struct kmsan_context_state *kcs = __msan_get_context_state();
>   int s_a = ((int)kcs->param_tls)[0];  // shadow of a
>   int s_b = ((int)kcs->param_tls)[1];  // shadow of b
>   result = a + b;
>   s_result = s_a | s_b;
>   ((int)kcs->retval_tls)[0] = s_result;  // returned shadow
>   return result;
> }
> 
> Most certainly `a` and `b` will be passed using registers, but that
> doesn't matter: their metadata is safe as long as the caller does:
> 
> ((int)kcs->param_tls)[0] = s_a;
> ((int)kcs->param_tls)[1] = s_b;
> result = sum(a, b);
> s_result = ((int)kcs->retval_tls)[0];

Ok. I see how that works when both caller and callee are instrumented.

I assume that's tracked for all arguments regardless of type? e.g. that
includes pointers, structs, etc?

> All the metadata tracking is implemented using an LLVM IR pass, which
> ensures that the shadow for each uninitialized value is tracked
> regardless of where that value is stored - be that heap memory, stack
> or registers:
> - when a value is created or loaded from memory, the compiler inserts
> SSA registers corresponding to that value's shadow;
> - when a value is written to memory, a shadow store is inserted;
> - when a value is used in an operation, the result of that operation
> receives a shadow value depending on the operands and their shadow
> values;
> - when a value is passed to a function, the corresponding context
> state stores and loads are emitted.
> 
> Now, this works fine under the assumption that all code in the kernel
> is instrumented with KMSAN.
> However this is not true.
> 
> 1. There are a few `KMSAN_SANITIZE := n` in the Makefiles: some
> prevent infinite recursion caused by KMSAN calling instrumented code;
> others disable instrumentation of the code that executes before KMSAN
> is fully set up.
> 2. There are annotations to soft-disable KMSAN for a particular
> function by adding a `__no_kmsan_checks` attribute. There will still
> be instrumentation, but all the function outputs (stores and returns)
> will be initialized, and no errors will be reported.
> This is handy when the compiler cannot properly instrument the
> underlying code. One example would be tricky inline assembly, another
> one - stack walking functions that deliberately read and write
> uninitialized values.
> 3. There are cases in which KMSAN must be disabled for the whole
> function (`__no_sanitize_memory`).
> We disable instrumentation for `noinstr` functions. Additionally, on
> x86 there is exactly one case where this is done to avoid infinite
> recursion when storing the origins.

In addition to the above cases, out-of-line assembly functions won't
manipulate the shadow (so they're effectively the same as noinstr).

> It's important to note that, due to how function attributes are
> implemented in Clang, __no_kmsan_checks and __no_sanitize_memory will
> be applied to a function if that function exists during the KMSAN
> instrumentation pass. If it was inlined before that pass, the compiler
> will apply the instrumentation rules for whichever function is calling
> the inlined one.
> 
> Now, because of the described ABI, calls between instrumented
> functions and/or functions marked `__no_kmsan_checks` are perfectly
> fine because the function arguments are stored and loaded on both
> ends.
> What does not work well is calling non-instrumented functions from
> instrumented functions, and vice versa.

Yep; understood.

> In the former case, KMSAN will not see the callee's side effects
> (return values or memory stores), in the latter case, the callee may
> receive incorrect shadow values for the function parameters or memory
> stores in the caller. Both will

Incomplete sentence here?

> We have few tools to deal with such cases. It often helps to move the
> border between the instrumented and non-instrumented code by applying
> __no_kmsan_checks or __no_sanitize_memory until we get to a point
> where there are no incorrect shadow arguments.
> Another way to deal with uninitialized data coming from
> non-instrumented code is kmsan_unpoison_memory(address, size).
> Unfortunately, as Thomas pointed out, we can't use it for locals in
> non-instrumented code.

It's clearly not sufficient to use kmsan_unpoison_memory() for locals,
and I think it's non-sensical because conceptually they aren't memory.

I don't think kmsan_unpoison_entry_regs() alone is sufficient for the
regs. Surely the pointer argument itself needs to be unpoisoned before
being passed to a callee?

Otherwise, when pointer to regs is passed as an argument to a callee,
the callee will consume stale shadow, which might indicate that the
pointer to regs is uninitialized?

That seems to be exactly what we're hitting with the irq entry state, so
I'm surprised we're not hitting it for the regs. Is that sheer luck, or
is something else protecting us?

> > Below you suggest that the caller might add "hints", but it's not clear
> > specifically what this means.
> >
> > > > Something is wrong in KMSAN/compiler land or do you still believe that
> > > > you just need to unpoison the non existing memory 'state'?
> > >
> > > When we call an instrumented function from a non-instrumented one, the compiler
> > > is doomed to not understand that and to be unable to track the function
> > > parameters properly. Exactly because `noinstr` implies no instrumentation
> > > whatsoever, the compiler may not add any hints on the caller side that would
> > > help the callee understand what's going on - even if KMSAN is able to see this
> > > `noinstr` function (which is not always the case).
> > >
> > > So what we could do is to add annotations manualy on either the caller side or
> > > the callee side.
> >
> > Without some understanding of those "hints" you mention, I don't see how
> > we can do that on the caller side.
> 
> In this case by "hints" I meant any kind of instrumentation that the
> compiler could have inserted automatically to mark the data in the
> instrumented functions as initialized.
> There are no such "hints" right now (apart from letting KMSAN
> instrument the function).
> As for the manual annotations, we only have the mentioned
> `__no_kmsan_checks`, `__no_sanitize_memory`, and
> kmsan_unpoison_memory().

Ok.

For noinstr code calling instrumented code, I see that we could
explicitly modify the argument shadow in the noinstr caller, but that's
*really* grotty, and I don't think we want to do that.

> > > We can apply `__no_kmsan_checks` to irqentry_exit_to_kernel_mode_preempt(),
> > > making all its inputs initialized. This is the easiest solution, it may
> > > introduce false negatives, but we are on a very thin ice anyway, so perhaps
> > > doing so is better than dealing with more false positives in the interrupt code.
> > >
> > > Another option for the callee would be applying `__always_inline`, so that
> > > irqentry_exit_to_kernel_mode_preempt() also becomes non-instrumented.
> > > Given that irqentry_exit_to_kernel_mode_after_preempt() is already
> > > `__always_inline`, it might be the right thing to do.
> >
> > We can do that, but this really suggests that there's a fundemantal
> > inability to pass arguments between code which is noinstr and code which
> > isinstrumented with AddressSanitizer, and that's inevitably going to
> > bite us in future.
> 
> This is true (assuming you mean MemorySanitizer), but:

Sorry, yes, I meant MemorySanitizer.

> - I don't think we can avoid that, given that there will always be
> non-instrumented code;
> - so far the number of annotations has been manageable.

As above, I suspect that we're actually missing many necessary
annotations and getting away with this by accident.

I suspect that we need additional compiler support to say something like
"assume this argument/return value IS initialized".

> > > On the caller side, we could do something creative with instrumentation_begin()
> > > and instrumentation_end(). We've had a discussion about that exactly four years
> > > ago: https://lore.kernel.org/all/20220426164315.625149-29-glider@google.com/T/#u
> > > , but came to a conclusion that a handful of annotations on the noinstr/instr
> > > boundary may do a better job than a solution that doesn't cover all cases.
> >
> > That doesn't look general at all, so I am not keen on that.
> 
> Ack.
> 
> > > In particular, the case of irqentry_exit_to_kernel_mode_preempt() could have
> > > been solved by `__memset(state, 0, sizeof(struct kmsan_context_state))` in
> > > instrumentation_begin(). But it wouldn't solve more complex (yet rare, and
> > > non-existing today) cases where two functions are called from an instrumented
> > > region, and the first function somehow leaves the argument state poisoned.
> >
> > How exactly is kmsan_context_state used?
> 
> Hope the explanation above helps.
> >
> > If that's supposed to carry some global or current context, surely
> > blatting that in entry code will affect the code that was interrupted?
> 
> It will mostly affect the bottom-level function called by the entry
> code, after that, nested functions will be just passing their
> parameters as per KMSAN ABI.

The problem I was explaining was the other way around: we clobber the
context of the function that was executing *before* exception entry.

Consider when the CPU is executing some function foo(), and an exception
is taken. Within the exception handler, code will clobber the context
state. AFAICT, that state is not saved/restored, and nor is it zeroed
prior to exception return. The exception handler returns back to the
middle of foo(). From foo's PoV, the context state has been clobbered
arbitrarily, and that clobbered state could trigger false positives or
false negatives.

The way kmsan_get_context() uses in_task() will avoid that in *some*
cases, but not *all* cases. In particular, I don't think that's going to
help when a fault is taken for a uaccess.

Maybe I'm missing something here?

> > I see kmsan_get_context() has an in_task() check, but that can't help
> > with nested exceptions, so this doesn't look right at all.
> 
> KMSAN context is per-task, and if !in_task(), it is per-CPU.
> 
> For the nested exceptions, we conservatively bail out (see
> kmsan_in_runtime()) to avoid deadlocking.
> This may cause false negatives.

AFAICT that doesn't address the case I've described above.

Mark.

next prev parent reply	other threads:[~2026-05-12 17:46 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <69e7ee1f.a00a0220.17a17.001d.GAE@google.com>
     [not found] ` <CACT4Y+b5wU2wHfugXH0tjGyLS1ibTjj6KibXMMhKN59rXRuafw@mail.gmail.com>
     [not found]   ` <CAG_fn=WGjaOSEh7JJ9=e-QvpoMSnSv7=__cerR696-qAajc=Vw@mail.gmail.com>
2026-05-11 12:25     ` [syzbot] [kernel?] upstream test error: KMSAN: uninit-value in irqentry_exit_to_kernel_mode_preempt Thomas Gleixner
2026-05-12  9:33       ` Alexander Potapenko
2026-05-12 11:15         ` Mark Rutland
2026-05-12 16:24           ` Alexander Potapenko
2026-05-12 17:46             ` Mark Rutland [this message]
2026-05-13  0:36             ` Thomas Gleixner
2026-05-13  7:54               ` Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=agNnafV0ArGXyM9D@J2N7QTR9R3 \
    --to=mark.rutland@arm.com \
    --cc=dvyukov@google.com \
    --cc=glider@google.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=peterz@infradead.org \
    --cc=ruanjinjie@huawei.com \
    --cc=syzbot+cdcfd55737fe43eeb3a3@syzkaller.appspotmail.com \
    --cc=syzkaller-bugs@googlegroups.com \
    --cc=tglx@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox