linux-mm.kvack.org archive mirror
From: Thomas Gleixner <tglx@linutronix.de>
To: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrey Konovalov <andreyknvl@google.com>,
	Andy Lutomirski <luto@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
	Borislav Petkov <bp@alien8.de>, Christoph Hellwig <hch@lst.de>,
	Christoph Lameter <cl@linux.com>,
	David Rientjes <rientjes@google.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	Eric Dumazet <edumazet@google.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Ilya Leoshkevich <iii@linux.ibm.com>,
	Ingo Molnar <mingo@redhat.com>, Jens Axboe <axboe@kernel.dk>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Kees Cook <keescook@chromium.org>, Marco Elver <elver@google.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Matthew Wilcox <willy@infradead.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Pekka Enberg <penberg@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Vegard Nossum <vegard.nossum@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	kasan-dev <kasan-dev@googlegroups.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux-Arch <linux-arch@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 28/46] kmsan: entry: handle register passing from uninstrumented code
Date: Tue, 03 May 2022 00:00:36 +0200
Message-ID: <87y1zjlhmj.ffs@tglx>
In-Reply-To: <CAG_fn=U7PPBmmkgxFcWFQUCqZitzMizr1e69D9f26sGGzeitLQ@mail.gmail.com>

Alexander,

On Mon, May 02 2022 at 19:00, Alexander Potapenko wrote:
> On Wed, Apr 27, 2022 at 3:32 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > --- a/kernel/entry/common.c
>> > +++ b/kernel/entry/common.c
>> > @@ -23,7 +23,7 @@ static __always_inline void __enter_from_user_mode(struct pt_regs *regs)
>> >       CT_WARN_ON(ct_state() != CONTEXT_USER);
>> >       user_exit_irqoff();
>> >
>> > -     instrumentation_begin();
>> > +     instrumentation_begin_with_regs(regs);
>>
>> I can see what you are trying to do, but this will end up doing the same
>> thing over and over. Let's just look at a syscall.
>>
>> __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
>> {
>>         ...
>>         nr = syscall_enter_from_user_mode(regs, nr)
>>
>>                 __enter_from_user_mode(regs)
>>                         .....
>>                         instrumentation_begin_with_regs(regs);
>>                         ....
>>
>>                 instrumentation_begin_with_regs(regs);
>>                 ....
>>
>>         instrumentation_begin_with_regs(regs);
>>
>>         if (!do_syscall_x64(regs, nr) && !do_syscall_x32(regs, nr) && nr != -1) {
>>                 /* Invalid system call, but still a system call. */
>>                 regs->ax = __x64_sys_ni_syscall(regs);
>>         }
>>
>>         instrumentation_end();
>>
>>         syscall_exit_to_user_mode(regs);
>>                 instrumentation_begin_with_regs(regs);
>>                 __syscall_exit_to_user_mode_work(regs);
>>         instrumentation_end();
>>         __exit_to_user_mode();
>>
>> That means you memset state four times and unpoison regs four times. I'm
>> not sure whether that's desired.
>
> Regarding the regs, you are right. It should be enough to unpoison the
> regs at idtentry prologue instead.
> I tried that initially, but IIRC it required patching each of the
> DEFINE_IDTENTRY_XXX macros, which already use instrumentation_begin().

Exactly 4 instances :)

> This decision can probably be revisited.

It has to be revisited because the whole thing is incomplete if this is
not addressed.

> As for the state, what we are doing here is still not enough, although
> it appears to work.
>
> Every time an instrumented function calls another function, it sets up
> the metadata for the function arguments in the per-task struct
> kmsan_context_state.
> Similarly, every instrumented function expects its caller to put the
> metadata into that structure.
> Now, if a non-instrumented function (e.g. every `noinstr` function)
> calls an instrumented one (which happens inside the
> instrumentation_begin()/instrumentation_end() region), nobody sets up
> the state for that instrumented function, so it may report false
> positives when accessing its arguments, if there are leftover poisoned
> values in the state.
>
> To overcome this problem, ideally we need to wipe kmsan_context_state
> every time a call from the non-instrumented function occurs.
> But this cannot be done automatically exactly because we cannot
> instrument the named function :)
>
> We therefore apply an approximation, wiping the state at the point of
> the first transition between instrumented and non-instrumented code.
> Because poison values are generally rare, and instrumented regions
> tend to be short, it is unlikely that further calls from the same
> non-instrumented function will result in false positives.
> Yet it is not completely impossible, so wiping the state for the
> second/third etc. time won't hurt.

Understood. But if I understand you correctly:

> Similarly, every instrumented function expects its caller to put the
> metadata into that structure.

then

     instrumentation_begin();
     foo(fargs...);
     bar(bargs...);
     instrumentation_end();

is a source of potential false positives because the state is not
guaranteed to be correct, neither for foo() nor for bar(), even if you
wipe the state in instrumentation_begin(), right?
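
To make the concern concrete, here is a minimal userspace sketch of that
scenario. It is a toy model, not KMSAN code: struct toy_context_state, the
toy_* helpers and the single shadow byte per argument are all hypothetical
stand-ins, and the real kmsan_context_state layout is not reproduced here.
The assumption is only what is described above: a callee trusts whatever
metadata its caller left in the per-task state.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_PARAMS 4

/* Toy stand-in for the per-task state (hypothetical layout). */
struct toy_context_state {
	uint8_t param_shadow[TOY_MAX_PARAMS];	/* non-zero: argument is poisoned */
};

static struct toy_context_state toy_state;
static int toy_reports;

/* Models the one-time wipe done at instrumentation_begin(). */
static void toy_wipe_state(void)
{
	memset(&toy_state, 0, sizeof(toy_state));
}

/* Instrumented callee prologue: trusts whatever shadow the caller left. */
static void toy_check_param(int idx)
{
	assert(idx >= 0 && idx < TOY_MAX_PARAMS);
	if (toy_state.param_shadow[idx])
		toy_reports++;
}

/* Instrumented helper that ignores its argument. */
static void toy_helper(int unused)
{
	(void)unused;
}

/*
 * Instrumented foo(): checks its own argument, then passes an
 * uninitialized value to toy_helper(). As an instrumented caller it
 * records that fact in slot 0, and nothing clears it afterwards.
 */
static void toy_foo(int arg)
{
	(void)arg;
	toy_check_param(0);
	toy_state.param_shadow[0] = 0xff;
	toy_helper(0);
}

/* Instrumented bar(): only checks its own argument. */
static void toy_bar(int arg)
{
	(void)arg;
	toy_check_param(0);
}

/* Models the noinstr caller: nobody sets up the state between calls. */
static int toy_noinstr_caller(void)
{
	toy_reports = 0;
	toy_wipe_state();	/* instrumentation_begin() */
	toy_foo(1);		/* sees the wiped state */
	toy_bar(2);		/* initialized argument, stale poisoned shadow */
	return toy_reports;	/* instrumentation_end() */
}
```

In this model toy_noinstr_caller() returns 1: bar() is handed a perfectly
initialized argument, but reports it because of what foo() left behind.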

This approximation approach smells fishy and it's inevitably going to be
a constant source of 'add yet another kmsan annotation/fixup' patches,
which I'm not interested in at all.

As this needs compiler support anyway, why not do the obvious:

#define noinstr                                 \
        .... __kmsan_conditional

#define instrumentation_begin()                 \
        ..... __kmsan_cond_begin

#define instrumentation_end()                   \
        __kmsan_cond_end .......

and let the compiler stick whatever is required into that code section
between instrumentation_begin() and instrumentation_end()?

That's not violating any of the noinstr constraints at all. In fact we
allow _any_ instrumentation to be placed between these two points. We
have tracepoints there today.

We could also allow breakpoints, kprobes or whatever, but handling this
at that granularity level for a production kernel is just overkill and
the code in those instrumentable sections is usually not that
interesting as it's mostly function calls.

But if the compiler converts

     instrumentation_begin();
     foo(fargs...);
     bar(bargs...);
     instrumentation_end();

to

     instrumentation_begin();
     kmsan_instr_begin_magic();
     kmsan_magic(fargs...);
     foo(fargs...);
     kmsan_magic(bargs...);
     bar(bargs...);
     kmsan_instr_end_magic();
     instrumentation_end();

for the kmsan case and leaves anything outside of these sections alone,
then you have:

   - a minimal code change
   - the best possible coverage
   - the least false positive crap to chase and annotate

IOW, a solution which is solid and future proof.
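
As a rough illustration of the transformed sequence above, here is a
userspace sketch of the per-call setup. Everything in it is hypothetical:
sketch_mark_args_initialized() merely stands in for whatever kmsan_magic()
would turn out to be, and it assumes that values coming from uninstrumented
code are to be treated as initialized. It models the idea, not a compiler
or KMSAN interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PARAMS 4

/* Toy stand-in for the per-task state (hypothetical layout). */
struct sketch_context_state {
	uint8_t param_shadow[SKETCH_MAX_PARAMS];	/* non-zero: poisoned */
};

static struct sketch_context_state sketch_state;
static int sketch_reports;

/*
 * Hypothetical stand-in for kmsan_magic(args...): before each call,
 * mark the slots of the arguments being passed as initialized.
 */
static void sketch_mark_args_initialized(int nargs)
{
	assert(nargs >= 0 && nargs <= SKETCH_MAX_PARAMS);
	memset(sketch_state.param_shadow, 0, (size_t)nargs);
}

/* Instrumented callee prologue: trusts the shadow the caller left. */
static void sketch_check_param(int idx)
{
	assert(idx >= 0 && idx < SKETCH_MAX_PARAMS);
	if (sketch_state.param_shadow[idx])
		sketch_reports++;
}

/* Instrumented foo(): checks its argument, leaves slot 0 poisoned. */
static void sketch_foo(int arg)
{
	(void)arg;
	sketch_check_param(0);
	sketch_state.param_shadow[0] = 0xff;
}

/* Instrumented bar(): only checks its own argument. */
static void sketch_bar(int arg)
{
	(void)arg;
	sketch_check_param(0);
}

/* Models the noinstr caller after the proposed transformation. */
static int sketch_noinstr_caller(void)
{
	sketch_reports = 0;
	/* Leftover poison from earlier code, as in the problem statement. */
	memset(&sketch_state, 0xff, sizeof(sketch_state));

	/* instrumentation_begin(); */
	sketch_mark_args_initialized(1);	/* kmsan_magic(fargs...) */
	sketch_foo(1);
	sketch_mark_args_initialized(1);	/* kmsan_magic(bargs...) */
	sketch_bar(2);
	/* instrumentation_end(); */

	return sketch_reports;
}
```

Here sketch_noinstr_caller() returns 0: neither the leftover poison from
before the section nor what foo() leaves behind reaches bar(), because the
state is set up at every call site instead of once per section.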

I'm all for making use of advanced instrumentation, validation and
debugging features, but this mindset of 'make the code comply with what
the tool of today provides' is fundamentally wrong. Tools have to
provide value to the programmer and not the other way round.

Yes, it's more work on the tooling side, but the tooling side is mostly
a one-time effort while chasing the false positives is a long-term
nightmare.

Thanks,

        tglx

