public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Steven Rostedt <rostedt@goodmis.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Andy Lutomirski <luto@amacapital.net>,
	Linux List Kernel Mailing <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Nicolai Stange <nstange@suse.de>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"the arch/x86 maintainers" <x86@kernel.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>,
	Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
	Petr Mladek <pmladek@suse.com>,
	Joe Lawrence <joe.lawrence@redhat.com>,
	Shuah Khan <shuah@kernel.org>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Mimi Zohar <zohar@linux.ibm.com>, Juergen Gross <jgross@suse.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Nayna Jain <nayna@linux.ibm.com>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Joerg Roedel <jroedel@suse.de>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>,
	stable <stable@vger.kernel.org>,
	Masami Hiramatsu <mhiramat@kernel.org>
Subject: Re: [RFC][PATCH 1/2] x86: Allow breakpoints to emulate call functions
Date: Mon, 6 May 2019 20:10:14 -0400	[thread overview]
Message-ID: <20190506201014.484e7b65@oasis.local.home> (raw)
In-Reply-To: <CAHk-=wje38dbYFGNw0y==zd7Zo_4s2WOQjWaBDyr24RCdK2EPQ@mail.gmail.com>

On Mon, 6 May 2019 15:31:57 -0700
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Mon, May 6, 2019 at 3:06 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Why are you emulating something different than what you are rewriting?  
> 
> Side note: I'm also finding another bug on the ftrace side, which is a
> simple race condition.
> 
> In particular, the logic with 'modifying_ftrace_code' is fundamentally racy.
> 
> What can happen is that on one CPU we rewrite one instruction:
> 
>         ftrace_update_func = ip;
>         /* Make sure the breakpoints see the ftrace_update_func update */
>         smp_wmb();
> 
>         /* See comment above by declaration of modifying_ftrace_code */
>         atomic_inc(&modifying_ftrace_code);
> 
>         ret = ftrace_modify_code(ip, old, new);
> 
>         atomic_dec(&modifying_ftrace_code);
> 
>    but then another CPU hits the 'int3' while the modification is
> going on, and takes the fault.
> 
> The fault handler does that
> 
>         if (unlikely(atomic_read(&modifying_ftrace_code))..
> 
> and sees that "yes, it's in the middle of modifying the ftrace code",
> and calls ftrace_int3_handler().  All good and "obviously correct" so
> far, no?
> 
> HOWEVER. It's actually buggy. Because in the meantime, the CPU that
> was rewriting instructions has finished, and decrements the
> modifying_ftrace_code, which doesn't hurt us (because we already saw
> that the int3 was due to the modification.

But the CPU that was rewriting instructions does a run_sync() after
removing the int3:

static void run_sync(void)
{
	int enable_irqs;

	/* No need to sync if there's only one CPU */
	if (num_online_cpus() == 1)
		return;

	enable_irqs = irqs_disabled();

	/* We may be called with interrupts disabled (on bootup). */
	if (enable_irqs)
		local_irq_enable();
	on_each_cpu(do_sync_core, NULL, 1);
	if (enable_irqs)
		local_irq_disable();
}

Which sends an IPI to all CPUs to make sure they no longer see the int3.

> 
> BUT! There are two different races here:
> 
>  (a) maybe the fault handling was slow, and we saw the 'int3' and took
> the fault, but the modifying CPU had already finished, so that
> atomic_read(&modifying_ftrace_code) didn't actually trigger at all.
> 
>  (b) maybe the int3-faulting CPU *did* see the proper value of
> modifying_ftrace_code, but the modifying CPU went on and started
> *another* modification, and has changed ftrace_update_func in the
> meantime, so now the int3 handling is looking at the wrong values!
> 
> In the case of (a), we'll die with an oops due to the inexplicable
> 'int3' we hit. And in the case of (b) we'll be fixing up using the
> wrong address.
> 
> Things like this is why I'm wondering how much of the problems are due
> to the entry code, and how much of it is due to simply races and
> timing differences?
> 
> Again, I don't actually know the ftrace code, and maybe I'm missing
> something, but this really looks like _another_ fundamental bug.
> 
> The way to handle that modifying_ftrace_code thing is most likely by
> using a sequence counter. For example, one way to actually do some
> thing like this might be
> 
>         ftrace_update_func = ip;
>         ftrace_update_target = func;
>         smp_wmb();
>         atomic_inc(&modifying_ftrace_head);
> 
>         ret = ftrace_modify_code(ip, old, new);
> 
>         atomic_inc(&modifying_ftrace_tail);
>         smp_wmb();
> 
> and now the int3 code could do something like
> 
>         int head, tail;
> 
>         head = atomic_read(&modifying_ftrace_head);
>         smp_rmb();
>         tail = atomic_read(&modifying_ftrace_tail);
> 
>         /* Are we still in the process of modification? */
>         if (unlikely(head != tail+1))
>                 return 0;
> 
>         ip = ftrace_update_func;
>         func = ftrace_update_target;
>         smp_rmb();
>         /* Need to re-check that the above two values are consistent
> and we didn't exit */
>         if (atomic_read(&modifying_ftrace_tail) != tail)
>                 return 0;
> 
>         *pregs int3_emulate_call(regs, ip, func);
>         return 1;
> 
> although it probably really would be better to use a seqcount instead
> of writing it out like the above.
> 
> NOTE! The above only fixes the (b) race. The (a) race is probably best
> handled by actually checking if the 'int3' instruction is still there
> before dying.
> 
> Again, maybe there's something I'm missing, but having looked at that
> patch now what feels like a million times, I'm finding more worrisome
> things in the ftrace code than in the kernel entry code..

I think you are missing the run_sync() which is the heavy hammer to
make sure all CPUs are in sync. And this is done at each stage:

	add int3
	run_sync();
	update call cite outside of int3
	run_sync()
	remove int3
	run_sync()

HPA said that the last run_sync() isn't needed, but I kept it because I
wanted to make sure. Looks like your analysis shows that it is needed.

-- Steve


  reply	other threads:[~2019-05-07  0:10 UTC|newest]

Thread overview: 103+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-05-01 20:28 [RFC][PATCH 0/2] ftrace/x86: Allow for breakpoint handlers to emulate call functions Steven Rostedt
2019-05-01 20:28 ` [RFC][PATCH 1/2] x86: Allow breakpoints " Steven Rostedt
2019-05-02  3:24   ` Steven Rostedt
2019-05-02 16:21     ` Peter Zijlstra
2019-05-02 16:29       ` Peter Zijlstra
2019-05-02 18:02       ` Linus Torvalds
2019-05-02 18:18         ` Peter Zijlstra
2019-05-02 18:30           ` Peter Zijlstra
2019-05-02 18:43           ` Linus Torvalds
2019-05-02 19:28             ` Jiri Kosina
2019-05-02 20:25               ` Andy Lutomirski
2019-05-02 20:21             ` Peter Zijlstra
2019-05-02 20:49               ` Linus Torvalds
2019-05-02 21:32                 ` Peter Zijlstra
2019-05-03 19:24                 ` Steven Rostedt
2019-05-03 21:46                   ` Linus Torvalds
2019-05-03 22:49                     ` Steven Rostedt
2019-05-03 23:07                       ` Linus Torvalds
2019-05-04  4:17                         ` Steven Rostedt
     [not found]                           ` <CAHk-=wiuSFbv_rELND-BLWcP0GSZ0yF=xOAEcf61GE3bU9d=yg@mail.gmail.com>
2019-05-04 18:59                             ` Linus Torvalds
2019-05-04 20:12                               ` Andy Lutomirski
2019-05-04 20:28                                 ` Linus Torvalds
2019-05-04 20:36                                 ` Linus Torvalds
2019-05-03 22:55                     ` Andy Lutomirski
2019-05-03 23:16                       ` Linus Torvalds
2019-05-03 23:32                         ` Andy Lutomirski
2019-05-02 22:52               ` Steven Rostedt
2019-05-02 23:31                 ` Steven Rostedt
2019-05-02 23:50                   ` Steven Rostedt
2019-05-03  1:51                     ` [RFC][PATCH 1/2 v2] " Steven Rostedt
2019-05-03  9:29                     ` [RFC][PATCH 1/2] " Peter Zijlstra
2019-05-03 13:22                       ` Steven Rostedt
2019-05-03 16:20                         ` Andy Lutomirski
2019-05-03 16:31                           ` Steven Rostedt
2019-05-03 16:35                             ` Peter Zijlstra
2019-05-03 16:44                               ` Andy Lutomirski
2019-05-03 16:49                                 ` Steven Rostedt
2019-05-03 16:32                           ` Peter Zijlstra
2019-05-03 18:57                           ` Linus Torvalds
2019-05-06  8:19                             ` Peter Zijlstra
2019-05-06 13:56                               ` Steven Rostedt
2019-05-06 16:17                                 ` Linus Torvalds
2019-05-06 16:19                                   ` Linus Torvalds
2019-05-06 17:06                                   ` Steven Rostedt
2019-05-06 18:06                                     ` Linus Torvalds
2019-05-06 18:57                                       ` Steven Rostedt
2019-05-06 19:46                                         ` Linus Torvalds
2019-05-06 20:29                                           ` Steven Rostedt
2019-05-06 20:42                                             ` Linus Torvalds
2019-05-06 20:44                                               ` Linus Torvalds
2019-05-06 21:45                                               ` Steven Rostedt
2019-05-06 22:06                                                 ` Linus Torvalds
2019-05-06 22:31                                                   ` Linus Torvalds
2019-05-07  0:10                                                     ` Steven Rostedt [this message]
2019-05-07  1:06                                                       ` Linus Torvalds
2019-05-07  1:04                                                   ` Steven Rostedt
2019-05-07  1:34                                                     ` Steven Rostedt
2019-05-07  1:34                                                     ` Linus Torvalds
2019-05-07  1:53                                                       ` Steven Rostedt
2019-05-07  2:22                                                         ` Linus Torvalds
2019-05-07  2:58                                                           ` Steven Rostedt
2019-05-07  3:05                                                             ` Linus Torvalds
2019-05-07  3:21                                                               ` Steven Rostedt
2019-05-07  3:28                                                                 ` Linus Torvalds
2019-05-07 14:54                                                                   ` Linus Torvalds
2019-05-07 15:12                                                                     ` Steven Rostedt
2019-05-07 15:25                                                                       ` Steven Rostedt
2019-05-07 16:25                                                                         ` Steven Rostedt
2019-05-07 15:31                                                                       ` Linus Torvalds
2019-05-07 15:45                                                                         ` Steven Rostedt
2019-05-07 16:34                                                                         ` Peter Zijlstra
2019-05-07 17:08                                                                           ` Linus Torvalds
2019-05-07 17:21                                                                             ` Josh Poimboeuf
2019-05-07 21:24                                                                               ` Steven Rostedt
2019-05-08  4:50                                                                                 ` Linus Torvalds
2019-05-08 16:37                                                                                   ` Steven Rostedt
2019-05-07 17:38                                                                             ` Peter Zijlstra
2019-05-07  9:51                                                           ` Peter Zijlstra
2019-05-07 14:48                                                           ` Andy Lutomirski
2019-05-07 14:57                                                             ` Linus Torvalds
2019-05-07 14:13                                                 ` Masami Hiramatsu
2019-05-07 17:15                                                   ` Masami Hiramatsu
2019-05-06 14:22                               ` Peter Zijlstra
2019-05-07  8:57                               ` Peter Zijlstra
2019-05-07  9:18                                 ` David Laight
2019-05-07 11:30                                   ` Peter Zijlstra
2019-05-07 12:57                                     ` David Laight
2019-05-07 13:14                                       ` Steven Rostedt
2019-05-07 14:50                                         ` David Laight
2019-05-07 14:57                                           ` Steven Rostedt
2019-05-07 15:46                                             ` David Laight
2019-05-07 13:32                                       ` Peter Zijlstra
2019-05-07  9:27                                 ` Peter Zijlstra
2019-05-07 12:27                                   ` Steven Rostedt
2019-05-07 12:41                                     ` Peter Zijlstra
2019-05-07 12:54                                       ` Steven Rostedt
2019-05-07 17:22                                         ` Masami Hiramatsu
2019-05-07 14:28                                 ` Peter Zijlstra
2019-05-02 20:48         ` Steven Rostedt
2019-05-06 15:14         ` Josh Poimboeuf
2019-05-01 20:28 ` [RFC][PATCH 2/2] ftrace/x86: Emulate call function while updating in breakpoint handler Steven Rostedt
2019-05-03 10:22 ` [RFC][PATCH 1.5/2] x86: Add int3_emulate_call() selftest Peter Zijlstra
2019-05-03 18:46   ` Steven Rostedt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190506201014.484e7b65@oasis.local.home \
    --to=rostedt@goodmis.org \
    --cc=akpm@linux-foundation.org \
    --cc=bigeasy@linutronix.de \
    --cc=bp@alien8.de \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=jikos@kernel.org \
    --cc=joe.lawrence@redhat.com \
    --cc=jpoimboe@redhat.com \
    --cc=jroedel@suse.de \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=mbenes@suse.cz \
    --cc=mhiramat@kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nayna@linux.ibm.com \
    --cc=ndesaulniers@google.com \
    --cc=nstange@suse.de \
    --cc=peterz@infradead.org \
    --cc=pmladek@suse.com \
    --cc=shuah@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=tim.c.chen@linux.intel.com \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    --cc=yamada.masahiro@socionext.com \
    --cc=zohar@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox