public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Sven Schnelle <svens@linux.ibm.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org
Subject: Re: ftrace startup tests crashing due to missing rcu_synchronize()
Date: Wed, 16 Feb 2022 20:23:55 +0100	[thread overview]
Message-ID: <yt9dee424nuc.fsf@linux.ibm.com> (raw)
In-Reply-To: <20220216135419.01d96fe1@gandalf.local.home> (Steven Rostedt's message of "Wed, 16 Feb 2022 13:54:19 -0500")

Steven Rostedt <rostedt@goodmis.org> writes:

> On Wed, 16 Feb 2022 19:39:03 +0100
> Sven Schnelle <svens@linux.ibm.com> wrote:
>> [    4.460091] Unable to handle kernel pointer dereference in virtual kernel address space
>> [    4.460375] Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
>> [    4.460458] Fault in home space mode while using kernel ASCE.
>> [    4.460695] AS:000000008561c007 R3:0000000000000024
>> [    4.461143] Oops: 0038 ilc:3 [#1] PREEMPT SMP
>> [    4.461162] Modules linked in:
>> [    4.461175] CPU: 245 PID: 0 Comm: swapper/245 Not tainted 5.17.0-rc4-00051-gc5d9ae265b10-dirty #4
>> [    4.461183] Hardware name: IBM 8561 T01 701 (KVM/Linux)
>
> I this a 390?

Yes.

>> Looking at unregister_ftrace_function(), i noticed that
>> ftrace_shutdown() is called with 0 as command. Given that the ftrace
>> function didn't change and ftrace is still enabled, the
>> rcu_synchronize() functions in ftrace_shutdown() are silently skipped.
>> So the caller frees ops already before other CPUs have gone through
>> quiesce, and may therefore use the old (now freed) list entry.
>> 
>> To fix this, i wonder whether we should change the code in
>> unregister_ftrace_function() to:
>> 
>> @@ -7827,7 +7837,7 @@ int unregister_ftrace_function(struct ftrace_ops *ops)
>>         int ret;
>>  
>>         mutex_lock(&ftrace_lock);
>> -       ret = ftrace_shutdown(ops, 0);
>> +       ret = ftrace_shutdown(ops, FTRACE_UPDATE_TRACE_FUNC);
>
> No, the ftrace_shutdown() will add that flag if it is needed.
>
>>         mutex_unlock(&ftrace_lock);
>>  
>>         return ret;
>> 
>> I haven't checked whether other callsites of unregister_ftrace_function()
>> also need to be adjusted. What do you think about that 'fix'?
>
> But what I'm thinking is, the function is being freed but has yet to be
> removed from the list. Or that a synchronization is missed.
>
> That is, shutdown is called, the item is removed from the list and freed,
> but something got preempted while on the ftrace trampoline, with a
> reference to the item, and then woke up and executed the item that was
> freed.
>
> I'll look into it. Thanks for the report.

With additional debugging i see:

@@ -2967,14 +2974,17 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
        }
 
        if (!command || !ftrace_enabled) {
+               pr_err("%s: skipping rcu_synchronize(): ops=%pS command=%d ftrace_enabled=%d saved func=%pS ftrace_trace_func=%pS\n",
+                      __func__, ops, command, ftrace_enabled, saved_ftrace_func, ftrace_trace_function);
                /*
                 * If these are dynamic or per_cpu ops, they still
                 * need their data freed. Since, function tracing is
                 * not currently active, we can just free them
                 * without synchronizing all CPUs.
                 */

[  +0.000011] unregister_ftrace_function: 0x2f6792e00
[  +0.000023] removing ops 00000002f6792e00 trace_selftest_test_dyn_func+0x0/0x18
[  +0.000661] ftrace_shutdown: skipping rcu_synchronize(): ops=0x2f6792e00 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x0/0x1b0 ft>
[  +0.010032] unregister_ftrace_function: test_probe1+0x0/0x1b0
[  +0.000017] removing ops 000000009d876e40 trace_selftest_test_probe1_func+0x0/0x18
[  +0.000053] ftrace_shutdown: skipping rcu_synchronize(): ops=test_probe1+0x0/0x1b0 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x>
[  +0.000017] unregister_ftrace_function: test_probe2+0x0/0x1b0
[  +0.000081] removing ops 000000009d876ff0 trace_selftest_test_probe2_func+0x0/0x18
[  +0.000064] ftrace_shutdown: skipping rcu_synchronize(): ops=test_probe2+0x0/0x1b0 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x>
[  +0.000015] unregister_ftrace_function: test_probe3+0x0/0x1b0
[  +0.000011] removing ops 000000009d8771a0 trace_selftest_test_probe3_func+0x0/0x18
[  +0.000108] unregister_ftrace_function: global_ops+0x0/0x1b0
[  +0.000010] removing ops 000000009d9325b8 trace_selftest_test_global_func+0x0/0x18
[  +0.025759] PASSED

So the rcu_synchronize is definitely skipped. Another thing i was
wondering was whether we need to reset the next pointer in
the to-be-removed entry in remove_ftrace_ops(). But i haven't
investigated that in detail yet.

      parent reply	other threads:[~2022-02-16 19:24 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-16 18:39 ftrace startup tests crashing due to missing rcu_synchronize() Sven Schnelle
2022-02-16 18:54 ` Steven Rostedt
2022-02-16 18:58   ` Steven Rostedt
2022-02-16 19:31     ` Sven Schnelle
2022-02-16 19:23   ` Sven Schnelle [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yt9dee424nuc.fsf@linux.ibm.com \
    --to=svens@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox