From: 王贇 <yun.wang@linux.alibaba.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Guo Ren <guoren@kernel.org>, Steven Rostedt <rostedt@goodmis.org>,
Ingo Molnar <mingo@redhat.com>,
"James E.J. Bottomley" <James.Bottomley@hansenpartnership.com>,
Helge Deller <deller@gmx.de>,
Michael Ellerman <mpe@ellerman.id.au>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Thomas Gleixner <tglx@linutronix.de>,
Borislav Petkov <bp@alien8.de>,
x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
Josh Poimboeuf <jpoimboe@redhat.com>,
Jiri Kosina <jikos@kernel.org>, Miroslav Benes <mbenes@suse.cz>,
Joe Lawrence <joe.lawrence@redhat.com>,
Colin Ian King <colin.king@canonical.com>,
Masami Hiramatsu <mhiramat@kernel.org>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Nicholas Piggin <npiggin@gmail.com>,
Jisheng Zhang <jszhang@kernel.org>,
linux-csky@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-parisc@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-riscv@lists.infradead.org, live-patching@vger.kernel.org
Subject: Re: [PATCH v3 1/2] ftrace: disable preemption between ftrace_test_recursion_trylock/unlock()
Date: Fri, 15 Oct 2021 17:12:26 +0800 [thread overview]
Message-ID: <3c87e825-e907-cba0-e95f-28878356fc71@linux.alibaba.com> (raw)
In-Reply-To: <YWktujP6DcMnRIXT@alley>
On 2021/10/15 下午3:28, Petr Mladek wrote:
> On Fri 2021-10-15 11:13:08, 王贇 wrote:
>>
>>
>> On 2021/10/14 下午11:14, Petr Mladek wrote:
>> [snip]
>>>> - return trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
>>>> + int bit;
>>>> +
>>>> + bit = trace_test_and_set_recursion(ip, parent_ip, TRACE_FTRACE_START, TRACE_FTRACE_MAX);
>>>> + /*
>>>> + * The zero bit indicate we are nested
>>>> + * in another trylock(), which means the
>>>> + * preemption already disabled.
>>>> + */
>>>> + if (bit > 0)
>>>> + preempt_disable_notrace();
>>>
>>> Is this safe? The preemption is disabled only when
>>> trace_test_and_set_recursion() was called by ftrace_test_recursion_trylock().
>>>
>>> We must either always disable the preemtion when bit >= 0.
>>> Or we have to disable the preemtion already in
>>> trace_test_and_set_recursion().
>>
>> Internal calling of trace_test_and_set_recursion() will disable preemption
>> on succeed, it should be safe.
>
> trace_test_and_set_recursion() does _not_ disable preemtion!
> It works only because all callers disable the preemption.
Yup.
>
> It means that the comment is wrong. It is not guarantted that the
> preemption will be disabled. It works only by chance.
>
>
>> We can also consider move the logical into trace_test_and_set_recursion()
>> and trace_clear_recursion(), but I'm not very sure about that... ftrace
>> internally already make sure preemption disabled
>
> How? Because it calls trace_test_and_set_recursion() via the trylock() API?
I mean currently all the direct caller of trace_test_and_set_recursion()
have disabled the preemption as you mentioned above, but yes if anyone later
write some kernel code to call trace_test_and_set_recursion() without
disabling preemption after, then promise broken.
>
>
>> , what uncovered is those users who call API trylock/unlock, isn't
>> it?
>
> And this is exactly the problem. trace_test_and_set_recursion() is in
> a public header. Anyone could use it. And if anyone uses it in the
> future without the trylock() and does not disable the preemtion
> explicitely then we are lost again.
>
> And it is even more dangerous. The original code disabled the
> preemtion on various layers. As a result, the preemtion was disabled
> several times for sure. It means that the deeper layers were
> always on the safe side.
>
> With this patch, if the first trace_test_and_set_recursion() caller
> does not disable preemtion then trylock() will not disable it either
> and the entire code is procceed with preemtion enabled.
Yes, what confusing me at first is that I think people who call
trace_test_and_set_recursion() without trylock() can only be a
ftrace hacker, not a user, but in case if anyone can use it without
respect to preemption stuff, then I think the logical should be inside
trace_test_and_set_recursion() rather than trylock().
>
>
>>> Finally, the comment confused me a lot. The difference between nesting and
>>> recursion is far from clear. And the code is tricky liky like hell :-)
>>> I propose to add some comments, see below for a proposal.
>> The comments do confusing, I'll make it something like:
>>
>> The zero bit indicate trace recursion happened, whatever
>> the recursively call was made by ftrace handler or ftrace
>> itself, the preemption already disabled.
>
> I am sorry but it is still confusing. We need to find a better way
> how to clearly explain the difference between the safe and
> unsafe recursion.
>
> My understanding is that the recursion is:
>
> + "unsafe" when the trace code recursively enters the same trace point.
>
> + "safe" when ftrace_test_recursion_trylock() is called recursivelly
> while still processing the same trace entry.
Maybe take some example would be easier to understand...
Currently there are two way of using ftrace_test_recursion_trylock(),
one with TRACE_FTRACE_XXX we mark as A, one with TRACE_LIST_XXX we mark
as B, then:
A followed by B on same context got bit > 0
B followed by A on any context got bit 0
A followed by A on same context got bit > 0
A followed by A followed by A on same context got bit -1
B followed by B on same context got bit > 0
B followed by B followed by B on same context got bit -1
If we get rid of the TRACE_TRANSITION_BIT which allowed recursion for
onetime, then it would be:
A followed by B on same context got bit > 0
B followed by A on any context got bit 0
A followed by A on same context got bit -1
B followed by B on same context got bit -1
So as long as no continuously AAA it's safe?
>
>>>> +
>>>> + return bit;
>>>> }
>>>> /**
>>>> @@ -222,9 +233,13 @@ static __always_inline int ftrace_test_recursion_trylock(unsigned long ip,
>>>> * @bit: The return of a successful ftrace_test_recursion_trylock()
>>>> *
>>>> * This is used at the end of a ftrace callback.
>>>> + *
>>>> + * Preemption will be enabled (if it was previously enabled).
>>>> */
>>>> static __always_inline void ftrace_test_recursion_unlock(int bit)
>>>> {
>>>> + if (bit)
>>>
>>> This is not symetric with trylock(). It should be:
>>>
>>> if (bit > 0)
>>>
>>> Anyway, trace_clear_recursion() quiently ignores bit != 0
>>
>> Yes, bit == 0 should not happen in here.
>
> Yes, it "should" not happen. My point is that we could make the API
> more safe. We could do the right thing when
> ftrace_test_recursion_unlock() is called with negative @bit.
> Ideally, we should also warn about the mis-use.
Agree with a WARN here on bit 0.
>
>
> Anyway, let's wait for Steven. It seems that he found another problem
> with the API that should be solved first. The fix will probably
> also help to better understand the "safe" vs "unsafe" recursion.
Cool~
Regards,
Michael Wang
>
> Best Regards,
> Petr
>
next prev parent reply other threads:[~2021-10-15 9:12 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-13 8:51 [PATCH v3 0/2] fix & prevent the missing preemption disabling 王贇
2021-10-13 8:51 ` [PATCH v3 1/2] ftrace: disable preemption between ftrace_test_recursion_trylock/unlock() 王贇
2021-10-14 9:13 ` Miroslav Benes
2021-10-14 9:22 ` 王贇
2021-10-14 13:28 ` Steven Rostedt
2021-10-14 15:14 ` Petr Mladek
2021-10-15 3:13 ` 王贇
2021-10-15 4:45 ` 王贇
2021-10-15 7:28 ` Petr Mladek
2021-10-15 9:12 ` 王贇 [this message]
2021-10-15 13:35 ` Steven Rostedt
2021-10-15 3:39 ` Steven Rostedt
2021-10-13 8:52 ` [PATCH v3 2/2] ftrace: do CPU checking after preemption disabled 王贇
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3c87e825-e907-cba0-e95f-28878356fc71@linux.alibaba.com \
--to=yun.wang@linux.alibaba.com \
--cc=James.Bottomley@hansenpartnership.com \
--cc=aou@eecs.berkeley.edu \
--cc=benh@kernel.crashing.org \
--cc=bp@alien8.de \
--cc=colin.king@canonical.com \
--cc=deller@gmx.de \
--cc=guoren@kernel.org \
--cc=hpa@zytor.com \
--cc=jikos@kernel.org \
--cc=joe.lawrence@redhat.com \
--cc=jpoimboe@redhat.com \
--cc=jszhang@kernel.org \
--cc=linux-csky@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-parisc@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=live-patching@vger.kernel.org \
--cc=mbenes@suse.cz \
--cc=mhiramat@kernel.org \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=npiggin@gmail.com \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=paulus@samba.org \
--cc=peterz@infradead.org \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).