From: "tiejun.chen" <tiejun.chen@windriver.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Jim Keniston <jkenisto@linux.vnet.ibm.com>,
Anton Blanchard <anton@au1.ibm.com>,
linux-kernel <linux-kernel@vger.kernel.org>,
Steven Rostedt <rostedt@goodmis.org>,
Yong Zhang <yong.zhang0@gmail.com>,
paulus@samba.org, yrl.pp-manager.tt@hitachi.com,
Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [BUG?]3.0-rc4+ftrace+kprobe: set kprobe at instruction 'stwu' lead to system crash/freeze
Date: Wed, 30 Nov 2011 19:06:19 +0800 [thread overview]
Message-ID: <4ED60E2B.8030603@windriver.com> (raw)
In-Reply-To: <1322626752.21641.22.camel@pasglop>
Benjamin Herrenschmidt wrote:
> On Fri, 2011-07-01 at 18:03 +0800, tiejun.chen wrote:
>> Here emulate_step() is called to emulate 'stwu'. Actually this is equivalent to
>> 1> update pr_regs->gpr[1] = mem(old r1 + (-A))
>> 2> 'stw <old r1>, mem<(old r1 + (-A)) >
>>
>> You should notice the stack based on new r1 would be covered with mem<old r1
>> +(-A)>. So after this, the kernel exit from post_krpobe, something would be
>> broken. This should depend on sizeof(-A).
>>
>> For kprobe show_interrupts, you can see pregs->nip is re-written violently so
>> kernel issued.
>>
>> But sometimes we may only re-write some violate registers the kernel still
>> alive. And so this is just why the kernel works well for some kprobed point
>> after you change some kernel options/toolchains.
>>
>> If I'm correct its difficult to kprobe these stwu sp operation since the
>> sizeof(-A) is undermined for the kernel. So we have to implement in-depend
>> interrupt stack like PPC64.
>
> So I've spent a lot of time trying to find a better way to fix that bug
> and I think I might have finally found one :-)
I can understand what you mean in below since I remember you already clarified
this way previously.
>
> - When you try to emulate stwcx on the kernel stack (and only there),
I think it should be stwu/stdu.
> don't actually perform the store. Set a TIF flag instead to indicate
> special processing in the exception return path and store the address to
> update somewhere either in a free slot of the stack frame itself of
> somewhere in the thread struct (the former would be easier). You may as
> well do some sanity checking on the value while at it to catch errors
> early.
>
> - In the exception return code, we already test for various TIF flags
> (*** see comment below, it's even buggy today for preempt ***), so we
> add a test for that flag and go to do_work.
>
> - At the end of do_work, we check for this TIF flag. If it's not set or
> any other flag is set, move on as usual. However, if it's the only flag
> still set:
>
> - Copy the exception frame we're about to unwind to just -below- the
> new r1 value we want to write to. Then perform the write, and change
> r1 to point to that copy of the frame.
>
> - Branch to restore: which will unwind everything from that copy of
> the frame, and eventually set r1 to GPR(1) in the copy which contains
> the new value of r1.
We still can't restore this there. As you know this emulated store instruction
can touch any filed inside pt_regs. Sometimes nip would be involved for this
problematic location. And unfortunately, this is just that we meet currently. So
we have to go exc_exit_restart.
.globl exc_exit_restart
exc_exit_restart:
lwz r11,_NIP(r1)
lwz r12,_MSR(r1)
I mean we have to do that real restore here. So I'm really not sure if its a
good way to add such a codes including check TIF/copy-get new r1/restore
operation here since this is so deep for the exception return code.
exc_exit_start:
mtspr SPRN_SRR0,r11
mtspr SPRN_SRR1,r12
>
> This is the less intrusive approach and should work just fine, it's also
> more robust than anything I've been able to think of and the approach
> would work for 32 and 64-bit similarily.
>
> (***) Above comment about a bug: If you look at entry_64.S version of
> ret_from_except_lite you'll notice that in the !preempt case, after
> we've checked MSR_PR we test for any TIF flag in _TIF_USER_WORK_MASK to
> decide whether to go to do_work or not. However, in the preempt case, we
> do a convoluted trick to test SIGPENDING only if PR was set and always
> test NEED_RESCHED ... but we forget to test any other bit of
> _TIF_USER_WORK_MASK !!! So that means that with preempt, we completely
> fail to test for things like single step, syscall tracing, etc...
>
This is another problem we should address.
> I think this should be fixed at the same time, by simplifying the code
> by doing:
>
> - Test PR. If set, go to test_work_user, else continue (or the other
> way around and call it test_work_kernel)
>
> - In test_work_user, always test for _TIF_USER_WORK_MASK to decide to
> go to do_work, maybe call it do_user_work
>
> - In test_work_kernel, test for _TIF_KERNEL_WORK_MASK which is set to
> our new flag along with NEED_RESCHED if preempt is enabled and branch to
> do_kernel_work.
>
> do_user_work is basically the same as today's user_work
>
> do_kernel_work is basically the same as today preempt block with added
> code to handle the new flag as described above.
>
> Is anybody volunteering for fixing that ? I don't have the bandwidth
I always use one specific kprobe stack to fix this for BOOKE and work well in my
local tree :) Do you remember my v3 patch? I think its possible to extend this
for all PPC variants.
Anyway, I'd like to be this volunteer with our last solution.
Tiejun
> right now, but if nobody shows up I suppose I'll have to eventually deal
> with it myself :-)
>
> Cheers,
> Ben.
next prev parent reply other threads:[~2011-11-30 11:05 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-24 9:21 [BUG?]3.0-rc4+ftrace+kprobe: set kprobe at instruction 'stwu' lead to system crash/freeze Yong Zhang
2011-06-24 10:29 ` Steven Rostedt
2011-06-26 14:47 ` Masami Hiramatsu
2011-06-27 10:01 ` Ananth N Mavinakayanahalli
2011-06-28 10:41 ` Ananth N Mavinakayanahalli
2011-06-28 13:15 ` Steven Rostedt
2011-06-29 6:41 ` Yong Zhang
2011-06-29 6:23 ` Yong Zhang
2011-06-29 6:46 ` Ananth N Mavinakayanahalli
2011-06-30 7:08 ` Yong Zhang
2011-07-01 10:03 ` tiejun.chen
2011-07-04 2:23 ` Yong Zhang
2011-11-30 4:19 ` Benjamin Herrenschmidt
2011-11-30 11:06 ` tiejun.chen [this message]
2011-11-30 21:00 ` Benjamin Herrenschmidt
2011-12-01 10:44 ` tiejun.chen
2011-12-01 21:37 ` Benjamin Herrenschmidt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ED60E2B.8030603@windriver.com \
--to=tiejun.chen@windriver.com \
--cc=anton@au1.ibm.com \
--cc=benh@kernel.crashing.org \
--cc=jkenisto@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=masami.hiramatsu.pt@hitachi.com \
--cc=paulus@samba.org \
--cc=rostedt@goodmis.org \
--cc=yong.zhang0@gmail.com \
--cc=yrl.pp-manager.tt@hitachi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).