linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Daniel Mack <zonque@gmail.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: linux-arch@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@ZenIV.linux.org.uk>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	linux-kernel@vger.kernel.org
Subject: Re: [git pull] signals pile 3
Date: Mon, 15 Oct 2012 00:39:40 +0200	[thread overview]
Message-ID: <507B3F2C.1090500@gmail.com> (raw)
In-Reply-To: <20121014222449.GN21164@n2100.arm.linux.org.uk>

On 15.10.2012 00:24, Russell King - ARM Linux wrote:
> Okay, here's the post-mortem diagnosis.
> 
> What's happening is as follows (I'm very certain of this.)
> 
> We come through the usual init, and issue (see init/main.c):
> 
> 	kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
> 
> This creates a new thread, which falls through to the ret_from_fork
> assembly, with r4 set NULL and r5 set to kernel_init.  You can see
> this in your oops dump register set - r5 is 0xc0344555, which is the
> address of kernel_init plus 1 which marks the function as Thumb code.
> 
> Now, let's look at this code a little closer - this is what the
> disassembly looks like:
> 
> c000d180 <ret_from_fork>:
> c000d180:       f03a fe08       bl      c0047d94 <schedule_tail>
> c000d184:       2d00            cmp     r5, #0
> c000d186:       bf1e            ittt    ne
> c000d188:       4620            movne   r0, r4
> c000d18a:       46fe            movne   lr, pc <-- XXXXXXX
> c000d18c:       46af            movne   pc, r5
> c000d18e:       46e9            mov     r9, sp
> c000d190:       ea4f 3959       mov.w   r9, r9, lsr #13
> c000d194:       ea4f 3949       mov.w   r9, r9, lsl #13
> c000d198:       e7c8            b.n     c000d12c <ret_to_user>
> c000d19a:       bf00            nop
> c000d19c:       f3af 8000       nop.w
> 
> I have marked one instruction, and it's the significant one.
> 
> Eventually, having had a successful call to kernel_execve(), kernel_init()
> returns zero.
> 
> In returning, it uses the value in 'lr' which was set by the instruction
> I marked above.  Unfortunately, this causes lr to contain 0xc000d18e -
> an even address.  This switches the ISA to ARM on return but with a non
> word aligned PC value.
> 
> So, what do we end up executing?  Well, not the instructions above - yes
> the opcodes, but they don't mean the same thing in ARM mode.  In ARM mode,
> it looks like this instead:
> 
> c000d18c:       46e946af        strbtmi r4, [r9], pc, lsr #13
> c000d190:       3959ea4f        ldmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
> c000d194:       3949ea4f        stmdbcc r9, {r0, r1, r2, r3, r6, r9, fp, sp, lr, pc}^
> c000d198:       bf00e7c8        svclt   0x0000e7c8
> c000d19c:       8000f3af        andhi   pc, r0, pc, lsr #7
> c000d1a0:       e88db092        stm     sp, {r1, r4, r7, ip, sp, pc}
> c000d1a4:       46e81fff                        ; <UNDEFINED> instruction: 0x46e81fff
> c000d1a8:       8a00f3ef        bhi     0xc004a16c
> c000d1ac:       0a0cf08a        beq     0xc03493dc
> 
> I have included more above, because it's relevant.  The PSR flags which we
> can see in the oops dump are nZCv, so Z and C are set.
> 
> All the above ARM instructions are not executed, except for two.  c000d1a0,
> which has no writeback, and writes below the current stack pointer (and
> that data is lost when we take the next exception.)  The other instruction
> which is executed is c000d1ac, which takes us to... 0xc03493dc.  However,
> remember that bit 1 of the PC got set.  So that makes it 0xc03493de.
> 
> And that value is the value we find in the oops dump for PC.  What is the
> instruction here when interpreted in ARM mode?
> 
>        0:       f71e150c                ; <UNDEFINED> instruction: 0xf71e150c
> 
> and there we have our undefined instruction (remember that the 'never'
> condition code, 0xf, has been deprecated and is now always executed.)
> 
> So, what we have above is a consistent and sane story for how we ended up
> at such a strange place in the kernel with such an odd register dump - with
> no unanswered questions about what happened to get us there.
> 
> In light of this, I'm 100% certain that the patch below will fix the issue
> you're seeing - please test this and get back to me ASAP, thanks.

Quite impressive analysis :) And it seems you really spotted the reason
here, as your patch fixes the problem.

>  arch/arm/kernel/entry-common.S |    4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index 417bac1..3471175 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -88,9 +88,9 @@ ENTRY(ret_from_fork)
>  	bl	schedule_tail
>  	cmp	r5, #0
>  	movne	r0, r4
> -	movne	lr, pc
> +	adrne	lr, BSYM(1f)
>  	movne	pc, r5
> -	get_thread_info tsk
> +1:	get_thread_info tsk
>  	b	ret_slow_syscall
>  ENDPROC(ret_from_fork)

Tested-by: Daniel Mack <zonque@gmail.com>

Many thanks for the very prompt response!


Daniel

  reply	other threads:[~2012-10-14 22:39 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-10-13  0:53 [git pull] signals pile 3 Al Viro
2012-10-14 15:35 ` Daniel Mack
2012-10-14 16:40   ` [revert request for commit 9fff2fa] " Al Viro
2012-10-14 16:44     ` Daniel Mack
2012-10-14 17:26       ` Al Viro
2012-10-14 17:55         ` Al Viro
2012-10-14 18:21           ` Daniel Mack
2012-10-14 18:21             ` Daniel Mack
2012-10-14 19:06             ` Al Viro
2012-10-14 19:24         ` Al Viro
2012-10-14 19:56           ` Al Viro
2012-10-15 16:07             ` Catalin Marinas
2012-10-15 16:27               ` Al Viro
2012-10-15 17:06                 ` Catalin Marinas
2012-10-14 20:24   ` Russell King - ARM Linux
2012-10-14 22:24     ` Russell King - ARM Linux
2012-10-14 22:39       ` Daniel Mack [this message]
2012-10-14 23:16         ` Russell King - ARM Linux
2012-10-14 23:16           ` Russell King - ARM Linux
2012-10-16 14:04           ` Uwe Kleine-König
2012-10-16 14:05             ` Russell King - ARM Linux
2012-10-16 14:05               ` Russell King - ARM Linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=507B3F2C.1090500@gmail.com \
    --to=zonque@gmail.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).