public inbox for linux-arm-kernel@lists.infradead.org
* A bug about system call on ARM
       [not found]     ` <35FD53F367049845BC99AC72306C23D1610991B85E@CNBJMBX05.corpusers.net>
@ 2013-05-29  8:46       ` richard -rw- weinberger
  2013-05-29  9:48         ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: richard -rw- weinberger @ 2013-05-29  8:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, May 29, 2013 at 10:24 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> Hi
>
> I have downloaded the latest Linux kernel code, 3.9.4,
> and compared it with the 3.4.0 kernel.
>
> There seems to be no change in this part,
> so the crash will still happen.
> Does anyone know who is responsible for the ARM arch part of the kernel code?

See MAINTAINERS file.
CC'ing linux-arm-kernel at lists.infradead.org

>
> Thanks
>
>
> -----Original Message-----
> From: Wang, Yalin
> Sent: Wednesday, May 29, 2013 3:38 PM
> To: 'richard -rw- weinberger'
> Cc: linux-arch at vger.kernel.org; linux-kernel at vger.kernel.org
> Subject: RE: A bug about system call on ARM
>
> Hi  Richard,
>
> Thanks for your reply ,
> I will make a check for this .
>
>
> -----Original Message-----
> From: richard -rw- weinberger [mailto:richard.weinberger at gmail.com]
> Sent: Wednesday, May 29, 2013 3:35 PM
> To: Wang, Yalin
> Cc: linux-arch at vger.kernel.org; linux-kernel at vger.kernel.org
> Subject: Re: A bug about system call on ARM
>
> Hi!
>
> On Wed, May 29, 2013 at 8:52 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
>> Hi  all,
>>
>> I am a newcomer to this mailing list, and I am happy to join this
>> community.
>>
>> I have a bug reported from our Android phones that is triggered by a system call.
>> From my point of view, it looks like a kernel bug.
>
> Is this an unmodified Linux kernel from kernel.org? In other words, no (half-broken) board support package from your hardware vendor?
> Did you try a more recent kernel? (At least 3.4.47).
> Maybe your problem is already known and fixed...
>
>> The crash is in the file arch/arm/kernel/entry-common.S
>>
>> /***************************************************************/
>>
>> ENTRY(vector_swi)
>>         sub     sp, sp, #S_FRAME_SIZE
>>         stmia   sp, {r0 - r12}                  @ Calling r0 - r12
>>  ARM(   add     r8, sp, #S_PC           )
>>  ARM(   stmdb   r8, {sp, lr}^           )       @ Calling sp, lr
>>  THUMB( mov     r8, sp                  )
>>  THUMB( store_user_sp_lr r8, r10, S_SP  )       @ calling sp, lr
>>         mrs     r8, spsr                        @ called from non-FIQ mode, so ok.
>>         str     lr, [sp, #S_PC]                 @ Save calling PC
>>         str     r8, [sp, #S_PSR]                @ Save CPSR
>>         str     r0, [sp, #S_OLD_R0]             @ Save OLD_R0
>>         zero_fp
>>
>>         /*
>>          * Get the system call number.
>>          */
>>
>> #if defined(CONFIG_OABI_COMPAT)
>>
>>         /*
>>          * If we have CONFIG_OABI_COMPAT then we need to look at the swi
>>          * value to determine if it is an EABI or an old ABI call.
>>          */
>> #ifdef CONFIG_ARM_THUMB
>>         tst     r8, #PSR_T_BIT
>>         movne   r10, #0                         @ no thumb OABI emulation
>>         ldreq   r10, [lr, #-4]                  @ get SWI instruction          // crashes here, when loading the SWI instruction
>> #else
>>         ldr     r10, [lr, #-4]                  @ get SWI instruction
>>   A710( and     ip, r10, #0x0f000000            @ check for SWI         )
>>   A710( teq     ip, #0x0f000000                                         )
>>   A710( bne     .Larm710bug                                             )
>> #endif
>> #ifdef CONFIG_CPU_ENDIAN_BE8
>>         rev     r10, r10                        @ little endian instruction
>> #endif
>>
>> /***************************************************************/
>>
>> The reason it crashes when loading the SWI instruction may be that this
>> page has been marked old (aged) by the kernel. Because the MMU fault
>> happened in kernel mode, the kernel's do_page_fault function does not
>> mark the page young again, so the kernel crashes.
>>
>> The kernel should either fault the page back in so that it is present, or
>> handle the fault through a fixup section. Either way, the kernel should not crash here.
>>
>> The kernel version I am using is 3.4.0.
>> I have attached the kernel log and the call stack recovered with the
>> Trace32 tools. Please have a look.
>>
>>
>> Thanks .
>
>
>
> --
> Thanks,
> //richard



--
Thanks,
//richard


* A bug about system call on ARM
  2013-05-29  8:46       ` A bug about system call on ARM richard -rw- weinberger
@ 2013-05-29  9:48         ` Will Deacon
       [not found]           ` <35FD53F367049845BC99AC72306C23D1610991B865@CNBJMBX05.corpusers.net>
  0 siblings, 1 reply; 26+ messages in thread
From: Will Deacon @ 2013-05-29  9:48 UTC (permalink / raw)
  To: linux-arm-kernel

Hello,

On Wed, May 29, 2013 at 09:46:42AM +0100, richard -rw- weinberger wrote:
> On Wed, May 29, 2013 at 10:24 AM, Wang, Yalin <Yalin.Wang@sonymobile.com> wrote:
> > I have downloaded the latest Linux kernel code, 3.9.4,
> > and compared it with the 3.4.0 kernel.
> >
> > There seems to be no change in this part,
> > so the crash will still happen.
> > Does anyone know who is responsible for the ARM arch part of the kernel code?
> 
> See MAINTAINERS file.
> CC'ing linux-arm-kernel at lists.infradead.org

Cheers for adding us to CC.

> >> #ifdef CONFIG_ARM_THUMB
> >>         tst     r8, #PSR_T_BIT
> >>         movne   r10, #0                         @ no thumb OABI emulation
> >>         ldreq   r10, [lr, #-4]                  @ get SWI instruction          // crashes here, when loading the SWI instruction

Do you have the panic log please? Also, which SoC are you using and how are
you reproducing this?

> >>         ldr     r10, [lr, #-4]                  @ get SWI instruction
> >>   A710( and     ip, r10, #0x0f000000            @ check for SWI         )
> >>   A710( teq     ip, #0x0f000000                                         )
> >>   A710( bne     .Larm710bug                                             )
> >> #endif
> >> #ifdef CONFIG_CPU_ENDIAN_BE8
> >>         rev     r10, r10                        @ little endian instruction
> >> #endif
> >>
> >> /***************************************************************/
> >>
> >> The reason it crashes when loading the SWI instruction may be that this
> >> page has been marked old (aged) by the kernel. Because the MMU fault
> >> happened in kernel mode, the kernel's do_page_fault function does not
> >> mark the page young again, so the kernel crashes.

Sounds like we might need some USER annotations around the instruction
loads, but we should also rework the code so that we re-enable interrupts
first.

Will


* A bug about system call on ARM
       [not found]           ` <35FD53F367049845BC99AC72306C23D1610991B865@CNBJMBX05.corpusers.net>
@ 2013-05-30  1:41             ` Wang, Yalin
  2013-05-30  9:09               ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-05-30  1:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi  Will,

Have you received the log files?

Is anyone looking at this issue now?
The issue occurred on Qualcomm Scorpion CPUs,
and it only happens occasionally during our stability tests.

If you have a patch for this issue,
I can test it.

Thank you very much for your help!

-----Original Message-----
From: Wang, Yalin 
Sent: Wednesday, May 29, 2013 5:51 PM
To: 'Will Deacon'; richard -rw- weinberger
Cc: linux-arch at vger.kernel.org; linux-kernel at vger.kernel.org; linux-arm-kernel at lists.infradead.org
Subject: RE: A bug about system call on ARM

Hi 

Here are kernel.log and the call stack recovered with the Trace32 tools.
Please have a look.

Thanks 



* A bug about system call on ARM
  2013-05-30  1:41             ` Wang, Yalin
@ 2013-05-30  9:09               ` Will Deacon
  2013-05-30 11:41                 ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Will Deacon @ 2013-05-30  9:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 30, 2013 at 02:41:42AM +0100, Wang, Yalin wrote:
> Hi  Will,

Hello,

> Have you received the log files?

Yep, and you seem to be completely correct: CPU0 ages the page from which
CPU1 just executed a system call, so we explode trying to load the swi
instruction in order to retrieve the immediate.
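
To make the failing load concrete: with OABI compat enabled, the kernel
reads the swi instruction back from user memory and inspects its 24-bit
immediate to tell an OABI call from an EABI one. A minimal C sketch of
that decode follows. It is an illustration only, not the kernel's code:
the function names are made up here, and 0x900000 is the usual OABI
syscall base rather than a value quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* OABI calls encode the syscall number in the SWI immediate, offset by this base. */
#define OABI_SYSCALL_BASE 0x900000u

/* Return the 24-bit immediate of an ARM-state SWI instruction word. */
static uint32_t swi_immediate(uint32_t insn)
{
	/* Same sanity check as the A710 path: bits 27..24 must all be set. */
	assert((insn & 0x0f000000u) == 0x0f000000u);
	return insn & 0x00ffffffu;
}

/*
 * Return the syscall number for a call made from ARM state.
 * insn is the word loaded from [lr, #-4] (the load that faults in this
 * thread); r7 is the caller's r7.
 */
static uint32_t syscall_number(uint32_t insn, uint32_t r7)
{
	uint32_t imm = swi_immediate(insn);

	if (imm == 0)
		return r7;			/* EABI: "swi 0", number in r7 */
	return imm ^ OABI_SYSCALL_BASE;		/* OABI: number in the immediate */
}
```

For example, syscall_number(0xef900004, 0) and syscall_number(0xef000000, 4)
both yield 4. The point is that the OABI case cannot be decoded without
touching the user page that holds the instruction.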

> And is there someone looking at this issue now ?

It's on my list, but I'm pretty busy right now and OABI-compat isn't high
priority. Are you actually running OABI binaries? If not, you can simply
turn that option off (in fact, a quick fix to this issue is to make that
depend on !SMP).

> The issue occurred on Qualcomm Scorpion CPUs,
> and it only happens occasionally during our stability tests.
> 
> If you have a patch for this issue,
> I can test it.

I'll have a look at cooking something which uses an exception table entry
to rewind the PC and retry the system call. That's simpler than directly
injecting a user page fault from the system call path.
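
The exception-table approach can be pictured with a small C model: each
entry pairs the address of an instruction that is allowed to fault with
the address of its fixup code, and the fault handler looks up the
faulting PC before giving up. This is a sketch under assumptions; the
struct, the function names and any addresses used with it are
hypothetical and do not mirror the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry: an instruction that may fault, and where to resume if it does. */
struct ex_entry {
	uint32_t insn;
	uint32_t fixup;
};

/* Binary search over a table sorted by insn; returns 0 when nothing matches. */
static uint32_t find_fixup(const struct ex_entry *tbl, size_t n, uint32_t pc)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (tbl[mid].insn == pc)
			return tbl[mid].fixup;
		if (tbl[mid].insn < pc)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/* The fixup proposed here: step the saved user PC back onto the swi. */
static uint32_t rewind_to_swi(uint32_t saved_pc)
{
	assert(saved_pc >= 4);
	return saved_pc - 4;	/* ARM-state instructions are 4 bytes long */
}
```

Returning to user space with the rewound PC makes the CPU fetch the swi
again; that fetch takes an ordinary user-mode fault, the page is made
young, and the system call is then retried.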

Will


* A bug about system call on ARM
  2013-05-30  9:09               ` Will Deacon
@ 2013-05-30 11:41                 ` Will Deacon
  2013-05-31  2:56                   ` Wang, Yalin
                                     ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Will Deacon @ 2013-05-30 11:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 30, 2013 at 10:09:49AM +0100, Will Deacon wrote:
> On Thu, May 30, 2013 at 02:41:42AM +0100, Wang, Yalin wrote:
> > If you have a patch for this issue,
> > I can test it.
> 
> I'll have a look at cooking something which uses an exception table entry
> to rewind the PC and retry the system call. That's simpler than directly
> injecting a user page fault from the system call path.

Ok, please can you try the following?

Will

--->8

diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index bc5bc0a..855926e 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -361,6 +361,15 @@ ENTRY(vector_swi)
 	str	r8, [sp, #S_PSR]		@ Save CPSR
 	str	r0, [sp, #S_OLD_R0]		@ Save OLD_R0
 	zero_fp
+	enable_irq
+	ct_user_exit
+
+#ifdef CONFIG_ALIGNMENT_TRAP
+	ldr	ip, __cr_alignment
+	ldr	ip, [ip]
+	mcr	p15, 0, ip, c1, c0		@ update control register
+#endif
+	get_thread_info tsk
 
 	/*
 	 * Get the system call number.
@@ -375,9 +384,9 @@ ENTRY(vector_swi)
 #ifdef CONFIG_ARM_THUMB
 	tst	r8, #PSR_T_BIT
 	movne	r10, #0				@ no thumb OABI emulation
-	ldreq	r10, [lr, #-4]			@ get SWI instruction
+ USER(	ldreq	r10, [lr, #-4]		)	@ get SWI instruction
 #else
-	ldr	r10, [lr, #-4]			@ get SWI instruction
+ USER(	ldr	r10, [lr, #-4]		)	@ get SWI instruction
 #endif
 #ifdef CONFIG_CPU_ENDIAN_BE8
 	rev	r10, r10			@ little endian instruction
@@ -392,22 +401,13 @@ ENTRY(vector_swi)
 	/* Legacy ABI only, possibly thumb mode. */
 	tst	r8, #PSR_T_BIT			@ this is SPSR from save_user_regs
 	addne	scno, r7, #__NR_SYSCALL_BASE	@ put OS number in
-	ldreq	scno, [lr, #-4]
+ USER(	ldreq	scno, [lr, #-4]		)
 
 #else
 	/* Legacy ABI only. */
-	ldr	scno, [lr, #-4]			@ get SWI instruction
-#endif
-
-#ifdef CONFIG_ALIGNMENT_TRAP
-	ldr	ip, __cr_alignment
-	ldr	ip, [ip]
-	mcr	p15, 0, ip, c1, c0		@ update control register
+ USER(	ldr	scno, [lr, #-4]		)	@ get SWI instruction
 #endif
-	enable_irq
-	ct_user_exit
 
-	get_thread_info tsk
 	adr	tbl, sys_call_table		@ load syscall table pointer
 
 #if defined(CONFIG_OABI_COMPAT)
@@ -442,6 +442,18 @@ local_restart:
 	eor	r0, scno, #__NR_SYSCALL_BASE	@ put OS number back
 	bcs	arm_syscall	
 	b	sys_ni_syscall			@ not private func
+
+#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
+	/*
+	 * We may have faulted trying to load the SWI instruction due to
+	 * concurrent page aging on another CPU. In this case, return
+	 * back to the swi instruction and fault the page back.
+	 */
+9001:
+	sub	lr, lr, #4
+	str	lr, [sp, #S_PC]
+	b	ret_fast_syscall
+#endif
 ENDPROC(vector_swi)
 
 	/*


* A bug about system call on ARM
  2013-05-30 11:41                 ` Will Deacon
@ 2013-05-31  2:56                   ` Wang, Yalin
  2013-05-31  8:46                     ` Will Deacon
  2013-05-31  3:54                   ` Nicolas Pitre
  2013-06-03 10:18                   ` Russell King - ARM Linux
  2 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-05-31  2:56 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

Thanks for your patch.

However, I found that I don't have the ct_user_exit macro
in my arch/arm/kernel/entry-common.S.

My kernel version is 3.4.0.

I have attached the file.

Could you make a patch against this file?

Thank you!

-------------- next part --------------
A non-text attachment was scrubbed...
Name: entry-common.S
Type: application/octet-stream
Size: 15071 bytes
Desc: entry-common.S
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130531/acc75b68/attachment-0001.obj>


* A bug about system call on ARM
  2013-05-30 11:41                 ` Will Deacon
  2013-05-31  2:56                   ` Wang, Yalin
@ 2013-05-31  3:54                   ` Nicolas Pitre
  2013-05-31  8:45                     ` Will Deacon
  2013-06-03 10:18                   ` Russell King - ARM Linux
  2 siblings, 1 reply; 26+ messages in thread
From: Nicolas Pitre @ 2013-05-31  3:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 30 May 2013, Will Deacon wrote:

> On Thu, May 30, 2013 at 10:09:49AM +0100, Will Deacon wrote:
> > On Thu, May 30, 2013 at 02:41:42AM +0100, Wang, Yalin wrote:
> > > If you have a patch for this issue,
> > > I can test it.
> > 
> > I'll have a look at cooking something which uses an exception table entry
> > to rewind the PC and retry the system call. That's simpler than directly
> > injecting a user page fault from the system call path.
> 
> Ok, please can you try the following?
> 
> Will
> 
> --->8
> 
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index bc5bc0a..855926e 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -361,6 +361,15 @@ ENTRY(vector_swi)
>  	str	r8, [sp, #S_PSR]		@ Save CPSR
>  	str	r0, [sp, #S_OLD_R0]		@ Save OLD_R0
>  	zero_fp
> +	enable_irq
> +	ct_user_exit
> +
> +#ifdef CONFIG_ALIGNMENT_TRAP
> +	ldr	ip, __cr_alignment
> +	ldr	ip, [ip]
> +	mcr	p15, 0, ip, c1, c0		@ update control register
> +#endif

This is wrong.  You must set the alignment bit in the control register
_before_ enabling IRQs, or an IRQ handler might run without alignment
fixup.

Otherwise the patch looks good to me.


Nicolas


* A bug about system call on ARM
  2013-05-31  3:54                   ` Nicolas Pitre
@ 2013-05-31  8:45                     ` Will Deacon
  0 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2013-05-31  8:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 31, 2013 at 04:54:56AM +0100, Nicolas Pitre wrote:
> On Thu, 30 May 2013, Will Deacon wrote:
> 
> > On Thu, May 30, 2013 at 10:09:49AM +0100, Will Deacon wrote:
> > > On Thu, May 30, 2013 at 02:41:42AM +0100, Wang, Yalin wrote:
> > > > If you have a patch for this issue,
> > > > I can test it.
> > > 
> > > I'll have a look at cooking something which uses an exception table entry
> > > to rewind the PC and retry the system call. That's simpler than directly
> > > injecting a user page fault from the system call path.
> > 
> > Ok, please can you try the following?
> > 
> > Will
> > 
> > --->8
> > 
> > diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> > index bc5bc0a..855926e 100644
> > --- a/arch/arm/kernel/entry-common.S
> > +++ b/arch/arm/kernel/entry-common.S
> > @@ -361,6 +361,15 @@ ENTRY(vector_swi)
> >  	str	r8, [sp, #S_PSR]		@ Save CPSR
> >  	str	r0, [sp, #S_OLD_R0]		@ Save OLD_R0
> >  	zero_fp
> > +	enable_irq
> > +	ct_user_exit
> > +
> > +#ifdef CONFIG_ALIGNMENT_TRAP
> > +	ldr	ip, __cr_alignment
> > +	ldr	ip, [ip]
> > +	mcr	p15, 0, ip, c1, c0		@ update control register
> > +#endif
> 
> This is wrong.  You must set the alignment bit in the control register
> _before_ enabling IRQs, or an IRQ handler might run without alignment
> fixup.

Okey doke, I can fix that up. I thought it was only needed for the network
layer, but I suppose they have interrupts over there too :)

> Otherwise the patch looks good to me.

Thanks Nicolas.

Will


* A bug about system call on ARM
  2013-05-31  2:56                   ` Wang, Yalin
@ 2013-05-31  8:46                     ` Will Deacon
  2013-05-31 11:02                       ` Wang, Yalin
  0 siblings, 1 reply; 26+ messages in thread
From: Will Deacon @ 2013-05-31  8:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 31, 2013 at 03:56:31AM +0100, Wang, Yalin wrote:
> Hi Will,
> 
> Thanks for your patch.
> 
> However, I found that I don't have the ct_user_exit macro
> in my arch/arm/kernel/entry-common.S.
> 
> My kernel version is 3.4.0.

Well, things have moved on since then (we're approaching 3.10, so you
might consider an upgrade!).

For the purposes of this patch, you can just delete the ct_user_exit line.

Will


* A bug about system call on ARM
  2013-05-31  8:46                     ` Will Deacon
@ 2013-05-31 11:02                       ` Wang, Yalin
  2013-05-31 11:13                         ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-05-31 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

Hi  Will,

I have merged your code,
but with one difference:

+	
+	ct_user_exit
+
+#ifdef CONFIG_ALIGNMENT_TRAP
+	ldr	ip, __cr_alignment
+	ldr	ip, [ip]
+	mcr	p15, 0, ip, c1, c0		@ update control register
+#endif 

+	enable_irq
+	get_thread_info tsk

Is this change OK?


Thanks 



* A bug about system call on ARM
  2013-05-31 11:02                       ` Wang, Yalin
@ 2013-05-31 11:13                         ` Will Deacon
  2013-05-31 11:30                           ` Wang, Yalin
  2013-05-31 16:48                           ` Nicolas Pitre
  0 siblings, 2 replies; 26+ messages in thread
From: Will Deacon @ 2013-05-31 11:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 31, 2013 at 12:02:49PM +0100, Wang, Yalin wrote:
> Hi  Will,
> 
> I have merged your code,
> but with one difference:
> 
> +	
> +	ct_user_exit

I thought you didn't have ct_user_exit? In which case, just delete this
line.

> +#ifdef CONFIG_ALIGNMENT_TRAP
> +	ldr	ip, __cr_alignment
> +	ldr	ip, [ip]
> +	mcr	p15, 0, ip, c1, c0		@ update control register
> +#endif 
> 
> +	enable_irq
> +	get_thread_info tsk

Hard to tell without context. You can take a look at my git tree if you
like (I fixed it up based on Nico's comment):

  https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches

Will


* A bug about system call on ARM
  2013-05-31 11:13                         ` Will Deacon
@ 2013-05-31 11:30                           ` Wang, Yalin
  2013-06-03  5:25                             ` Wang, Yalin
  2013-05-31 16:48                           ` Nicolas Pitre
  1 sibling, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-05-31 11:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

I see.
I will run one more test.

Thanks for the clarification.



* A bug about system call on ARM
  2013-05-31 11:13                         ` Will Deacon
  2013-05-31 11:30                           ` Wang, Yalin
@ 2013-05-31 16:48                           ` Nicolas Pitre
  2013-05-31 16:52                             ` Will Deacon
  1 sibling, 1 reply; 26+ messages in thread
From: Nicolas Pitre @ 2013-05-31 16:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 31 May 2013, Will Deacon wrote:

> Hard to tell without context. You can take a look at my git tree if you
> like (I fixed it up based on Nico's comment):
> 
>   https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches

If you like, you may add "Reviewed-by: Nicolas Pitre <nico@linaro.org>" 
to it.


Nicolas


* A bug about system call on ARM
  2013-05-31 16:48                           ` Nicolas Pitre
@ 2013-05-31 16:52                             ` Will Deacon
  0 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2013-05-31 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 31, 2013 at 05:48:56PM +0100, Nicolas Pitre wrote:
> On Fri, 31 May 2013, Will Deacon wrote:
> 
> > Hard to tell without context. You can take a look at my git tree if you
> > like (I fixed it up based on Nico's comment):
> > 
> >   https://git.kernel.org/cgit/linux/kernel/git/will/linux.git/commit/?h=misc-patches
> 
> If you like, you may add "Reviewed-by: Nicolas Pitre <nico@linaro.org>" 
> to it.

Cheers Nicolas, will do.

Will


* A bug about system call on ARM
  2013-05-31 11:30                           ` Wang, Yalin
@ 2013-06-03  5:25                             ` Wang, Yalin
  2013-06-03  9:54                               ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-06-03  5:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi  Will,

I have a question about this patch.

If user space is in Thumb mode,
the PC should be rewound by 2 bytes,
so the fixup code should be

sub lr, lr, #2


Am I right?


Thanks for your help.



* A bug about system call on ARM
  2013-06-03  5:25                             ` Wang, Yalin
@ 2013-06-03  9:54                               ` Will Deacon
  2013-06-03  9:58                                 ` Wang, Yalin
  2013-06-14  6:53                                 ` Wang, Yalin
  0 siblings, 2 replies; 26+ messages in thread
From: Will Deacon @ 2013-06-03  9:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 03, 2013 at 06:25:26AM +0100, Wang, Yalin wrote:
> Hi  Will,
> 
> I have a question about this patch.
> 
> If user space is in Thumb mode,
> the PC should be rewound by 2 bytes,
> so the fixup code should be
> 
> sub lr, lr, #2
> 
> 
> Am I right?

No, because we don't have OABI-compat support for Thumb applications and
force everything down the EABI path instead.
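
One way to see why a 4-byte rewind is enough: in the code quoted earlier
in the thread, the load of the swi instruction is conditional on the T
bit being clear (the ldreq after "tst r8, #PSR_T_BIT"), so the fixup can
only ever be reached for a caller in ARM state. A small C sketch of that
reasoning, with hypothetical function names and PSR_T_BIT taken as bit 5
of the saved PSR:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PSR_T_BIT 0x00000020u	/* set in the saved PSR for a Thumb-state caller */

/* The swi instruction is read from user memory only for ARM-state callers. */
static bool loads_swi_insn(uint32_t spsr)
{
	return (spsr & PSR_T_BIT) == 0;
}

/*
 * Saved user PC after the fixup has run. The fixup is reachable only when
 * the user load was issued, i.e. from ARM state, where the swi is 4 bytes.
 */
static uint32_t pc_after_fixup(uint32_t spsr, uint32_t lr)
{
	assert(loads_swi_insn(spsr));
	return lr - 4;
}
```

A Thumb caller (T bit set) never issues the user load at all, so there
is no case in which a 2-byte rewind would be needed.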

Did you manage to test the patch?

Will


* A bug about system call on ARM
  2013-06-03  9:54                               ` Will Deacon
@ 2013-06-03  9:58                                 ` Wang, Yalin
  2013-06-04  5:33                                   ` Wang, Yalin
  2013-06-14  6:53                                 ` Wang, Yalin
  1 sibling, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-06-03  9:58 UTC (permalink / raw)
  To: linux-arm-kernel

Hi  Will

Oh, I see.
Thanks for your reply.

Yes, we are testing it,
but we need some time to get the result,
because the stability test takes a while to reproduce
this issue, and the issue does not reproduce 100% of the time.



^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-05-30 11:41                 ` Will Deacon
  2013-05-31  2:56                   ` Wang, Yalin
  2013-05-31  3:54                   ` Nicolas Pitre
@ 2013-06-03 10:18                   ` Russell King - ARM Linux
  2013-06-03 10:27                     ` Will Deacon
  2 siblings, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2013-06-03 10:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, May 30, 2013 at 12:41:12PM +0100, Will Deacon wrote:
> +#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
> +	/*
> +	 * We may have faulted trying to load the SWI instruction due to
> +	 * concurrent page aging on another CPU. In this case, return
> +	 * back to the swi instruction and fault the page back.
> +	 */
> +9001:
> +	sub	lr, lr, #4
> +	str	lr, [sp, #S_PC]
> +	b	ret_fast_syscall
> +#endif

The comment is wrong.  If we get here, it means that the fault from
trying to load the instruction can't be fixed up.  Arguably, that
should result in a SIGSEGV being sent immediately, but we'll get to
that when we then try to re-load the instruction.

What it means is that the page we were trying to execute has been
unmapped beneath us.

BTW, I notice that the kernel oops was never posted to the list, so it's
impossible for other people following this thread to see what the real
problem is...

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-03 10:18                   ` Russell King - ARM Linux
@ 2013-06-03 10:27                     ` Will Deacon
  2013-06-03 10:45                       ` Russell King - ARM Linux
  0 siblings, 1 reply; 26+ messages in thread
From: Will Deacon @ 2013-06-03 10:27 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Russell,

On Mon, Jun 03, 2013 at 11:18:09AM +0100, Russell King - ARM Linux wrote:
> On Thu, May 30, 2013 at 12:41:12PM +0100, Will Deacon wrote:
> > +#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
> > +	/*
> > +	 * We may have faulted trying to load the SWI instruction due to
> > +	 * concurrent page aging on another CPU. In this case, return
> > +	 * back to the swi instruction and fault the page back.
> > +	 */
> > +9001:
> > +	sub	lr, lr, #4
> > +	str	lr, [sp, #S_PC]
> > +	b	ret_fast_syscall
> > +#endif
> 
> The comment is wrong.  If we get here, it means that the fault from
> trying to load the instruction can't be fixed up.  Arguably, that
> should result in a SIGSEGV being sent immediately, but we'll get to
> that when we then try to re-load the instruction.

Why would we kill the application in this case? The reported problem is
where one CPU ages the page containing the swi instruction (mkold =>
clears L_PTE_YOUNG => write 0 to the pte) in between the other CPU executing
the swi and the kernel trying to read the immediate. The VMA is fine.

> What it means is that the page we were trying to execute has been
> unmapped beneath us.

Yes, as a result of the kernel aging it.

> BTW, I notice that the kernel oops was never posted to the list, so it's
> impossible for other people following this thread to see what the real
> problem is...

It was sent as an attachment I think, so I've pasted the log below (you can
see CPU0 unmapping the page from which CPU1 is presumably executing).

Will

--->8

<1>[44330.850628] Unable to handle kernel paging request at virtual address 4020841c
<1>[44330.850750] pgd = c490c000
<1>[44330.850841] [4020841c] *pgd=84451831, *pte=bf05859d, *ppte=00000000
<0>[44330.851055] Internal error: Oops: 17 [#1] PREEMPT SMP ARM
<4>[44330.851146] Modules linked in: hid_sony(O)
<4>[44330.851330] CPU: 1    Tainted: G        W  O  (3.4.0-perf-gf496dca-01162-gcbcc62b #1)
<4>[44330.851421] PC is at vector_swi+0x28/0x88
<4>[44330.851482] LR is at 0x40208420
<4>[44330.851604] pc : [<c000dfe8>]    lr : [<40208420>]    psr: 60000093
<4>[44330.851604] sp : e0601fb0  ip : 40092f78  fp : befcd6fc
<4>[44330.851757] r10: 4005a040  r9 : 4012fa70  r8 : 60000010
<4>[44330.851818] r7 : 00000107  r6 : 00000003  r5 : 4005a030  r4 : 5746d378
<4>[44330.851909] r3 : 5883fbf8  r2 : 00000000  r1 : befcd6d0  r0 : 00000001
<4>[44330.851971] Flags: nZCv  IRQs off  FIQs on  Mode SVC_32  ISA ARM  Segment user
<4>[44330.852093] Control: 10c5787d  Table: 84b0c06a  DAC: 00000015
<4>[44330.852154] 
<4>[44330.852154] PC: 0xc000df68:
<4>[44330.852306] df68  e31100ff 1afffff0 e59d1040 e5bde03c e16ff001 f57ff01f e95d7fff e1a00000
<4>[44330.852825] df88  e28dd00c e1b0f00e eb025aba e1a096ad e1a09689 e5991000 e3a08001 e3110c03
<4>[44330.853405] dfa8  0affffec e1a0100d e3a00001 eb000911 eaffffe8 e320f000 e24dd048 e88d1fff
<4>[44330.853954] dfc8  e28d803c e9486000 e14f8000 e58de03c e58d8040 e58d0044 e3180020 13a0a000
<4>[44330.854534] dfe8  051ea004 e59fc0ac e59cc000 ee01cf10 f1080080 e1a096ad e1a09689 e28f809c
<4>[44330.855053] e008  e3daa4ff 122a7609 159f808c e599a000 e92d0030 e31a0c03 1a000008 e3570f5f
<4>[44330.855602] e028  e24fee13 3798f107 e28d1008 e3a08000 e357080f e2270000 2a000fac ea022fd4
<4>[44330.856152] e048  e1a02007 e28d1008 e3a00000 eb0008e9 e28fe014 e1a07000 e28d1008 e3570f5f
<4>[44330.856732] 
<4>[44330.856732] SP: 0xe0601f30:
<4>[44330.856823] 1f30  00000000 00000000 c0d44f98 00000001 e18fce48 00000002 e565fa80 00000000
<4>[44330.857373] 1f50  c000dfe8 60000093 ffffffff e0601f9c 60000010 c0731518 00000001 befcd6d0
<4>[44330.857891] 1f70  00000000 5883fbf8 5746d378 4005a030 00000003 00000107 60000010 4012fa70
<4>[44330.858471] 1f90  4005a040 befcd6fc 40092f78 e0601fb0 40208420 c000dfe8 60000093 ffffffff
<4>[44330.859021] 1fb0  00000001 befcd6d0 00000000 5883fbf8 5746d378 4005a030 00000003 00000107
<4>[44330.859570] 1fd0  befcd6e8 4012fa70 4005a040 befcd6fc 40092f78 befcd6c8 4008aef9 40208420
<4>[44330.860089] 1ff0  60000010 00000001 ffe6e6e6 ffe6e6e6 00000000 00000000 00000000 00000000
<4>[44330.860669] 2010  00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<0>[44330.861218] Process ndroid.settings (pid: 25518, stack limit = 0xe06002f0)
<0>[44330.861279] Stack: (0xe0601fb0 to 0xe0602000)
<0>[44330.861401] 1fa0:                                     00000001 befcd6d0 00000000 5883fbf8
<0>[44330.861462] 1fc0: 5746d378 4005a030 00000003 00000107 befcd6e8 4012fa70 4005a040 befcd6fc
<0>[44330.861584] 1fe0: 40092f78 befcd6c8 4008aef9 40208420 60000010 00000001 ffe6e6e6 ffe6e6e6
<0>[44330.861707] Code: e58d8040 e58d0044 e3180020 13a0a000 (051ea004) 
<4>[44330.861890] ---[ end trace da227214a82491c0 ]---
<0>[44330.862012] Kernel panic - not syncing: Fatal exception
<2>[44330.862073] CPU0: stopping
<4>[44330.862195] [<c0014df8>] (unwind_backtrace+0x0/0x11c) from [<c001339c>] (handle_IPI+0x110/0x224)
<4>[44330.862256] [<c001339c>] (handle_IPI+0x110/0x224) from [<c000868c>] (gic_handle_irq+0x104/0x110)
<4>[44330.862378] [<c000868c>] (gic_handle_irq+0x104/0x110) from [<c0731580>] (__irq_svc+0x40/0x70)
<4>[44330.862470] Exception stack(0xc7439b10 to 0xc7439b58)
<4>[44330.862531] 9b00:                                     00000001 bf05f000 00000004 1a55b000
<4>[44330.862653] 9b20: bf05f59d d2b4ac08 000bf05f ca5ca03c 00000000 c7439c24 4020f000 00000001
<4>[44330.862744] 9b40: c0f2ab50 c7439b58 c0123038 c0123040 00000113 ffffffff
<4>[44330.862836] [<c0731580>] (__irq_svc+0x40/0x70) from [<c0123040>] (memblock_is_memory+0x18/0x20)
<4>[44330.862958] [<c0123040>] (memblock_is_memory+0x18/0x20) from [<c00198b4>] (__sync_icache_dcache+0x40/0x9c)
<4>[44330.863049] [<c00198b4>] (__sync_icache_dcache+0x40/0x9c) from [<c0121a44>] (ptep_clear_flush_young+0x3c/0x60)
<4>[44330.863171] [<c0121a44>] (ptep_clear_flush_young+0x3c/0x60) from [<c011d3b0>] (page_referenced_one+0x6c/0xfc)
<4>[44330.863294] [<c011d3b0>] (page_referenced_one+0x6c/0xfc) from [<c011e9e4>] (page_referenced+0x1a8/0x200)
<4>[44330.863416] [<c011e9e4>] (page_referenced+0x1a8/0x200) from [<c010522c>] (shrink_active_list.isra.49+0x1ec/0x2e8)
<4>[44330.863477] [<c010522c>] (shrink_active_list.isra.49+0x1ec/0x2e8) from [<c0106554>] (shrink_mem_cgroup_zone+0x338/0x4c4)
<4>[44330.863599] [<c0106554>] (shrink_mem_cgroup_zone+0x338/0x4c4) from [<c010740c>] (try_to_free_pages+0x2a0/0x570)
<4>[44330.863690] [<c010740c>] (try_to_free_pages+0x2a0/0x570) from [<c00fbff0>] (__alloc_pages_nodemask+0x424/0x758)
<4>[44330.863812] [<c00fbff0>] (__alloc_pages_nodemask+0x424/0x758) from [<c0116dcc>] (handle_pte_fault+0x184/0x7c8)
<4>[44330.863934] [<c0116dcc>] (handle_pte_fault+0x184/0x7c8) from [<c011751c>] (handle_mm_fault+0x10c/0x128)
<4>[44330.864057] [<c011751c>] (handle_mm_fault+0x10c/0x128) from [<c0732cc0>] (do_page_fault+0x180/0x3c0)
<4>[44330.864148] [<c0732cc0>] (do_page_fault+0x180/0x3c0) from [<c000847c>] (do_DataAbort+0x134/0x1a8)
<4>[44330.864270] [<c000847c>] (do_DataAbort+0x134/0x1a8) from [<c07316f4>] (__dabt_usr+0x34/0x40)
<4>[44330.864331] Exception stack(0xc7439fb0 to 0xc7439ff8)
<4>[44330.864453] 9fa0:                                     40baa000 40fab040 0002d440 00000000
<4>[44330.864545] 9fc0: 41ed0f10 00000001 408d72c0 41ed0f2c 0000001c 00000000 40fab400 00000001
<4>[44330.864606] 9fe0: 00000380 59c70e10 00000400 40209364 20000010 ffffffff
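
For readers cross-checking the log, the following C sketch (an illustration, not kernel code) tests two things using values copied from the oops: that the faulting address is exactly LR - 4, i.e. the location of the swi instruction, and that the Linux PTE is present but has its young bit clear while the hardware PTE is zero. The bit positions for L_PTE_PRESENT and L_PTE_YOUNG are an assumption based on the classic 2-level ARM page-table layout; the helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the oops above. */
#define OOPS_FAULT_ADDR 0x4020841cu /* "virtual address 4020841c" */
#define OOPS_LR         0x40208420u /* "LR is at 0x40208420" */
#define OOPS_LINUX_PTE  0xbf05859du /* "*pte=bf05859d" */
#define OOPS_HW_PTE     0x00000000u /* "*ppte=00000000" */

/* Assumed Linux PTE bits (2-level ARM page tables). */
#define L_PTE_PRESENT (1u << 0)
#define L_PTE_YOUNG   (1u << 1)

/* True when the fault hit the 4-byte ARM instruction just before LR. */
static bool fault_is_swi_fetch(uint32_t fault_addr, uint32_t lr)
{
    return fault_addr == lr - 4;
}

/* True when the page is still mapped by Linux but has been aged. */
static bool pte_present_but_old(uint32_t linux_pte)
{
    return (linux_pte & L_PTE_PRESENT) && !(linux_pte & L_PTE_YOUNG);
}
```

With the values above, both helpers return true, which is consistent with the page having been aged (young bit cleared, hardware entry zeroed) rather than removed from the address space.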

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-03 10:27                     ` Will Deacon
@ 2013-06-03 10:45                       ` Russell King - ARM Linux
  2013-06-03 12:39                         ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Russell King - ARM Linux @ 2013-06-03 10:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 03, 2013 at 11:27:23AM +0100, Will Deacon wrote:
> Hi Russell,
> 
> On Mon, Jun 03, 2013 at 11:18:09AM +0100, Russell King - ARM Linux wrote:
> > On Thu, May 30, 2013 at 12:41:12PM +0100, Will Deacon wrote:
> > > +#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
> > > +	/*
> > > +	 * We may have faulted trying to load the SWI instruction due to
> > > +	 * concurrent page aging on another CPU. In this case, return
> > > +	 * back to the swi instruction and fault the page back.
> > > +	 */
> > > +9001:
> > > +	sub	lr, lr, #4
> > > +	str	lr, [sp, #S_PC]
> > > +	b	ret_fast_syscall
> > > +#endif
> > 
> > The comment is wrong.  If we get here, it means that the fault from
> > > trying to load the instruction can't be fixed up.  Arguably, that
> > should result in a SIGSEGV being sent immediately, but we'll get to
> > that when we then try to re-load the instruction.
> 
> Why would we kill the application in this case? The reported problem is
> where one CPU ages the page containing the swi instruction (mkold =>
> clears L_PTE_YOUNG => write 0 to the pte) in between the other CPU executing
> the swi and the kernel trying to read the immediate. The VMA is fine.

If you mark the instruction as a user-accessing instruction, the kernel
will handle the resulting exception, trying to make the page accessible.
If it is successful, then execution resumes as normal at the faulting
instruction and continues as if nothing happened.

If it can't make the page accessible (eg, out of memory) the exception
handler path (your code above) will be called instead.  Normal action in
that case would be for a system call to return -EFAULT, but in this case
we can't know what the syscall was, so we don't know if userspace will
even pay attention to the returned error code.  In any case, if the page
is no longer accessible, it's going to end up being killed by a SEGV
when we eventually return to userspace anyway.

> > What it means is that the page we were trying to execute has been
> > unmapped beneath us.
> 
> Yes, as a result of the kernel aging it.

No - see above.  The exception path is for more serious conditions than
that.
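
The two outcomes described above can be sketched as a small C model (a simplification, not the kernel's implementation). An instruction marked as user-accessing gets an entry in an exception table; if the fault is resolved, the same instruction is retried, and only if it cannot be resolved does execution continue at the registered fixup. The table contents, the fixup address and all names here are hypothetical; only the faulting PC (c000dfe8) is taken from the posted oops.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry: "if the instruction at insn faults fatally, continue at fixup". */
struct ex_entry {
    uint32_t insn;
    uint32_t fixup;
};

/* Hypothetical table; the fixup address is made up for illustration. */
static const struct ex_entry ex_table[] = {
    { 0xc000dfe8u, 0xc000e100u }, /* user load in vector_swi -> fixup label */
};

enum fault_result { FAULT_RESOLVED, FAULT_UNRESOLVED };

/* Linear search for a fixup; returns 0 when the instruction has none. */
static uint32_t find_fixup(uint32_t pc)
{
    for (size_t i = 0; i < sizeof(ex_table) / sizeof(ex_table[0]); i++) {
        if (ex_table[i].insn == pc)
            return ex_table[i].fixup;
    }
    return 0;
}

/* Returns the PC at which execution continues, or 0 to model an oops. */
static uint32_t next_pc_after_fault(uint32_t pc, enum fault_result result)
{
    if (result == FAULT_RESOLVED)
        return pc;              /* page made accessible: retry the load */
    return find_fixup(pc);      /* otherwise take the fixup, if any */
}
```

In this model the aging case from the report never reaches the fixup: the fault is resolved and the load is simply retried. The fixup is taken only when the page cannot be made accessible, and an instruction with no table entry at all ends in the oops path.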

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-03 10:45                       ` Russell King - ARM Linux
@ 2013-06-03 12:39                         ` Will Deacon
  0 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2013-06-03 12:39 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Jun 03, 2013 at 11:45:34AM +0100, Russell King - ARM Linux wrote:
> On Mon, Jun 03, 2013 at 11:27:23AM +0100, Will Deacon wrote:
> > On Mon, Jun 03, 2013 at 11:18:09AM +0100, Russell King - ARM Linux wrote:
> > > On Thu, May 30, 2013 at 12:41:12PM +0100, Will Deacon wrote:
> > > > +#if defined(CONFIG_OABI_COMPAT) || !defined(CONFIG_AEABI)
> > > > +	/*
> > > > +	 * We may have faulted trying to load the SWI instruction due to
> > > > +	 * concurrent page aging on another CPU. In this case, return
> > > > +	 * back to the swi instruction and fault the page back.
> > > > +	 */
> > > > +9001:
> > > > +	sub	lr, lr, #4
> > > > +	str	lr, [sp, #S_PC]
> > > > +	b	ret_fast_syscall
> > > > +#endif
> > > 
> > > The comment is wrong.  If we get here, it means that the fault from
> > > > trying to load the instruction can't be fixed up.  Arguably, that
> > > should result in a SIGSEGV being sent immediately, but we'll get to
> > > that when we then try to re-load the instruction.
> > 
> > Why would we kill the application in this case? The reported problem is
> > where one CPU ages the page containing the swi instruction (mkold =>
> > clears L_PTE_YOUNG => write 0 to the pte) in between the other CPU executing
> > the swi and the kernel trying to read the immediate. The VMA is fine.
> 
> If you mark the instruction as a user-accessing instruction, the kernel
> will handle the resulting exception, trying to make the page accessible.
> If it is successful, then execution resumes as normal at the faulting
> instruction and continues as if nothing happened.
> 
> If it can't make the page accessible (eg, out of memory) the exception
> handler path (your code above) will be called instead.  Normal action in
> that case would be for a system call to return -EFAULT, but in this case
> we can't know what the syscall was, so we don't know if userspace will
> even pay attention to the returned error code.  In any case, if the page
> is no longer accessible, it's going to end up being killed by a SEGV
> when we eventually return to userspace anyway.

Yes, of course, the fault handling will sort out non-fatal faults for us, so
I'll update the comment.

Thanks,

Will

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-03  9:58                                 ` Wang, Yalin
@ 2013-06-04  5:33                                   ` Wang, Yalin
  2013-06-04  8:48                                     ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-06-04  5:33 UTC (permalink / raw)
  To: linux-arm-kernel

Hi  Will,

May I ask what your git branch is mainly used for?

https://git.kernel.org/cgit/linux/kernel/git/will/linux.git


I mean, is the branch used for ARM arch maintenance?
If yes, I think I can send future ARM bug reports to you directly,
and skip the ping-pong on the mailing list.

Thanks for your help .


^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-04  5:33                                   ` Wang, Yalin
@ 2013-06-04  8:48                                     ` Will Deacon
  2013-06-04  9:30                                       ` Wang, Yalin
  0 siblings, 1 reply; 26+ messages in thread
From: Will Deacon @ 2013-06-04  8:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 04, 2013 at 06:33:20AM +0100, Wang, Yalin wrote:
> Hi  Will,

Hello,

> May I ask what your git branch is mainly used for?
> 
> https://git.kernel.org/cgit/linux/kernel/git/will/linux.git
> 
> 
> I mean, is the branch used for ARM arch maintenance?

I send most of my patches via Russell, and the tree above is where I keep my
patches whilst they're not yet in mainline.

> If yes, I think I can send future ARM bug reports to you directly,
> and skip the ping-pong on the mailing list.

Quite the opposite! Having discussions on the mailing list is key to how the
kernel is developed, so please continue to send questions, bug reports and
patches there. As has been pointed out previously in this thread, you need
to choose the right list in the first place, rather than just sending to
LKML.

Of course, you can always CC me on arm/arm64 patches if you like.

Cheers,

Will

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-04  8:48                                     ` Will Deacon
@ 2013-06-04  9:30                                       ` Wang, Yalin
  2013-06-04 11:27                                         ` Will Deacon
  0 siblings, 1 reply; 26+ messages in thread
From: Wang, Yalin @ 2013-06-04  9:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

Thanks for your reply.
I see what you mean,
but it seems my request to join 'linux-arm-kernel at lists.infradead.org'
has not been approved.
How do I join this mailing list?

Thanks for your help .


^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-04  9:30                                       ` Wang, Yalin
@ 2013-06-04 11:27                                         ` Will Deacon
  0 siblings, 0 replies; 26+ messages in thread
From: Will Deacon @ 2013-06-04 11:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Jun 04, 2013 at 10:30:04AM +0100, Wang, Yalin wrote:
> Hi Will,
> 
> Thanks for your reply.
> I see what you mean,
> but it seems my request to join 'linux-arm-kernel at lists.infradead.org'
> has not been approved.
> How do I join this mailing list?

There's a mailman frontend here:

  http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

Will

^ permalink raw reply	[flat|nested] 26+ messages in thread

* A bug about system call on ARM
  2013-06-03  9:54                               ` Will Deacon
  2013-06-03  9:58                                 ` Wang, Yalin
@ 2013-06-14  6:53                                 ` Wang, Yalin
  1 sibling, 0 replies; 26+ messages in thread
From: Wang, Yalin @ 2013-06-14  6:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

We have tested the patch,
and it looks OK in the stability test.

We have merged it into our main branch.

Thanks for your patch!


^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2013-06-14  6:53 UTC | newest]

Thread overview: 26+ messages
     [not found] <35FD53F367049845BC99AC72306C23D1610991B85B@CNBJMBX05.corpusers.net>
     [not found] ` <CAFLxGvy39xWdZmtiVHP+y=zH1coCVmMuREcmD25wSb=w-VK7Xg@mail.gmail.com>
     [not found]   ` <35FD53F367049845BC99AC72306C23D1610991B85D@CNBJMBX05.corpusers.net>
     [not found]     ` <35FD53F367049845BC99AC72306C23D1610991B85E@CNBJMBX05.corpusers.net>
2013-05-29  8:46       ` A bug about system call on ARM richard -rw- weinberger
2013-05-29  9:48         ` Will Deacon
     [not found]           ` <35FD53F367049845BC99AC72306C23D1610991B865@CNBJMBX05.corpusers.net>
2013-05-30  1:41             ` Wang, Yalin
2013-05-30  9:09               ` Will Deacon
2013-05-30 11:41                 ` Will Deacon
2013-05-31  2:56                   ` Wang, Yalin
2013-05-31  8:46                     ` Will Deacon
2013-05-31 11:02                       ` Wang, Yalin
2013-05-31 11:13                         ` Will Deacon
2013-05-31 11:30                           ` Wang, Yalin
2013-06-03  5:25                             ` Wang, Yalin
2013-06-03  9:54                               ` Will Deacon
2013-06-03  9:58                                 ` Wang, Yalin
2013-06-04  5:33                                   ` Wang, Yalin
2013-06-04  8:48                                     ` Will Deacon
2013-06-04  9:30                                       ` Wang, Yalin
2013-06-04 11:27                                         ` Will Deacon
2013-06-14  6:53                                 ` Wang, Yalin
2013-05-31 16:48                           ` Nicolas Pitre
2013-05-31 16:52                             ` Will Deacon
2013-05-31  3:54                   ` Nicolas Pitre
2013-05-31  8:45                     ` Will Deacon
2013-06-03 10:18                   ` Russell King - ARM Linux
2013-06-03 10:27                     ` Will Deacon
2013-06-03 10:45                       ` Russell King - ARM Linux
2013-06-03 12:39                         ` Will Deacon
