public inbox for linux-kernel@vger.kernel.org
* Query: Crash is coming during /proc/PID/stat and do_exit of same task
@ 2018-01-09 13:33 Kohli, Gaurav
  2018-01-15 10:04 ` Kohli, Gaurav
  2018-01-15 11:02 ` John Ogness
  0 siblings, 2 replies; 8+ messages in thread
From: Kohli, Gaurav @ 2018-01-09 13:33 UTC (permalink / raw)
  To: peterz, john.ogness, mingo; +Cc: linux-kernel, linux-arm-msm

Hi,

We are seeing a crash in do_task_stat() while it accesses the stack
pointer. The same task appears to have already completed its do_exit()
call, so this looks like a race between the two.

Below is the crash trace:
[49750.534377] Kernel BUG at ffffff8e7a4c53a8 [verbose debug info unavailable]
[49750.534394] task: ffffffe7b4475580 task.stack: ffffffe7a5f0c000
[49750.534400] PC is at do_task_stat+0x740/0x908
[49750.534402] LR is at do_task_stat+0xa4/0x908
[49750.534403] pc : [<ffffff8e7a4c53a8>] lr : [<ffffff8e7a4c4d0c>] pstate: 80400145
[49750.534404] sp : ffffffe7a5f0fbd0

Here is the stack trace on that core:

-000|user_stack_pointer(inline)
-000|do_task_stat(
     |    m = 0xFFFFFFE7A5CD7380,
     |    ns = 0xFFFFFF8E7C43C748,
     |  ?,
     |    task = 0xFFFFFFE80D8C2280,
     |  ?)
     |  tty_pgrp = 0
     |  ppid = 2084696064
     |  sid = 0
     |  mm = 0xFFFFFFE7B4424140
     |  tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
     |  flags = 18446743969119403392
-001|proc_tgid_stat(
     |    m = 0xFFFFFFE7A5CD7380,
     |  ?,

The task_struct fields below show that the process has completed its
do_exit() call:
crash_64> struct task_struct.flags -x 0xFFFFFFE80D8C2280
   flags = 0x40870c

crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
   exit_code = 0x6

crash_64> struct task_struct.state -x 0xFFFFFFE80D8C2280
   state = 0x40

Both patches are present in our build:
fs/proc: report eip/esp in /proc/PID/stat for coredumping

Also, task.flags already has PF_DUMPCORE set, as the task received
SIGABRT.
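
For readers following along, the flags value quoted in this message
(0x40870c) can be decoded with a small user-space helper. This is only a
sketch: the four bit values below are the exit-related PF_* definitions as
found in include/linux/sched.h of 4.9-era kernels, and the names
pf_decode, pf_names and struct pf_name are hypothetical. Verify the values
against the tree actually in use, since vendor kernels may differ.

```c
#include <stddef.h>
#include <string.h>

/* Exit-related per-task flag bits (assumed values, taken from
 * include/linux/sched.h of 4.9-era kernels; check your own tree). */
struct pf_name {
	unsigned long bit;
	const char *name;
};

static const struct pf_name pf_names[] = {
	{ 0x00000004UL, "PF_EXITING" },    /* task is being torn down */
	{ 0x00000008UL, "PF_EXITPIDONE" }, /* PI exit handling finished */
	{ 0x00000200UL, "PF_DUMPCORE" },   /* task dumped core */
	{ 0x00000400UL, "PF_SIGNALED" },   /* task was killed by a signal */
};

/*
 * Write the names of all known bits set in 'flags' into 'buf' (which must
 * hold at least one byte), separated by '|'. Returns the bits that have no
 * entry in the table above.
 */
static unsigned long pf_decode(unsigned long flags, char *buf, size_t len)
{
	size_t i;

	buf[0] = '\0';
	for (i = 0; i < sizeof(pf_names) / sizeof(pf_names[0]); i++) {
		if (!(flags & pf_names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, "|", len - strlen(buf) - 1);
		strncat(buf, pf_names[i].name, len - strlen(buf) - 1);
		flags &= ~pf_names[i].bit;
	}
	return flags;
}
```

Under these assumptions, calling pf_decode(0x40870c, buf, sizeof(buf))
reports PF_EXITING, PF_EXITPIDONE, PF_DUMPCORE and PF_SIGNALED, and returns
the remaining bits that the table does not cover.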

Regards
Gaurav


-- Qualcomm India Private Limited, on behalf of Qualcomm Innovation 
Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation 
Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
@ 2018-01-10  5:20 Alexey Dobriyan
  2018-01-16  5:36 ` Kohli, Gaurav
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Dobriyan @ 2018-01-10  5:20 UTC (permalink / raw)
  To: gkohli; +Cc: linux-kernel

> We are seeing crash in do_task_stat while accessing stack pointer, It
> seems same task has already completed do_exit call.
> So it seems a race between them:

Please post the exact kernel version and struct task_struct::usage if you
still have that kernel core (or even the full task_struct).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-09 13:33 Kohli, Gaurav
@ 2018-01-15 10:04 ` Kohli, Gaurav
  2018-01-15 11:02 ` John Ogness
  1 sibling, 0 replies; 8+ messages in thread
From: Kohli, Gaurav @ 2018-01-15 10:04 UTC (permalink / raw)
  To: peterz, john.ogness, mingo; +Cc: linux-kernel, linux-arm-msm

Hi John, Ingo,

We are still seeing the race between do_task_stat() and do_exit() of the
task. Could we add a stricter check to the code below, to cover the case
where the stack pointer is NULL?

                 if (permitted && (task->flags & PF_DUMPCORE)) {
                         eip = KSTK_EIP(task);
                         esp = KSTK_ESP(task);
                 }

Regards
Gaurav



-- Qualcomm India Private Limited, on behalf of Qualcomm Innovation 
Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation 
Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-09 13:33 Kohli, Gaurav
  2018-01-15 10:04 ` Kohli, Gaurav
@ 2018-01-15 11:02 ` John Ogness
  2018-01-15 12:30   ` Kohli, Gaurav
  1 sibling, 1 reply; 8+ messages in thread
From: John Ogness @ 2018-01-15 11:02 UTC (permalink / raw)
  To: Kohli, Gaurav; +Cc: peterz, mingo, linux-kernel, linux-arm-msm

Hello Gaurav.

On 2018-01-09, Kohli, Gaurav <gkohli@codeaurora.org> wrote:
> We are seeing crash in do_task_stat while accessing stack pointer, It
> seems same task has already completed do_exit call.
> So it seems a race between them:
>
> Below is the crash trace:
> [49750.534377] Kernel BUG at ffffff8e7a4c53a8 [verbose debug info unavailable]
> [49750.534394] task: ffffffe7b4475580 task.stack: ffffffe7a5f0c000
> [49750.534400] PC is at do_task_stat+0x740/0x908
> [49750.534402] LR is at do_task_stat+0xa4/0x908
> [49750.534403] pc : [<ffffff8e7a4c53a8>] lr : [<ffffff8e7a4c4d0c>] pstate: 80400145
> [49750.534404] sp : ffffffe7a5f0fbd0
>
> and here is stack trace on that core:
>
> -000|user_stack_pointer(inline)
> -000|do_task_stat(
>     |    m = 0xFFFFFFE7A5CD7380,
>     |    ns = 0xFFFFFF8E7C43C748,
>     |  ?,
>     |    task = 0xFFFFFFE80D8C2280,
>     |  ?)
>     |  tty_pgrp = 0
>     |  ppid = 2084696064
>     |  sid = 0
>     |  mm = 0xFFFFFFE7B4424140
>     |  tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
>     |  flags = 18446743969119403392
> -001|proc_tgid_stat(
>     |    m = 0xFFFFFFE7A5CD7380,
>     |  ?,
>
> Below are task stats which shows , process completed the do_exit call:
> struct task_struct.flags -x 0xFFFFFFE80D8C2280
>   flags = 0x40870c
>
> crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
>   exit_code = 0x6
>
>    struct task_struct.state -x 0xFFFFFFE80D8C2280
>   state = 0x40

I am confused why this task is in the TASK_PARKED state. What kind of
task is this?

> In our build both patches are there ,
> fs/proc: report eip/esp in /proc/PID/stat for coredumping
>
> and also  task.flags has already set PF_DUMPCORE as it got the sigabrt
> signal.

John Ogness

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-15 11:02 ` John Ogness
@ 2018-01-15 12:30   ` Kohli, Gaurav
  0 siblings, 0 replies; 8+ messages in thread
From: Kohli, Gaurav @ 2018-01-15 12:30 UTC (permalink / raw)
  To: John Ogness; +Cc: peterz, mingo, linux-kernel, linux-arm-msm

On 1/15/2018 4:32 PM, John Ogness wrote:

> Hello Gaurav.
>
> On 2018-01-09, Kohli, Gaurav <gkohli@codeaurora.org> wrote:
>> We are seeing crash in do_task_stat while accessing stack pointer, It
>> seems same task has already completed do_exit call.
>> So it seems a race between them:
>>
>> Below is the crash trace:
>> [49750.534377] Kernel BUG at ffffff8e7a4c53a8 [verbose debug info unavailable]
>> [49750.534394] task: ffffffe7b4475580 task.stack: ffffffe7a5f0c000
>> [49750.534400] PC is at do_task_stat+0x740/0x908
>> [49750.534402] LR is at do_task_stat+0xa4/0x908
>> [49750.534403] pc : [<ffffff8e7a4c53a8>] lr : [<ffffff8e7a4c4d0c>] pstate: 80400145
>> [49750.534404] sp : ffffffe7a5f0fbd0
>>
>> and here is stack trace on that core:
>>
>> -000|user_stack_pointer(inline)
>> -000|do_task_stat(
>>      |    m = 0xFFFFFFE7A5CD7380,
>>      |    ns = 0xFFFFFF8E7C43C748,
>>      |  ?,
>>      |    task = 0xFFFFFFE80D8C2280,
>>      |  ?)
>>      |  tty_pgrp = 0
>>      |  ppid = 2084696064
>>      |  sid = 0
>>      |  mm = 0xFFFFFFE7B4424140
>>      |  tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
>>      |  flags = 18446743969119403392
>> -001|proc_tgid_stat(
>>      |    m = 0xFFFFFFE7A5CD7380,
>>      |  ?,
>>
>> Below are task stats which shows , process completed the do_exit call:
>> struct task_struct.flags -x 0xFFFFFFE80D8C2280
>>    flags = 0x40870c
>>
>> crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
>>    exit_code = 0x6
>>
>>     struct task_struct.state -x 0xFFFFFFE80D8C2280
>>    state = 0x40
> I am confused why this task is in the TASK_PARKED state. What kind of
> task is this?

Hi John,

This is an Android HAL-layer service. Before the BUG, I also see a lot of
services exiting in the logs, although not this PID (6807):

.452202:   <2> init: starting service 'limits-hal-1-0'...

  49749.460039:   <2> init: property_set("ro.boottime.limits-hal-1-0", "61591320967789") failed: property already set

  49749.607496:   <6> sh (2422): drop_caches: 3

  49750.281635:   <6> sh (2422): drop_caches: 3

  49750.533853:   <2> init: Untracked pid 6811 exited with status 0

It is not clear why it is parked, as the task's state has already been updated.
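
One possible source of the confusion, offered as a hedged note rather than
a conclusion: the numeric value 0x40 in task->state does not carry the same
name in every kernel version. The sketch below encodes two numberings as
recalled from include/linux/sched.h, one for kernels before 4.14 (such as
4.9.y) and one for 4.14 and later. Both tables are assumptions to check
against the tree in question, and the names task_state_name, state_layout
and state_name are hypothetical.

```c
#include <stddef.h>

/* Which numbering of task->state bits to assume. */
enum state_layout {
	LAYOUT_PRE_4_14,	/* assumed to cover 4.9.y */
	LAYOUT_4_14_ON		/* assumed to cover 4.14 and later */
};

struct state_name {
	unsigned long bit;
	const char *name;
};

/* Assumed values, written from memory of include/linux/sched.h. */
static const struct state_name pre_4_14[] = {
	{ 0x0040UL, "TASK_DEAD" },
	{ 0x0200UL, "TASK_PARKED" },
};

static const struct state_name from_4_14[] = {
	{ 0x0040UL, "TASK_PARKED" },
	{ 0x0080UL, "TASK_DEAD" },
};

#define STATE_TABLE_LEN 2

/* Map a single state bit to its name under the chosen numbering. */
static const char *task_state_name(enum state_layout layout,
				   unsigned long state)
{
	const struct state_name *tbl =
		(layout == LAYOUT_PRE_4_14) ? pre_4_14 : from_4_14;
	size_t i;

	for (i = 0; i < STATE_TABLE_LEN; i++) {
		if (tbl[i].bit == state)
			return tbl[i].name;
	}
	return "unknown";
}
```

If these tables are right, state = 0x40 on a 4.9.65 kernel would read as
TASK_DEAD rather than TASK_PARKED, which would be consistent with a task
that has finished do_exit().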

Regards

Gaurav

>
>> In our build both patches are there ,
>> fs/proc: report eip/esp in /proc/PID/stat for coredumping
>>
>> and also  task.flags has already set PF_DUMPCORE as it got the sigabrt
>> signal.
> John Ogness
>
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-10  5:20 Query: Crash is coming during /proc/PID/stat and do_exit of same task Alexey Dobriyan
@ 2018-01-16  5:36 ` Kohli, Gaurav
  2018-01-16  7:20   ` Alexey Dobriyan
  0 siblings, 1 reply; 8+ messages in thread
From: Kohli, Gaurav @ 2018-01-16  5:36 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, linux-arm-msm

On 1/10/2018 10:50 AM, Alexey Dobriyan wrote:

>> We are seeing crash in do_task_stat while accessing stack pointer, It
>> seems same task has already completed do_exit call.
>> So it seems a race between them:
> Please, post exact kernel version and struct task_struct::usage if you
> still have that kernel core (or even full task_struct)

Hi Alexey,

We are working on 4.9.65. Please find the usage value and other task_struct
values below, and let me know if any other data is required.

crash_64> struct task_struct.usage -x  0xFFFFFFE80D8C2280
   usage = {
     counter = 0x4
   }

crash_64> struct task_struct.flags -x 0xFFFFFFE80D8C2280
   flags = 0x40870c

crash_64> struct task_struct.exit_code -x 0xFFFFFFE80D8C2280
   exit_code = 0x6

crash_64> struct task_struct.state -x 0xFFFFFFE80D8C2280
   state = 0x40
  

Please find the crash stack below:

-000|user_stack_pointer(inline)
-000|do_task_stat(
     |    m = 0xFFFFFFE7A5CD7380,
     |    ns = 0xFFFFFF8E7C43C748,
     |  ?,
     |    task = 0xFFFFFFE80D8C2280,
     |  ?)
     |  tty_pgrp = 0
     |  ppid = 2084696064
     |  sid = 0
     |  mm = 0xFFFFFFE7B4424140
     |  tcomm = (84, 9, 71, 122, 142, 255, 255, 255, 48, 253, 240, 165, 231, 255, 255, 255)
     |  flags = 18446743969119403392
-001|proc_tgid_stat(
     |    m = 0xFFFFFFE7A5CD7380,
     |  ?,
     |  ?,
     |  ?)
-002|atomic_sub_return(inline)

Regards
Gaurav

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-16  5:36 ` Kohli, Gaurav
@ 2018-01-16  7:20   ` Alexey Dobriyan
  2018-01-16  9:44     ` Kohli, Gaurav
  0 siblings, 1 reply; 8+ messages in thread
From: Alexey Dobriyan @ 2018-01-16  7:20 UTC (permalink / raw)
  To: Kohli, Gaurav; +Cc: linux-kernel, linux-arm-msm

On Tue, Jan 16, 2018 at 11:06:47AM +0530, Kohli, Gaurav wrote:
> On 1/10/2018 10:50 AM, Alexey Dobriyan wrote:
> 
> >> We are seeing crash in do_task_stat while accessing stack pointer, It
> >> seems same task has already completed do_exit call.
> >> So it seems a race between them:
> > Please, post exact kernel version and struct task_struct::usage if you
> > still have that kernel core (or even full task_struct)
> 
> Hi Alexey,
> 
> We are working on 4.9.65 and Please find below usage value and other task_struct value,
> please let me know if some other data required as well.

Kernel stacks live their own lives nowadays, the code needs try_get_task_stack().
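
To illustrate the point for readers, here is a rough user-space model of
the pattern, not kernel code. It assumes the task stack has its own
reference count, separate from the task_struct usage counter, so a reader
has to pin the stack with an increment-if-not-zero step before touching it
and drop the reference afterwards. All model_* names are hypothetical
stand-ins; only try_get_task_stack() and put_task_stack() are the real
kernel helpers being imitated.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a task whose stack is refcounted separately. */
struct model_task {
	atomic_int stack_refcount;	/* 0 once the last reference is gone */
	unsigned long *stack;		/* freed when the count reaches 0 */
};

static struct model_task *model_task_new(unsigned long saved_sp)
{
	struct model_task *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->stack = malloc(sizeof(*t->stack));
	if (!t->stack) {
		free(t);
		return NULL;
	}
	t->stack[0] = saved_sp;			/* pretend saved register */
	atomic_init(&t->stack_refcount, 1);	/* reference held by the task */
	return t;
}

/* Take a reference only if the count is still non-zero. */
static unsigned long *model_try_get_stack(struct model_task *t)
{
	int old = atomic_load(&t->stack_refcount);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&t->stack_refcount,
						 &old, old + 1))
			return t->stack;
	}
	return NULL;	/* stack already released by the exit path */
}

/* Drop a reference; the last one frees the stack. */
static void model_put_stack(struct model_task *t)
{
	if (atomic_fetch_sub(&t->stack_refcount, 1) == 1) {
		free(t->stack);
		t->stack = NULL;
	}
}

/* Reader side: only dereference the stack while it is pinned. */
static bool model_read_sp(struct model_task *t, unsigned long *sp)
{
	unsigned long *stack = model_try_get_stack(t);

	if (!stack)
		return false;
	*sp = stack[0];
	model_put_stack(t);
	return true;
}
```

In this model, a read that happens after the exit path has called
model_put_stack() for the last time fails cleanly instead of touching
freed memory, which is the behaviour the reader in do_task_stat() would
want.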

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Query: Crash is coming during /proc/PID/stat and do_exit of same task
  2018-01-16  7:20   ` Alexey Dobriyan
@ 2018-01-16  9:44     ` Kohli, Gaurav
  0 siblings, 0 replies; 8+ messages in thread
From: Kohli, Gaurav @ 2018-01-16  9:44 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: linux-kernel, linux-arm-msm

On 1/16/2018 12:50 PM, Alexey Dobriyan wrote:

> On Tue, Jan 16, 2018 at 11:06:47AM +0530, Kohli, Gaurav wrote:
>> On 1/10/2018 10:50 AM, Alexey Dobriyan wrote:
>>
>>>> We are seeing crash in do_task_stat while accessing stack pointer, It
>>>> seems same task has already completed do_exit call.
>>>> So it seems a race between them:
>>> Please, post exact kernel version and struct task_struct::usage if you
>>> still have that kernel core (or even full task_struct)
>> Hi Alexey,
>>
>> We are working on 4.9.65 and Please find below usage value and other task_struct value,
>> please let me know if some other data required as well.
> Kernel stacks live their own lives nowadays, the code needs try_get_task_stack().
>
Hi Alexey,

Yes, agreed. We need to add a check like the one below:

                 if (permitted && (task->flags & PF_DUMPCORE)) {
                         /* Pin the stack; drop the reference once done. */
                         if (try_get_task_stack(task)) {
                                 eip = KSTK_EIP(task);
                                 esp = KSTK_ESP(task);
                                 put_task_stack(task);
                         }
                 }

Alternatively, could we check whether the task is on its exit path by testing a flag such as PF_EXITING?

Regards

Gaurav

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-01-16  9:45 UTC | newest]

