linux-rt-users.vger.kernel.org archive mirror
* Kernel v4.14.52-rt34 triggered a kernel-oops
@ 2018-10-22 13:30 Roosen Henri
From: Roosen Henri @ 2018-10-22 13:30 UTC
  To: linux-rt-users@vger.kernel.org

Hi RT-experts,

One of our ARM i.MX6Q systems running an SMP PREEMPT_RT kernel based on
v4.14.52-rt34 triggered a kernel oops about 60 days into a duration test
(running cyclictest and additional load applications).

Looking at the trace (see https://paste.debian.net/1048486/), the PC is
invalid. LR is c016816c, so the last branch-with-link the processor
executed was the instruction 4 bytes earlier, at c0168168:

c0168168:       e12fff33        blx     r3
(kernel/printk/printk.c:1666)

Since r3 held ee4fe01c, the processor branched to that address, which
caused the oops.
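
The address arithmetic and the opcode can be double-checked with a few
lines of Python. This is only a sketch: it assumes the CPU was in ARM
(A32) state, where every instruction is 4 bytes long, and the helper
names are made up for illustration.

```python
# Sketch: recover the call site from LR and decode the A32 "BLX <Rm>" opcode.
# Assumption: ARM (A32) state, so every instruction is 4 bytes long.

A32_INSN_SIZE = 4


def call_site_from_lr(lr):
    """A branch-with-link in A32 state leaves the address of the next
    instruction in LR, so the branch itself sits 4 bytes before it."""
    return lr - A32_INSN_SIZE


def decode_blx_register(insn):
    """Return the register number Rm if insn encodes 'BLX <Rm>', else None."""
    if (insn >> 28) == 0xF:
        # cond == 0b1111 selects the unconditional instruction space instead.
        return None
    # BLX (register): cond 0001 0010 1111 1111 1111 0011 Rm
    if (insn & 0x0FFFFFF0) == 0x012FFF30:
        return insn & 0xF
    return None


print(hex(call_site_from_lr(0xC016816C)))   # 0xc0168168
print(decode_blx_register(0xE12FFF33))      # 3, i.e. "blx r3"
```

Both values agree with the disassembly line above: the branch is at
c0168168 and its target register is r3.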

Looking at the console_unlock disassembly (see
https://paste.debian.net/1048485/), r3 should have the con->write()
function pointer. The console pointer con is retrieved while walking
the console_drivers list.
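
To make clear which load ends up in r3, here is a simplified model of
that list walk. It is a Python sketch, not the kernel code: the Console
class, the flag value, and the console names are illustrative
assumptions, and the real loop performs more checks than shown here.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CON_ENABLED = 1 << 2  # illustrative flag bit for "this console is enabled"


@dataclass
class Console:
    name: str
    write: Optional[Callable[["Console", str], None]]
    flags: int = CON_ENABLED
    next: Optional["Console"] = None


def call_console_drivers(head, text):
    """Walk the singly linked list and call write() on each usable console.
    Returns the names of the consoles that were actually called."""
    called = []
    con = head
    while con is not None:
        if (con.flags & CON_ENABLED) and con.write is not None:
            # The indirect call: con.write plays the role of the pointer
            # that is loaded into r3 before "blx r3".
            con.write(con, text)
            called.append(con.name)
        con = con.next
    return called


# Hypothetical list: one enabled console followed by one disabled console.
log = []
serial = Console("serial0", lambda con, text: log.append((con.name, text)))
serial.next = Console("spare0", lambda con, text: log.append((con.name, text)), flags=0)

print(call_console_drivers(serial, "hello"))  # ['serial0']
print(log)                                    # [('serial0', 'hello')]
```

In this model, a node whose write field holds garbage (for example
because the node was freed or overwritten) would make the indirect call
jump to an arbitrary address, which matches the symptom in the trace.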

So my guess is that the list got corrupted, perhaps through some kind of
concurrency issue. Unfortunately, console_unlock() uses a lot of locking
primitives and contains some RT-specific code, so the root cause is not
easy to spot at first glance. Could one of you have a look?

Thanks!
Henri


* Re: Kernel v4.14.52-rt34 triggered a kernel-oops
@ 2018-10-23  1:56 ` Sergey Senozhatsky
From: Sergey Senozhatsky @ 2018-10-23  1:56 UTC
  To: Roosen Henri; +Cc: linux-rt-users@vger.kernel.org

On (10/22/18 13:30), Roosen Henri wrote:
> Hi RT-experts,
> 
> One of our ARM i.MX6Q systems running an SMP PREEMPT_RT kernel based on
> v4.14.52-rt34 triggered a kernel oops about 60 days into a duration test
> (running cyclictest and additional load applications).
> 
> Looking at the trace (see https://paste.debian.net/1048486/), the PC is
> invalid. LR is c016816c, so the last branch-with-link the processor
> executed was the instruction 4 bytes earlier, at c0168168:
> 
> c0168168:       e12fff33        blx     r3
> (kernel/printk/printk.c:1666)
> 
> Since r3 held ee4fe01c, the processor branched to that address, which
> caused the oops.
> 
> Looking at the console_unlock disassembly (see
> https://paste.debian.net/1048485/), r3 should have the con->write()
> function pointer. The console pointer con is retrieved while walking
> the console_drivers list.
> 
> So my guess is that the list got corrupted, perhaps through some kind of
> concurrency issue. Unfortunately, console_unlock() uses a lot of locking
> primitives and contains some RT-specific code, so the root cause is not
> easy to spot at first glance. Could one of you have a look?

Did you register or unregister any consoles during the test? Nothing else
should modify the console_drivers list.
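
The invariant behind that question can be sketched as follows, assuming
that registration, unregistration, and the walk that prints a message
all serialize on one lock. This is a Python model with made-up names,
not kernel code; the lock merely stands in for the console lock.

```python
import threading


class ConsoleList:
    """Toy model: the list only changes in register()/unregister(), and
    those take the same lock that the walker in emit() holds."""

    def __init__(self):
        self._lock = threading.Lock()  # stands in for the console lock
        self._consoles = []            # stands in for console_drivers

    def register(self, name, write):
        with self._lock:
            self._consoles.append((name, write))

    def unregister(self, name):
        """Remove a console by name; return True if it was registered."""
        with self._lock:
            before = len(self._consoles)
            self._consoles = [c for c in self._consoles if c[0] != name]
            return len(self._consoles) != before

    def emit(self, text):
        """Call every registered write() callback while holding the lock."""
        with self._lock:
            for _name, write in self._consoles:
                write(text)


# Hypothetical usage with two consoles that record what they were given.
seen = []
consoles = ConsoleList()
consoles.register("serial0", lambda text: seen.append(("serial0", text)))
consoles.register("net0", lambda text: seen.append(("net0", text)))
consoles.emit("first")
consoles.unregister("net0")
consoles.emit("second")
print(seen)
```

If nothing registered or unregistered a console during the test, then
under this model the walker cannot have raced with a list update, and a
bad write pointer would point to memory corruption from somewhere else.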

	-ss
