From: Russell King - ARM Linux <linux@arm.linux.org.uk>
To: "Dr. H. Nikolaus Schaller" <hns@goldelico.com>
Cc: Nishanth Menon <nm@ti.com>, Tony Lindgren <tony@atomide.com>,
Grazvydas Ignotas <notasas@gmail.com>,
Marek Belisko <marek@goldelico.com>,
"linux-omap@vger.kernel.org" <linux-omap@vger.kernel.org>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: mysterious crashes on OMAP5 uevm
Date: Thu, 10 Sep 2015 09:30:26 +0100
Message-ID: <20150910083026.GA21098@n2100.arm.linux.org.uk>
In-Reply-To: <E0068B56-0DEE-4E2F-94CF-DB104AE16D63@goldelico.com>

On Thu, Sep 10, 2015 at 08:42:57AM +0200, Dr. H. Nikolaus Schaller wrote:
>
> Am 08.09.2015 um 23:07 schrieb Tony Lindgren <tony@atomide.com>:
>
> > * Grazvydas Ignotas <notasas@gmail.com> [150908 13:44]:
> >> On Tue, Sep 8, 2015 at 4:38 PM, Tony Lindgren <tony@atomide.com> wrote:
> >>> * Grazvydas Ignotas <notasas@gmail.com> [150908 05:50]:
> >>>> Hi,
> >>>>
> >>>> this is a longstanding problem I'm seeing since the very beginning,
> >>>> which was around 3.12 or so (when I've first got the hardware) and it
> >>>> seems 4.2 is affected by it still. Basically what happens is Xorg
> >>>> randomly segfaults at some "impossible" location. I don't have the
> >>>> details at the moment (could get them is needed), but from what I
> >>>> examined with gdb some time ago the situation did not make any sense.
> >>>>
> >>>> There are 2 workarounds that I know which make the problem go away
> >>>> (one is enough):
> >>>> - recompile Xorg with -marm (I'm using Debian armhf so it's thumb2 by default)
> >>>> - disable ARCH_MULTI_V6 in the kernel config
> >>>>
> >>>> Because of the above workarounds I have forgotten about it several
> >>>> times, but it regularly comes back and bites again. It would look like
> >>>> some missing erratum workaround, but I have all of them enabled in the
> >>>> kernel.
> >>>>
> >>>> Does anyone know about this? Perhaps some missing erratum workaround
> >>>> in the bootloader? u-boot isn't too old here (2015.07).
> >>>
> >>> Seems like some incorrect handling with CONFIG_CPU_V6 compiled in..
> >>> Maybe try to narrow it down by commenting out some CONFIG_CPU_V6 and
> >>> __LINUX_ARM_ARCH__ = 6 ifdefs in the git grep CONFIG_CPU_V6
> >>> places ignoring uncompress and davinci code.
> >>
> >> ok with that it was quite easy to find. On a kernel with ARCH_MULTI_V6
> >> disabled, it is enough to just do this:
> >>
> >> --- a/arch/arm/kernel/signal.c
> >> +++ b/arch/arm/kernel/signal.c
> >> @@ -340,13 +340,13 @@ setup_return(struct pt_regs *regs, struct ksignal *ksig,
> >> /*
> >> * The LSB of the handler determines if we're going to
> >> * be using THUMB or ARM mode for this signal handler.
> >> */
> >> thumb = handler & 1;
> >>
> >> -#if __LINUX_ARM_ARCH__ >= 7
> >> +#if 0 //__LINUX_ARM_ARCH__ >= 7
> >> /*
> >> * Clear the If-Then Thumb-2 execution state
> >> * ARM spec requires this to be all 000s in ARM mode
> >> * Snapdragon S4/Krait misbehaves on a Thumb=>ARM
> >> * signal transition without this.
> >> */
> >>
> >> ... and the problem appears, so I guess this needs some real
> >> multiplatform handling,.
> >
> > OK nice to hear you found it. Yeah looks like some runtime
> > capability check is needed.
> >
> >>> Do you have some easy way to reproduce this issue?
> >>
> >> Just moving a browser window around with mouse usually triggers it
> >> within a minute.
> >
> > OK good to know.
>
> It looks as if this is the solution for the same symptom on our OMAP3 board (gta04).
> There, it suffices to draw on the touch screen for ~10 seconds to make the xserver segfault.
>
> [we are using the binary xserver from debian wheezy
> ii xserver-xorg-core 2:1.12.4-6+deb7u5 armhf Xorg X server - core server]
>
> We know about this bug for a while, but so far did think that some touch screen
> event bit has changed and we have to fix our touch screen driver.
>
> Now, disabling CONFIG_ARCH_MULTI_V6 also makes the bug go away and adding the
> >> #if 0 //__LINUX_ARM_ARCH__ >= 7
> makes it re-appear.
>
> A while ago I tried to debug running the x-server under strace and could find that it also has
> something to do with SIGALRM.
>
> And that is very consistent with “enable/disable” by modifying arch/arm/kernel/signal.c
It would be really nice if someone could diagnose what's going on here.
What exception is causing the X server to be killed (someone said a
segfault)? What is the register state at the point that happens? What
does the code look like? Is it happening inside the SIGALRM handler, or
when the SIGALRM handler has returned?

I'd suggest attaching gdb to the X server, but remember to set gdb to
ignore SIGPIPEs.
--
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.