linux-mips.vger.kernel.org archive mirror
From: Bruce Ashfield <bruce.ashfield@gmail.com>
To: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: linux-mips@vger.kernel.org, paul.burton@mips.com,
	Richard Purdie <richard.purdie@linuxfoundation.org>
Subject: Re: v5.4-rcX: qemu-system-mips64 userspace segfault
Date: Wed, 6 Nov 2019 12:45:49 -0500
Message-ID: <CADkTA4M2K3WOTDVRvLNVNQ0f-rcQvmK-5L-iNoFxjYwHeaHkLQ@mail.gmail.com>
In-Reply-To: <CADkTA4OYGhR-5wP7Q9Z_L=7nR-vwog129AKZOqJXdU09BNxkSQ@mail.gmail.com>

On Fri, Oct 25, 2019 at 9:04 AM Bruce Ashfield <bruce.ashfield@gmail.com> wrote:
>
> On Fri, Oct 25, 2019 at 5:06 AM Vincenzo Frascino
> <vincenzo.frascino@arm.com> wrote:
> >
> > Hi Bruce,
> >
> > On 10/24/19 5:37 PM, Bruce Ashfield wrote:
> > > On Thu, Oct 24, 2019 at 9:29 AM Vincenzo Frascino
> > > <vincenzo.frascino@arm.com> wrote:
> > >>
> > >> Hi Bruce,
> > >>
> > >> On 10/24/19 2:12 PM, Bruce Ashfield wrote:
> > >>> Hi all,
> > >>>
> > >>> I'm not sure if anyone else is running qemu-system-mips64 regularly,
> > >>> but for the past 4 (or more) years, it has been the primary way that
> > >>> we run QA on the mips64 Yocto Project reference kernel(s). I take care
> > >>> of the kernel for the project, so I always have the fun of running
> > >>> into issues first :D
> > >>>
> > >>> That's enough preamble ...
> > >>>
> > >>> I wanted to see if anyone recognized the issue that I'm seeing when I
> > >>> bumped the linux-yocto dev kernel to the v5.4-rc series.
> > >>>
> > >>> The one line summary is that I'm seeing a segfault as soon as  the
> > >>> kernel hands off to userspace during boot. It doesn't matter if it is
> > >>> systemd, sysvinit, or init=/bin/sh .. I always get a segfault.
> > >> [...]
> > >>
> > >> Could you please share the .config you are using?
> > >
> > > attached (hopefully this won't cause my reply to bounce).
> > >
> >
> > It seems that the .config you shared was generated for a kernel version
> > older than the one in which we introduced the unified vDSO. Since the
> > options that correctly enable the generic vDSO library are selected by the
> > architecture, this results in a misconfiguration of the vDSO library, which
> > leads to the issues you are seeing.
>
> Parts of that .config have been around for years, and others would have been
> from my v5.3-dev kernel work. So most definitely there are older elements
> floating around.
>
> >
> > My advice is to start from a fresh defconfig and then enable the options you
> > need one by one. I did it with buildroot and it seems to work.
>
> We don't use defconfigs (at least not in a typical config flow), but absolutely,
> I can start stepping through the options again. I've been maintaining this
> platform and moving through kernel versions for a few years now, so there
> could be something funky with the way the option was introduced and how
> it interacts with my uprev workflow. I should have gotten a warning about
> it in my config sanity step ... but I'll have a closer look at that  (obviously
> my issue) once I'm up and booting.
>
> It's also possible I grabbed the bad .config from the middle of my bisect,
> which as I mentioned was toggling the VDSO options (and having some
> build issues) due to changing dependencies. I'll compare a clean .config to
> the one I sent and follow up if there's something obvious.
>
> >
> > Another thing I noticed, and this seems confirmed by the patch series you
> > had to revert, is that you are missing a fix that I submitted last week:
> >
> > 8a1bef4193e81c8afae4d2f107f1c09c8ce89470
> > ("mips: vdso: Fix __arch_get_hw_counter()")
>
> Right, if it isn't already in -rcX, I don't have it yet, since I'm uprev'ing
> the -dev kernel and sanity testing the rc releases. Only if I have issues
> like this do I start digging around for patches to apply.
>
> I can definitely do that. It seems like gmail only decided to deliver 3 messages
> on the 16th of October, so I don't have a copy of that patch locally, but I was
> able to find the archive and will track down the patch later today.
>
>
> >
> > Could you please apply it before regenerating the .config? Seems the qemu falls
> > back on VDSO_CLOCK_NONE at least in the case I reproduced.
> >
> > > When debugging (and bisecting), as expected, the VDSO configs bounced
> > > around a bit with the move to generic VDSO, etc.  So there very well
> > > may be something that with 5.4 I need to enable now and missed in my
> > > debug.
> > >
> > > I don't have GENERIC_COMPAT_VDSO enabled, but can easily do a boot
> > > test with it on, similarly with the different vdso boot option. I know
> > > I had tried a lot of different combos, but would have to redo the
> > > tests now.
> > >
> >
> > This seems to confirm my suspicion of a wrong .config.
>
> It was on in some of my testing, it just wasn't on for some of the
> bisect runs. I may have grabbed the bad config in my haste. When I
> dive back into this, I'll see what I managed to mess up.

Hi again, and sorry for 13 days in between replies!

I was traveling for the past week and a half and didn't get a chance
to do more boot testing.

I haven't updated to the latest v5.4-rc yet (I will later today), but I
did cherry-pick the ("mips: vdso: Fix __arch_get_hw_counter()") fix
you mentioned. I can report that my boot still segfaults in the same
place with that fix pulled in.

I also attempted to manually set GENERIC_COMPAT_VDSO in my .config,
but as we know, without a prompt it has to be selected by another
Kconfig, and in the platform that I'm building it isn't selected. So I
did a quick one-liner to select it, and even with it on, I'm still
seeing the segfault.
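
For anyone wanting to check which Kconfig entries select a promptless
symbol such as GENERIC_COMPAT_VDSO, here is a minimal sketch. The helper
name find_selects and the idea of scanning every file whose name starts
with "Kconfig" are assumptions for illustration, not part of any kernel
tooling, and the match is purely textual (it does not evaluate "if"
conditions or dependencies).

```python
import os
import re


def find_selects(root, symbol):
    """Return sorted (path, line_number) pairs for each 'select SYMBOL' under root.

    Hypothetical helper: a plain text scan of Kconfig* files, nothing more.
    """
    # \b keeps GENERIC_COMPAT_VDSO from matching GENERIC_COMPAT_VDSO_FOO.
    pattern = re.compile(r"^\s*select\s+" + re.escape(symbol) + r"\b")
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.startswith("Kconfig"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    if pattern.match(line):
                        hits.append((path, lineno))
    return sorted(hits)


if __name__ == "__main__":
    # Run from the top of a kernel tree; prints nothing if no entry selects it.
    for path, lineno in find_selects("arch/mips", "GENERIC_COMPAT_VDSO"):
        print(f"{path}:{lineno}")
```

An empty result for the architecture directory would be consistent with
the symbol never being turned on for that platform, but it says nothing
about whether the symbol is actually needed there.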

I'm trying with a defconfig base at the moment, but I'm running into some
compilation issues. While I sort those out, could I get a copy of a
working .config from you, so I can compare and debug from there? (I'm
booting a 64-bit malta configuration in qemu.)
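
The comparison itself could be sketched roughly as below. The function
names (parse_config, diff_configs) and the file names in the usage note
are hypothetical, and the sketch assumes the usual .config line shapes
("CONFIG_FOO=y" and "# CONFIG_FOO is not set"). An option missing from
one file is reported as "n", which is only a fair reading for bool and
tristate symbols.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each CONFIG_ symbol in a .config body to its value ('n' if unset)."""
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def diff_configs(mine, theirs, only=""):
    """Return (symbol, my_value, their_value) for every symbol that differs.

    'only' is an optional substring filter, e.g. "VDSO".
    """
    a, b = parse_config(mine), parse_config(theirs)
    out = []
    for key in sorted(set(a) | set(b)):
        if only and only not in key:
            continue
        va, vb = a.get(key, "n"), b.get(key, "n")  # absent is treated as 'n'
        if va != vb:
            out.append((key, va, vb))
    return out
```

Usage would be along the lines of
diff_configs(open(".config").read(), open("working.config").read(), only="VDSO"),
where working.config stands in for whatever known-good file is shared.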

I'm also looking into the vdso clock_mode you mentioned earlier ... I
see it in clocksource.h, but I've not yet figured out how to influence
what mode my qemu boot is using (some clocksource driver config .. I
do have CLKSRC_MIPS_GIC set in my .config, so that should at least be
present). Booting with vdso=0 didn't change anything either.
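
Since userspace dies right away, one way to at least see which
clocksource the kernel settled on is to look at the captured serial
console log. The sketch below assumes the kernel prints a line of the
form "clocksource: Switched to clocksource <name>", and the helper name
last_clocksource and the sample log lines are made up for illustration.
The clocksource name only hints at the vdso clock_mode; it does not
report the mode itself.

```python
import re

# Assumed log format; adjust the pattern if the console output differs.
_SWITCHED = re.compile(r"clocksource: Switched to clocksource (\S+)")


def last_clocksource(log_text):
    """Return the name from the last 'Switched to clocksource' line, or None."""
    names = _SWITCHED.findall(log_text)
    return names[-1] if names else None


if __name__ == "__main__":
    # Illustrative sample only, not output from a real boot.
    sample = (
        "[    0.100000] clocksource: Switched to clocksource MIPS\n"
        "[    0.200000] clocksource: Switched to clocksource GIC\n"
    )
    print(last_clocksource(sample))
```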

Summary: there's definitely something up with my .config that didn't
like the transition to generic VDSO, and hopefully a working .config
will point me in the right direction and limit my flailing :D

Cheers,

Bruce

>
> Cheers,
>
> Bruce
>
> >
> > >
> > >>
> > >> Do you know by any chance which vdso clock_mode is set in this scenario?
> > >
> > > Unfortunately not, it isn't something that we've explicitly set in the
> > > past, so I haven't looked into it. But can do more digging.
> > >
> > > Bruce
> > >
> > >>
> > >> --
> > >> Regards,
> > >> Vincenzo
> > >
> > >
> > >
> >
> > Please let us know how your investigation proceeds.
>
> I definitely will, thanks for the time spent and the confirmation that you
> aren't seeing the same thing.
>
> Bruce
>
> >
> > --
> > Regards,
> > Vincenzo
>
>
> --
> - Thou shalt not follow the NULL pointer, for chaos and madness await
> thee at its end
> - "Use the force Harry" - Gandalf, Star Trek II



--
- Thou shalt not follow the NULL pointer, for chaos and madness await
thee at its end
- "Use the force Harry" - Gandalf, Star Trek II
