From: Mike Crowe <mac@mcrowe.com>
To: openembedded-core@lists.openembedded.org
Cc: Paul Barker <paul.barker@commagility.com>
Subject: Re: [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition
Date: Mon, 8 May 2017 11:29:03 +0100
Message-ID: <20170508102903.GA9571@mcrowe.com>
In-Reply-To: <20151215140434.75233de0@centos-7>

On Tuesday 15 December 2015 at 14:04:34 +0000, Paul Barker wrote:
> On Sun, 6 Dec 2015 11:26:33 +0000
> Paul Barker <paul.barker@commagility.com> wrote:
> 
> > I ran into a race condition building multiple external modules against a 3.10.y
> > series kernel using the dylan branch of OpenEmbedded. This is difficult to
> > reproduce as it requires very specific timing: the do_make_scripts task for one
> > module was linking the modpost script whilst the do_compile task for another
> > module was attempting to use the modpost script. This resulted in a permission
> > error:
> > 
> > ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Logfile of failure stored in: /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434
> > Log data follows:
> > | DEBUG: Executing shell function do_compile
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD clean
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make -C /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel M=$PWD modules
> > | make[1]: Entering directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > |   CC [M]  /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/git/ti/runtime/hplib/module/hplibmod.o
> > |   Building modules, stage 2.
> > |   MODPOST 1 modules
> > | /bin/sh: scripts/mod/modpost: Permission denied
> > | make[2]: *** [__modpost] Error 126
> > | make[1]: *** [modules] Error 2
> > | make[1]: Leaving directory `/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/sysroots/amc-d24a4/usr/src/kernel'
> > | make: *** [default] Error 2
> > | ERROR: Function failed: do_compile (see /home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/build/tmp/work/amc_d24a4-oe-linux-gnueabi/ti-hplib-mod/01.01.00.04-r3d/temp/log.do_compile.25434 for further information)
> > ERROR: Task 1284 (/home/COMMAGILITY/paul.barker/linux-bsp/work-1.0.1/ca-linux-bsp/meta-mcsdk/meta-arago-extras/recipes-bsp/ti-hplib/ti-hplib-mod_git.bb, do_compile) failed with exit code '1'
> > 
> > Later kernel versions do not rebuild the modpost script every time that 'make
> > scripts' is invoked so they should be safe from this particular failure. However
> > I'm not convinced that running 'make scripts' whilst also building an
> > out-of-tree module is always safe on later kernels and there is always the
> > potential for vendor kernels to have different behaviour here.
> > 
> > Although this was seen on the dylan branch the behaviour of master and jethro
> > looks to be the same here - do_make_scripts is locked so that only one instance
> > of it may run at one time but there is nothing to prevent one instance of
> > do_make_scripts running at the same time as an instance of do_compile.
> > 
> > The patch I'm sending attempts to solve this issue by locking the do_compile
> > task with the same lockfile as the do_make_scripts task in module.bbclass so
> > that an instance of do_compile can't run at the same time as an instance of
> > do_make_scripts. I don't know enough about the task locking to guarantee that
> > this is the right solution or to be able to test that it works as expected so
> > I'm marking the patch as an RFC.
> > 
> > Please let me know if this is the right approach and if there is any easy way to
> > test this.
> > 
> > Paul Barker (1):
> >   module.bbclass: Fix potential do_compile/do_make_scripts race
> >     condition
> > 
> >  meta/classes/module.bbclass | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> 
> ping on this.
> 
> I've just got bitten by this again so it's not a one-off. Is anyone able to
> give me some feedback on the patch, whether this is the right approach to fix
> the problem and whether this is applicable to jethro/master?

We've started seeing the same symptom, but with a v3.14 kernel. We have
several recipes that build out-of-tree modules and I can see
do_make_scripts for one running at the same time as do_compile for the one
that fails.

If I try to reproduce the problem by hand, I cannot. However, I only see
modpost being compiled for one of the tasks in the logs.

I can't really explain why I see the problem with a newer kernel.
Regardless, it seems unwise to even attempt to run do_make_scripts and
do_compile in parallel.

It looks like this patch was reviewed favourably, but it doesn't seem to
have made it into master.

In the meantime, I'll try this patch and see if it makes the problem go
away for us.
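
For anyone following along without the patch to hand, the approach boils
down to giving do_compile the same lockfiles entry that do_make_scripts
already has. What follows is only my sketch of the idea, not the patch
itself: the lock file path is an assumption on my part (it is what I
believe do_make_scripts uses in module-base.bbclass), so check it against
your own branch.

```bitbake
# Sketch only, not the actual patch.
# Take the same lock as do_make_scripts so that no module recipe's
# do_compile can run while any recipe's do_make_scripts holds the lock.
# ASSUMPTION: this path matches do_make_scripts[lockfiles] on your branch.
do_compile[lockfiles] = "${TMPDIR}/kernel-scripts.lock"
```

Since the lock is exclusive, this also stops two module recipes from
running do_compile at the same time, so it costs a little build
parallelism.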

Thanks.

Mike.

Original patch at https://patchwork.openembedded.org/patch/109269/ and
thread at
http://lists.openembedded.org/pipermail/openembedded-core/2015-December/113752.html
for those without long-term email archives.


Thread overview: 7+ messages
2015-12-06 11:26 [PATCH RFC] module.bbclass: Fix potential do_compile/do_make_scripts race condition Paul Barker
2015-12-06 11:26 ` Paul Barker
2015-12-15 14:04 ` Paul Barker
2015-12-15 14:42   ` Bruce Ashfield
2017-05-08 10:29   ` Mike Crowe [this message]
2017-05-14  9:07     ` Mike Crowe
2018-03-28  8:24       ` Javier Viguera
