All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] cooker: release lockfile on process exit
@ 2015-05-19 14:14 Lucas Dutra Nunes
  2015-05-19 21:07 ` Richard Purdie
  0 siblings, 1 reply; 4+ messages in thread
From: Lucas Dutra Nunes @ 2015-05-19 14:14 UTC (permalink / raw)
  To: bitbake-devel-request; +Cc: clarson, openembedded-core

This fixes problems caused by the bitbake process exiting without
releasing the lockfile. This is most apparent while running scripts that
call bitbake several times, like the "cleanup-workdir" script on
oe-core.

Signed-off-by: Lucas Dutra Nunes <ldnunes@ossystems.com.br>
---
 lib/bb/cooker.py | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/bb/cooker.py b/lib/bb/cooker.py
index ddf5fed..627ad4a 100644
--- a/lib/bb/cooker.py
+++ b/lib/bb/cooker.py
@@ -40,6 +40,7 @@ import Queue
 import signal
 import prserv.serv
 import pyinotify
+import atexit
 
 logger      = logging.getLogger("BitBake")
 collectlog  = logging.getLogger("BitBake.Collection")
@@ -164,6 +165,9 @@ class BBCooker:
         except:
             pass
 
+        # Register the unlocking of the file at exit:
+        atexit.register(bb.utils.unlockfile, self.lock)
+
         # TOSTOP must not be set or our children will hang when they output
         fd = sys.stdout.fileno()
         if os.isatty(fd):
-- 
2.1.4



^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] cooker: release lockfile on process exit
  2015-05-19 14:14 [PATCH] cooker: release lockfile on process exit Lucas Dutra Nunes
@ 2015-05-19 21:07 ` Richard Purdie
  2015-05-19 21:24   ` Otavio Salvador
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Purdie @ 2015-05-19 21:07 UTC (permalink / raw)
  To: Lucas Dutra Nunes; +Cc: bitbake-devel-request, clarson, openembedded-core

On Tue, 2015-05-19 at 11:14 -0300, Lucas Dutra Nunes wrote:
> This fixes problems caused by the bitbake process exiting without
> releasing the lockfile. This is most apparent while running scripts that
> call bitbake several times, like the "cleanup-workdir" script on
> oe-core.
> 
> Signed-off-by: Lucas Dutra Nunes <ldnunes@ossystems.com.br>
> ---
>  lib/bb/cooker.py | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/lib/bb/cooker.py b/lib/bb/cooker.py
> index ddf5fed..627ad4a 100644
> --- a/lib/bb/cooker.py
> +++ b/lib/bb/cooker.py
> @@ -40,6 +40,7 @@ import Queue
>  import signal
>  import prserv.serv
>  import pyinotify
> +import atexit
>  
>  logger      = logging.getLogger("BitBake")
>  collectlog  = logging.getLogger("BitBake.Collection")
> @@ -164,6 +165,9 @@ class BBCooker:
>          except:
>              pass
>  
> +        # Register the unlocking of the file at exit:
> +        atexit.register(bb.utils.unlockfile, self.lock)
> +
>          # TOSTOP must not be set or our children will hang when they output
>          fd = sys.stdout.fileno()
>          if os.isatty(fd):

I'm not convinced that atexit is the right way to handle this.

Currently the lock ends up being held by any of the running bitbake
processes and it is released when they all exit. If any processes "get
left behind" accidentally, they continue to hold the lock. In many ways
this is actually quite a desirable behaviour. A common usecase for me is
where qemu breaks in something like a bitbake <image> -c testimage. The
bitbake command returns but qemu is left in the background and the lock
remains since it holds a copy of the original lock fd. I can't run a new
bitbake command until I clean up all the processes. Whilst not intended
as part of the design, this does happen to work quite well.

With the atexit approach, I suspect it will attach to exit of any of the
subprocesses and the first to exit will free the lock, not the last.
This is likely not what we want.

So what is the right way to handle things? I'm not sure.

The question is "who" owns the lock. In the traditional model, its owned
by cooker. As long as the cooker runs, the lock remains. This works well
with memory resident bitbake.

The other model would be the lock being owned by the UI. This breaks
down with memory resident bitbake. The problem is the controlling
terminal is owned by the UI, not cooker so control returns to it before
cooker has necessarily cleaned up and exited, depending on the speed of
the machine and the phase of the moon.

If we switch completely to memory resident bitbake, I'd note the problem
goes away to a large extent since either we connect to an existing
cooker or the cooker gets started.

So having thought more about this, my proposal is actually that rather
than fix this and cause all kinds of other potential problems, we just
make memory resident bitbake the default which is something we'd like to
ideally do in the 1.9 timeframe anyway.

Cheers,

Richard






^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] cooker: release lockfile on process exit
  2015-05-19 21:07 ` Richard Purdie
@ 2015-05-19 21:24   ` Otavio Salvador
  2015-05-20 12:11     ` Richard Purdie
  0 siblings, 1 reply; 4+ messages in thread
From: Otavio Salvador @ 2015-05-19 21:24 UTC (permalink / raw)
  To: Richard Purdie
  Cc: bitbake-devel-request, Christopher Larson,
	Patches and discussions about the oe-core layer

On Tue, May 19, 2015 at 6:07 PM, Richard Purdie
<richard.purdie@linuxfoundation.org> wrote:
...
> So having thought more about this, my proposal is actually that rather
> than fix this and cause all kinds of other potential problems, we just
> make memory resident bitbake the default which is something we'd like to
> ideally do in the 1.9 timeframe anyway.

I understand your background ideas here however the non-deterministic
lock release is driving us crazy at the autobuilder management. The
sstate cleanup script, workdir cleanup script and in-house scripts all
need to be teach to deal with this and this is far from optimal.

Would be acceptable for you a command line option to 'wait' it to
release the lock before exit? This could be done easily and we avoid
side effects. This also provide a solution which is isolated enough to
be safely backported for previous releases and provide more
deterministic script output.

How does it sounds?

-- 
Otavio Salvador                             O.S. Systems
http://www.ossystems.com.br        http://code.ossystems.com.br
Mobile: +55 (53) 9981-7854            Mobile: +1 (347) 903-9750


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] cooker: release lockfile on process exit
  2015-05-19 21:24   ` Otavio Salvador
@ 2015-05-20 12:11     ` Richard Purdie
  0 siblings, 0 replies; 4+ messages in thread
From: Richard Purdie @ 2015-05-20 12:11 UTC (permalink / raw)
  To: Otavio Salvador
  Cc: bitbake-devel-request, Christopher Larson,
	Patches and discussions about the oe-core layer

On Tue, 2015-05-19 at 18:24 -0300, Otavio Salvador wrote:
> On Tue, May 19, 2015 at 6:07 PM, Richard Purdie
> <richard.purdie@linuxfoundation.org> wrote:
> ...
> > So having thought more about this, my proposal is actually that rather
> > than fix this and cause all kinds of other potential problems, we just
> > make memory resident bitbake the default which is something we'd like to
> > ideally do in the 1.9 timeframe anyway.
> 
> I understand your background ideas here however the non-deterministic
> lock release is driving us crazy at the autobuilder management. The
> sstate cleanup script, workdir cleanup script and in-house scripts all
> need to be teach to deal with this and this is far from optimal.
> 
> Would be acceptable for you a command line option to 'wait' it to
> release the lock before exit? This could be done easily and we avoid
> side effects. This also provide a solution which is isolated enough to
> be safely backported for previous releases and provide more
> deterministic script output.
> 
> How does it sounds?

I don't know how you'd implement this safely. I did give this some
further thought last night and concluded that if we want to backport
something, the safest approach might be to change the lock acquisition
code.

Basically, if we can't obtain the lock, sleep for one second and try
again. If on the second try we can't obtain the lock, error/exit.

If we need a longer delay than one second, we need to figure out what is
exiting that slowly and fix that instead.

Seems reasonable?

Cheers,

Richard






^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2015-05-20 12:11 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-19 14:14 [PATCH] cooker: release lockfile on process exit Lucas Dutra Nunes
2015-05-19 21:07 ` Richard Purdie
2015-05-19 21:24   ` Otavio Salvador
2015-05-20 12:11     ` Richard Purdie

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.