* gitsm shared DL_DIR and race conditions
From: Stefan Agner @ 2019-01-09 13:48 UTC
  To: bitbake-devel

Hi,

We came across race conditions while fetching a repository using gitsm:
ERROR: aktualizr-native-1.0+gitAUTOINC+d00d1a04cc-7 do_fetch: Fetcher
failure: Fetch command export PSEUDO_DISABLED=1; export
PATH="/workdir/oe/layers/openembedded-core/scripts/native-intercept:/workdir/oe/layers/openembedded-core/scripts:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin/x86_64-linux:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/bin:/workdir/oe/bitbake/bin:/workdir/oe/tmp/hosttools";
export HOME="/home/yocto"; git -c core.fsyncobjectfiles=0 config
submodule.tests/tuf-test-vectors.url
/workdir/downloads/git2/github.com.advancedtelematic.tuf-test-vectors.
failed with exit code 255, output:
error: could not lock config file config: File exists

We share the same DL_DIR, located on NFS, across multiple builders. We
are using the latest state of the 1.40 branch (thud) of bitbake.

It seems that two git config invocations raced in this case. Is there
locking required in the current gitsm implementation?

--
Stefan



* Re: gitsm shared DL_DIR and race conditions
From: Mark Hatle @ 2019-01-09 15:38 UTC
  To: Stefan Agner, bitbake-devel

On 1/9/19 7:48 AM, Stefan Agner wrote:
> Hi,
> 
> We came across race conditions while fetching a repository using gitsm:
> ERROR: aktualizr-native-1.0+gitAUTOINC+d00d1a04cc-7 do_fetch: Fetcher
> failure: Fetch command export PSEUDO_DISABLED=1; export
> PATH="/workdir/oe/layers/openembedded-core/scripts/native-intercept:/workdir/oe/layers/openembedded-core/scripts:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin/x86_64-linux:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/bin:/workdir/oe/bitbake/bin:/workdir/oe/tmp/hosttools";
> export HOME="/home/yocto"; git -c core.fsyncobjectfiles=0 config
> submodule.tests/tuf-test-vectors.url
> /workdir/downloads/git2/github.com.advancedtelematic.tuf-test-vectors.
> failed with exit code 255, output:
> error: could not lock config file config: File exists
> 
> We share the same DL_DIR, located on NFS, across multiple builders. We
> are using the latest state of the 1.40 branch (thud) of bitbake.
> 
> It seems that two git config invocations raced in this case. Is there
> locking required in the current gitsm implementation?

I did not expect git config to have a locking issue, but this shows that two
fetches appear to have happened at the same time. gitsm uses the regular git
fetcher for that bit and then, after the fetch is complete, updates the config
(like git submodule init would) to point to the downloaded components. It's
this configuration step that appears to have failed.

It should be fairly trivial to catch this issue and retry.
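
A catch-and-retry along those lines could look roughly like the sketch below.
It is only an illustration, not the gitsm code: the helper name
run_with_retry, the attempt count, and the delay are assumptions, and the
marker string is taken from the error output quoted above.

```python
import subprocess
import time


def run_with_retry(cmd, cwd=None, attempts=5, delay=0.5,
                   marker="could not lock config file"):
    """Run cmd, retrying only when stderr shows the lock-contention marker.

    Hypothetical helper: name, attempts and delay are illustrative choices.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd, cwd=cwd, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        # Give up on any unrelated failure, or once the attempts are used up.
        if marker not in result.stderr or attempt == attempts:
            raise RuntimeError(
                "%r failed with exit code %d: %s"
                % (cmd, result.returncode, result.stderr.strip()))
        # Wait a little longer after each failed attempt.
        time.sleep(delay * attempt)
```

A caller would wrap the failing step, for example
run_with_retry(["git", "config", key, value], cwd=clonedir), where key, value
and clonedir stand in for whatever the fetcher passes today.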

--Mark

> --
> Stefan
> 




* Re: gitsm shared DL_DIR and race conditions
From: Stefan Agner @ 2019-01-09 16:21 UTC
  To: Mark Hatle; +Cc: bitbake-devel

On 09.01.2019 16:38, Mark Hatle wrote:
> On 1/9/19 7:48 AM, Stefan Agner wrote:
>> Hi,
>>
>> We came across race conditions while fetching a repository using gitsm:
>> ERROR: aktualizr-native-1.0+gitAUTOINC+d00d1a04cc-7 do_fetch: Fetcher
>> failure: Fetch command export PSEUDO_DISABLED=1; export
>> PATH="/workdir/oe/layers/openembedded-core/scripts/native-intercept:/workdir/oe/layers/openembedded-core/scripts:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin/x86_64-linux:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/bin:/workdir/oe/bitbake/bin:/workdir/oe/tmp/hosttools";
>> export HOME="/home/yocto"; git -c core.fsyncobjectfiles=0 config
>> submodule.tests/tuf-test-vectors.url
>> /workdir/downloads/git2/github.com.advancedtelematic.tuf-test-vectors.
>> failed with exit code 255, output:
>> error: could not lock config file config: File exists
>>
>> We share the same DL_DIR, located on NFS, across multiple builders. We
>> are using the latest state of the 1.40 branch (thud) of bitbake.
>>
>> It seems that two git config invocations raced in this case. Is there
>> locking required in the current gitsm implementation?
> 
> I did not expect git config to have a locking issue, but this shows that two
> fetches appear to have happened at the same time. gitsm uses the regular git
> fetcher for that bit and then, after the fetch is complete, updates the config
> (like git submodule init would) to point to the downloaded components. It's
> this configuration step that appears to have failed.
> 
> It should be fairly trivial to catch this issue and retry.

It does not happen often. It happened only once in the last ~100 builds
or so...

Since this is in the CI environment I need to check whether I can jump
into that environment.

As far as I can see git holds a lock to write its config file:
https://github.com/git/git/blob/master/config.c#L2715
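
To illustrate why the second writer fails immediately instead of waiting,
here is a minimal sketch of the lock-file pattern: create "<file>.lock"
exclusively, write the new content there, then rename it over the target.
This is a Python illustration of the pattern, not git's actual code, and the
function name write_with_lockfile is made up for the example.

```python
import os


def write_with_lockfile(path, data):
    """Replace path atomically, using path + '.lock' as an exclusive lock."""
    lock_path = path + ".lock"
    # O_EXCL makes creation fail with FileExistsError ("File exists") when
    # another writer already holds the lock; nothing here waits for it.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
    except BaseException:
        # Do not leave a stale lock behind if the write fails.
        os.unlink(lock_path)
        raise
    # Renaming the lock file over the target publishes the new content and
    # releases the lock in one step.
    os.replace(lock_path, path)
```

With this pattern a concurrent writer sees the existing lock file and errors
out right away, which matches the "could not lock config file config: File
exists" message in the log.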

Shouldn't we use a blocking lock at the bitbake level to avoid running into
the git lock and erroring out?
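
As a rough sketch of what I mean by a blocking lock, using only the Python
standard library rather than bitbake's own lock helpers: the name
blocking_lock and the choice of lock file are assumptions for illustration,
and whether advisory locks behave reliably on a particular NFS setup would
need to be verified separately.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def blocking_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path, waiting if it is taken."""
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until every other holder has released the lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

The git config call would then run inside "with blocking_lock(path):", with
every writer to the same repository using the same lock path, so a second
writer waits instead of hitting git's own lock and failing.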

--
Stefan



* Re: gitsm shared DL_DIR and race conditions
From: Mark Hatle @ 2019-01-09 16:46 UTC
  To: Stefan Agner; +Cc: bitbake-devel

On 1/9/19 10:21 AM, Stefan Agner wrote:
> On 09.01.2019 16:38, Mark Hatle wrote:
>> On 1/9/19 7:48 AM, Stefan Agner wrote:
>>> Hi,
>>>
>>> We came across race conditions while fetching a repository using gitsm:
>>> ERROR: aktualizr-native-1.0+gitAUTOINC+d00d1a04cc-7 do_fetch: Fetcher
>>> failure: Fetch command export PSEUDO_DISABLED=1; export
>>> PATH="/workdir/oe/layers/openembedded-core/scripts/native-intercept:/workdir/oe/layers/openembedded-core/scripts:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin/x86_64-linux:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/usr/bin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/sbin:/workdir/oe/tmp/work/x86_64-linux/aktualizr-native/1.0+gitAUTOINC+d00d1a04cc-7/recipe-sysroot-native/bin:/workdir/oe/bitbake/bin:/workdir/oe/tmp/hosttools";
>>> export HOME="/home/yocto"; git -c core.fsyncobjectfiles=0 config
>>> submodule.tests/tuf-test-vectors.url
>>> /workdir/downloads/git2/github.com.advancedtelematic.tuf-test-vectors.
>>> failed with exit code 255, output:
>>> error: could not lock config file config: File exists
>>>
>>> We share the same DL_DIR, located on NFS, across multiple builders. We
>>> are using the latest state of the 1.40 branch (thud) of bitbake.
>>>
>>> It seems that two git config invocations raced in this case. Is there
>>> locking required in the current gitsm implementation?
>>
>> I did not expect git config to have a locking issue, but this shows that two
>> fetches appear to have happened at the same time. gitsm uses the regular git
>> fetcher for that bit and then, after the fetch is complete, updates the config
>> (like git submodule init would) to point to the downloaded components. It's
>> this configuration step that appears to have failed.
>>
>> It should be fairly trivial to catch this issue and retry.
> 
> It does not happen often. It happened only once in the last ~100 builds
> or so...
> 
> Since this is in the CI environment I need to check whether I can jump
> into that environment.
> 
> As far as I can see git holds a lock to write its config file:
> https://github.com/git/git/blob/master/config.c#L2715
> 
> Shouldn't we use a blocking lock at the bitbake level to avoid running into
> the git lock and erroring out?

Any locking would need to be synchronized between the git and gitsm
fetchers. I'm not sure how easy or hard that would be to do. If we can
coordinate the locking there, that would be best. Barring that, catch and
retry should be a reasonable alternative, as it is highly unlikely that this
file will be locked for an extended period of time.

--Mark

> --
> Stefan
> 



