* Issue with bitbake fetch/unpack when using MIRRORS rewrite
@ 2018-03-09  9:40 Bach, Pascal
  2018-03-11 13:36 ` Richard Purdie
  0 siblings, 1 reply; 9+ messages in thread
From: Bach, Pascal @ 2018-03-09  9:40 UTC
  To: bitbake-devel@lists.openembedded.org

Hello Everyone

I finally managed to pin down an issue that I have occasionally observed in the past:

When using the git fetcher with shallow tarballs (BB_GIT_SHALLOW, BB_GIT_SHALLOW_DEPTH), I see the following failure during the do_unpack task:

ERROR: pseudo-native-1.9.0+gitAUTOINC+d7c31a25e4-r0 do_unpack: Fetcher failure: [...]
tar (child): /home/projects/ccp3-labs/oe/build/../../../downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now

The issue is that do_unpack is trying to find the following file in the downloads directory:
downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz

However, this file doesn't exist. Instead, the fetch task created the following file:
downloads2/gitshallow_git.yoctoproject.org.git.pseudo_d7c31a2-1_master.tar.gz

After some digging I suspect that the following rewrite rule is causing the issue:
MIRRORS += "git://git.yoctoproject.org/.*                               git://git.yoctoproject.org/git/PATH;protocol=https \n"

The line exists because we don't have access to git.yoctoproject.org via the git protocol, so we use the above rule to make access work via HTTPS.
However, because of the way the Yocto git server is structured, the rewrite changes the URL path prefix from / to /git. This adds the extra ".git" component to the name of the resulting tarball, while the unpack task still derives the name from the original URL and therefore looks for a file that does not exist.
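
The name difference can be modelled in a few lines of Python. This is only a simplified sketch, not bitbake's actual code: the helper names git_src_name and shallow_tarball_name are hypothetical, the naming pattern is inferred from the two filenames quoted above, and reading the "1" in "d7c31a2-1" as the shallow depth is an assumption.

```python
def git_src_name(host, path):
    # Assumed scheme: host followed by the URL path, with "/" turned into ".".
    return host + path.replace("/", ".")


def shallow_tarball_name(host, path, short_rev, depth, branch):
    # Pattern inferred from the filenames in the error message above.
    return "gitshallow_%s_%s-%s_%s.tar.gz" % (
        git_src_name(host, path), short_rev, depth, branch)


# Name derived from the original path /pseudo (what do_unpack looks for).
expected = shallow_tarball_name(
    "git.yoctoproject.org", "/pseudo", "d7c31a2", 1, "master")

# Name derived from the rewritten path /git/pseudo (what the fetch task created).
created = shallow_tarball_name(
    "git.yoctoproject.org", "/git/pseudo", "d7c31a2", 1, "master")

print(expected)
print(created)
```

Under these assumptions the two names differ only by the ".git" component that the rewritten path introduces.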

I think the proper behavior would be to always name the tarball after the original SRC_URI, not the rewritten one.

BTW: I was able to reproduce the issue with the shallow git tarballs but I think it also applies to normal git tarballs and possibly other fetchers.

The issue still exists with bitbake 1.37.0 on Poky master.

Regards
Pascal



* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-03-09  9:40 Issue with bitbake fetch/unpack when using MIRRORS rewrite Bach, Pascal
@ 2018-03-11 13:36 ` Richard Purdie
  2018-07-02 12:29   ` Urs Fässler
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Purdie @ 2018-03-11 13:36 UTC
  To: Bach, Pascal, bitbake-devel@lists.openembedded.org

On Fri, 2018-03-09 at 09:40 +0000, Bach, Pascal wrote:
> Hello Everyone
> 
> I finally got to pin down an issue that I occasionally observed in
> the past:
> 
> When using the git fetcher with shallow tarballs
> (BB_GIT_SHALLOW,BB_GIT_SHALLOW_DEPTH) experience the following issues
> during the do_unpack task:
> 
> ERROR: pseudo-native-1.9.0+gitAUTOINC+d7c31a25e4-r0 do_unpack:
> Fetcher failure: [...]
> tar (child): /home/projects/ccp3-labs/oe/build/../../../downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz: Cannot open: No such file or directory
> tar (child): Error is not recoverable: exiting now
> tar: Child returned status 2
> tar: Error is not recoverable: exiting now
> 
> The issue is that the do_unpack is trying to find the following file
> in the downloads directory:
> downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz
> 
> However this file doesn't exist, instead the following file was
> created by the fetch task:
> downloads2/gitshallow_git.yoctoproject.org.git.pseudo_d7c31a2-1_master.tar.gz
> 
> After some digging I suspect that the following rewrite rule is
> causing the issue:
> MIRRORS += "git://git.yoctoproject.org/.*                               git://git.yoctoproject.org/git/PATH;protocol=https \n"
> 
> The line exists because we don't have access to git.yoctoproject.org
> via the git protocol. So we use the above rule to make access work
> via HTTPS.
> However due to the way the Yocto git server is structured the rewrite
> causes the URL path to change from / to /git which causes the
> additional .git in the
> resulting tarball, which then makes the unpack task look for the
> wrong file.
> 
> I think the proper behavior would be to always name the tarball after
> the original SRC_URI not the rewritten one.
> 
> BTW: I was able to reproduce the issue with the shallow git tarballs
> but I think it also applies to normal git tarballs and possibly other
> fetchers.
> 
> The issue still exists with bitbake 1.37.0 on Poky master.

I do think there are likely problems in this area.

This gets tricky as one mirror tarball could in theory match against
multiple different parent urls. I'm therefore torn on whether it should
use the "parent" naming for the tarball, or create a chain of symlinks
back to the high level name. 

It's something I'd have to sit and look at a bit further to understand
what the code is doing. I know when Chris added the shallow code, he
did add multiple mirror tarball support so it could be related to that
too...

Just thinking out loud, you could have a "full" mirror tarball as well
as a shallow one too so it does get quite complex. Which corner cases
we care about is a tricky one too.

Did you have a fix to propose?

With the fetcher, I do ask that when we make changes, we add unit tests
so that we can protect use cases going forward.

Cheers,

Richard






* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-03-11 13:36 ` Richard Purdie
@ 2018-07-02 12:29   ` Urs Fässler
  2018-07-03 13:59     ` Urs Fässler
  0 siblings, 1 reply; 9+ messages in thread
From: Urs Fässler @ 2018-07-02 12:29 UTC
  To: bitbake-devel

On Sun, 2018-03-11 at 06:36 -0700, Richard Purdie wrote:
> On Fri, 2018-03-09 at 09:40 +0000, Bach, Pascal wrote:
> > Hello Everyone
> > 
> > I finally got to pin down an issue that I occasionally observed in
> > the past:
> > 
> > When using the git fetcher with shallow tarballs
> > (BB_GIT_SHALLOW,BB_GIT_SHALLOW_DEPTH) experience the following
> > issues
> > during the do_unpack task:
> > 
> > ERROR: pseudo-native-1.9.0+gitAUTOINC+d7c31a25e4-r0 do_unpack:
> > Fetcher failure: [...]
> > tar (child): /home/projects/ccp3-labs/oe/build/../../../downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz: Cannot open: No such file or directory
> > tar (child): Error is not recoverable: exiting now
> > tar: Child returned status 2
> > tar: Error is not recoverable: exiting now
> > 
> > The issue is that the do_unpack is trying to find the following
> > file
> > in the downloads directory:
> > downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz
> > 
> > However this file doesn't exist, instead the following file was
> > created by the fetch task:
> > downloads2/gitshallow_git.yoctoproject.org.git.pseudo_d7c31a2-1_master.tar.gz
> > 
> > After some digging I suspect that the following rewrite rule is
> > causing the issue:
> > MIRRORS += "git://git.yoctoproject.org/.*                               git://git.yoctoproject.org/git/PATH;protocol=https \n"
> > 
> > The line exists because we don't have access to
> > git.yoctoproject.org
> > via the git protocol. So we use the above rule to make access work
> > via HTTPS.
> > However due to the way the Yocto git server is structured the
> > rewrite
> > causes the URL path to change from / to /git which causes the
> > additional .git in the
> > resulting tarball, which then makes the unpack task look for the
> > wrong file.
> > 
> > I think the proper behavior would be to always name the tarball
> > after
> > the original SRC_URI not the rewritten one.
> > 
> > BTW: I was able to reproduce the issue with the shallow git
> > tarballs
> > but I think it also applies to normal git tarballs and possibly
> > other
> > fetchers.
> > 
> > The issue still exists with bitbake 1.37.0 on Poky master.
> 
> I do think there are likely problems in this area.
> 
> This gets tricky as one mirror tarball could in theory match against
> multiple different parent urls. I'm therefore torn on whether it
> should
> use the "parent" naming for the tarball, or create a chain of
> symlinks
> > back to the high level name.
> 
> Its something I'd have to sit and look at a bit further and
> understand
> what the code is doing. I know when Chris added the shallow code, he
> did add multiple mirror tarball support so it could be related to
> that
> too...
> 
> Just thinking out loud, you could have a "full" mirror tarball as
> well
> as a shallow one too so it does get quite complex. Which corner cases
> we care about is a tricky one too.
> 
> Did you have a fix to propose?
> 
> With the fetcher, I do ask when we make changes, we do add unit tests
> so that we can protect use cases going forward.
> 
> Cheers,
> 
> Richard

Hi all,
I have been given the task of fixing this issue.
Can you explain where you see the other problems and tricky
situations you mentioned, ideally with an example?

Regards
Urs




* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-07-02 12:29   ` Urs Fässler
@ 2018-07-03 13:59     ` Urs Fässler
  0 siblings, 0 replies; 9+ messages in thread
From: Urs Fässler @ 2018-07-03 13:59 UTC
  To: richard.purdie, bitbake-devel

On Mon, 2018-07-02 at 14:29 +0200, Urs Fässler wrote:
> On Sun, 2018-03-11 at 06:36 -0700, Richard Purdie wrote:
> > On Fri, 2018-03-09 at 09:40 +0000, Bach, Pascal wrote:
> > > Hello Everyone
> > > 
> > > I finally got to pin down an issue that I occasionally observed
> > > in
> > > the past:
> > > 
> > > When using the git fetcher with shallow tarballs
> > > (BB_GIT_SHALLOW,BB_GIT_SHALLOW_DEPTH) experience the following
> > > issues
> > > during the do_unpack task:
> > > 
> > > ERROR: pseudo-native-1.9.0+gitAUTOINC+d7c31a25e4-r0 do_unpack:
> > > Fetcher failure: [...]
> > > tar (child): /home/projects/ccp3-labs/oe/build/../../../downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz: Cannot open: No such file or directory
> > > tar (child): Error is not recoverable: exiting now
> > > tar: Child returned status 2
> > > tar: Error is not recoverable: exiting now
> > > 
> > > The issue is that the do_unpack is trying to find the following
> > > file
> > > in the downloads directory:
> > > downloads2/gitshallow_git.yoctoproject.org.pseudo_d7c31a2-1_master.tar.gz
> > > 
> > > However this file doesn't exist, instead the following file was
> > > created by the fetch task:
> > > downloads2/gitshallow_git.yoctoproject.org.git.pseudo_d7c31a2-1_master.tar.gz
> > > 
> > > After some digging I suspect that the following rewrite rule is
> > > causing the issue:
> > > MIRRORS += "git://git.yoctoproject.org/.*                               git://git.yoctoproject.org/git/PATH;protocol=https \n"
> > > 
> > > The line exists because we don't have access to
> > > git.yoctoproject.org
> > > via the git protocol. So we use the above rule to make access
> > > work
> > > via HTTPS.
> > > However due to the way the Yocto git server is structured the
> > > rewrite
> > > causes the URL path to change from / to /git which causes the
> > > additional .git in the
> > > resulting tarball, which then makes the unpack task look for the
> > > wrong file.
> > > 
> > > I think the proper behavior would be to always name the tarball
> > > after
> > > the original SRC_URI not the rewritten one.
> > > 
> > > BTW: I was able to reproduce the issue with the shallow git
> > > tarballs
> > > but I think it also applies to normal git tarballs and possibly
> > > other
> > > fetchers.
> > > 
> > > The issue still exists with bitbake 1.37.0 on Poky master.
> > 
> > I do think there are likely problems in this area.
> > 
> > This gets tricky as one mirror tarball could in theory match
> > against
> > multiple different parent urls. I'm therefore torn on whether it
> > should
> > use the "parent" naming for the tarball, or create a chain of
> > symlinks
> > > back to the high level name.
> > 
> > Its something I'd have to sit and look at a bit further and
> > understand
> > what the code is doing. I know when Chris added the shallow code,
> > he
> > did add multiple mirror tarball support so it could be related to
> > that
> > too...
> > 
> > Just thinking out loud, you could have a "full" mirror tarball as
> > well
> > as a shallow one too so it does get quite complex. Which corner
> > cases
> > we care about is a tricky one too.
> > 
> > Did you have a fix to propose?
> > 
> > With the fetcher, I do ask when we make changes, we do add unit
> > tests
> > so that we can protect use cases going forward.
> > 
> > Cheers,
> > 
> > Richard
> 
> Hi all,
> I got the task to fix this issue.
> Can you explain where do you see the other problems and tricky
> situations you mentioned, maybe with an example?
> 
> Regards
> Urs

I investigated the problem and saw that the download/pack task uses the
mirrored url to create the local filename while the unpack task uses
the original url to create the local filename.

As I understand the code, the link between the filename from
download/pack and the filename from unpack is missing. It seems to work
because they are usually the same.

I don't have the full picture to decide which solution is best. But from
my understanding, it should be OK to use the parent/original url. It
could match against multiple urls, but they should contain the same
version of the data. In the git case, this is ensured by the commit
hash in the filename of the tarball.
But I would like to hear your opinion.




* Issue with bitbake fetch/unpack when using MIRRORS rewrite
@ 2018-07-17 11:15 Urs Fässler
  2018-07-17 11:24 ` richard.purdie
  0 siblings, 1 reply; 9+ messages in thread
From: Urs Fässler @ 2018-07-17 11:15 UTC
  To: bitbake-devel

Hi,
I investigated the issue that the unpack step fails when using mirror
rewrite rules.

The root of the problem is that the download step uses the
mirrored url to create the local filename, while the unpack step uses
the url from the recipe to create the local filename.

As I understand the code, the link between the filename from
download and the filename from unpack is missing. It seems to work
because they are usually the same.
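
The mismatch between the two steps can be illustrated with a toy model of a single mirror rule. This is a sketch under stated assumptions, not bitbake's real rewrite logic: apply_rule is a hypothetical helper, and the way PATH is substituted here (the path after the host, without a leading slash or URL parameters) is assumed for illustration.

```python
import re


def apply_rule(url, find, replace):
    """Toy model of one mirror rule; the PATH handling is an assumption."""
    if not re.match(find, url):
        return url
    # Take the path after the host, dropping any ";param=value" suffix.
    path = url.split("://", 1)[1].split(";", 1)[0].split("/", 1)[1]
    return replace.replace("PATH", path)


recipe_url = "git://git.yoctoproject.org/fstests"
rule = ("git://git.yoctoproject.org/.*",
        "git://git.yoctoproject.org/git/PATH;protocol=https")

# URL the download step builds its local filename from.
download_url = apply_rule(recipe_url, *rule)
# URL the unpack step builds its local filename from.
unpack_url = recipe_url

print(download_url)
print(download_url.split(";")[0] == unpack_url)
```

As long as no rule changes the path, both steps see the same URL and agree on the filename, which would explain why the missing link usually goes unnoticed.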

I tried both solutions proposed by Richard: using symlinks and naming
the tarball after the recipe-url.
The symlink solution is nice since it follows the same method as for
git clones. Unfortunately, it is not practical for us. We would like to
store the tarballs on an SMB share or S3 storage, and neither supports
symlinks.
The recipe-url naming method is nice since the tarball is named after
the url as it is written in the recipe. This is easy to understand.
But unfortunately this method breaks the test
"FetcherNetworkTest.test_gitfetch_premirror", which tests the
following: when 2 different recipe-urls point to the same mirrored-url,
the repository is cloned only once.

Now the question is which solution we should implement. For us, it is
the second one (naming the tarball after the recipe-url). Its downside
is that the test mentioned above fails and has to be removed. In a
real scenario, two recipe-urls that map to the same mirrored-url would
cause the repository to be downloaded twice and leave 2 tarballs with
the same content. But I expect this to be rare in practice.
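
The duplication trade-off can be sketched as follows. Everything here is hypothetical and for illustration only: the example.com URLs, the mapping table, the "example_" filename prefix and the tarball_name helper are not taken from bitbake.

```python
def tarball_name(url):
    # Hypothetical naming: host and path with "/" replaced by ".".
    host_and_path = url.split("://", 1)[1].split(";", 1)[0]
    return "example_%s.tar.gz" % host_and_path.replace("/", ".")


# Two different recipe-urls that a mirror rule maps to the same mirrored-url.
mirrored = {
    "git://a.example.com/project": "git://mirror.example.com/project",
    "git://b.example.com/project": "git://mirror.example.com/project",
}

# Naming after the mirrored-url: both recipes share one tarball.
by_mirror_url = {tarball_name(m) for m in mirrored.values()}
# Naming after the recipe-url: one tarball per recipe-url.
by_recipe_url = {tarball_name(r) for r in mirrored}

print(len(by_mirror_url), len(by_recipe_url))
```

Naming after the mirrored-url yields a single shared file, while naming after the recipe-url yields one file per recipe-url with identical content.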

A third solution may be that we add a link between the download and
unpack task. But this would be the most intrusive solution for Bitbake.

Thanks,
Urs




* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-07-17 11:15 Urs Fässler
@ 2018-07-17 11:24 ` richard.purdie
  2018-07-17 13:39   ` Urs Fässler
  0 siblings, 1 reply; 9+ messages in thread
From: richard.purdie @ 2018-07-17 11:24 UTC
  To: Urs Fässler, bitbake-devel

Hi Urs,

On Tue, 2018-07-17 at 13:15 +0200, Urs Fässler wrote:
> I investigated the issue that the unpack step fails when using mirror
> rewrite rules.
> 
> The root of the problem is, that the download step uses the
> mirrored url to create the local filename while the unpack step uses
> the url from the recipe to create the local filename.
> 
> As I understand the code, the link between the filename from
> download and the filename from unpack is missing. It seems to work
> because they are usually the same.
> 
> I tried both solutions proposed by Richard: Using symlinks and the
> recipe-url.
> The symlink solution is nice since it follows the same methods as for
> git clones. Unfortunately, it is not practical for us. We like to
> store
> the tarballs on a SMB share or S3 storage. Both do not support
> symlinks.
> The recipe-url naming method is nice since the tarball is named after
> the url as it is written in the recipe. This is easy understandable.
> But unfortunately this method breaks the test
> "FetcherNetworkTest.test_gitfetch_premirror", which tests the
> following: when 2 different recipe-urls point to the same mirrored-
> url, the repository is cloned only once.

Would you be able to provide a kind of worked example of the problem? I
think I understand the problem but some example urls, the mirror format
and the resulting different mirror tarball names would probably make it
easier for me to comment on this.

> Now the question is which solution we should implement. For us, it is
> the second one (tarball naming after recipe-url). It comes with the
> downside that the one mentioned test fails and has to be removed. In
> a
> real scenario this results in downloading a repository twice and
> having
> 2 tarballs with the same content. But I expect this to be unlikely in
> a
> real world scenario.
> 
> A third solution may be that we add a link between the download and
> unpack task. But this would be the most intrusive solution for
> Bitbake.

I'm more than a little concerned about the symlink comment since the
fetcher assumes that symlinks work in other places too.

Also, do you have any new test cases to add which illustrate it?

Cheers,

Richard






* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-07-17 11:24 ` richard.purdie
@ 2018-07-17 13:39   ` Urs Fässler
  2018-07-17 14:00     ` richard.purdie
  0 siblings, 1 reply; 9+ messages in thread
From: Urs Fässler @ 2018-07-17 13:39 UTC
  To: richard.purdie, bitbake-devel

On Tue, 2018-07-17 at 12:24 +0100, richard.purdie@linuxfoundation.org
wrote:
> Hi Urs,
> 
> On Tue, 2018-07-17 at 13:15 +0200, Urs Fässler wrote:
> > I investigated the issue that the unpack step fails when using
> > mirror
> > rewrite rules.
> > 
> > The root of the problem is, that the download step uses the
> > mirrored url to create the local filename while the unpack step
> > uses
> > the url from the recipe to create the local filename.
> > 
> > As I understand the code, the link between the filename from
> > download and the filename from unpack is missing. It seems to work
> > because they are usually the same.
> > 
> > I tried both solutions proposed by Richard: Using symlinks and the
> > recipe-url.
> > The symlink solution is nice since it follows the same methods as
> > for
> > git clones. Unfortunately, it is not practical for us. We like to
> > store
> > the tarballs on a SMB share or S3 storage. Both do not support
> > symlinks.
> > The recipe-url naming method is nice since the tarball is named
> > after
> > the url as it is written in the recipe. This is easy
> > understandable.
> > But unfortunately this method breaks the test
> > "FetcherNetworkTest.test_gitfetch_premirror", which tests the
> > following: when 2 different recipe-urls point to the same mirrored-
> > url, the repository is cloned only once.
> 
> Would you be able to provide a kind of worked example of the problem?
> I
> think I understand the problem but some example urls, the mirror
> format
> and the resulting different mirror tarball names would probably make
> it
> easier for me to comment on this.

You can reproduce the problem with a current (rocko) Yocto. Add the
following lines in local.conf:
  BB_GIT_SHALLOW = "1"
  BB_GENERATE_MIRROR_TARBALLS = "1"
  PREMIRRORS += "git://git.yoctoproject.org/.* git://git.yoctoproject.org/git/PATH;protocol=https \n"

and execute
  bitbake fstests -c unpack

You should get something like:
  tar -xzf .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz failed with exit code 2, output:
  tar (child): .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz: Cannot open: No such file or directory

What happens is that the download step generates the tarball:
  gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz
but the unpack step expects the tarball:
  gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz

The difference in the tarball names comes from the difference between
the url used in the recipe and the url as rewritten according to
PREMIRRORS.


The symlink solution would add a symlink from
gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz to
gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz.

The recipe-url solution would name the tarball
gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz.
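
To make the symlink option concrete, here is a minimal Python sketch run against a temporary directory standing in for DL_DIR. It is illustrative only and does not use bitbake's fetcher code; the placeholder file content is an assumption, and only the two filenames come from the example above.

```python
import os
import tempfile

generated = "gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz"
expected = "gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz"

with tempfile.TemporaryDirectory() as dl_dir:
    # Stand-in for the tarball written by the download step.
    with open(os.path.join(dl_dir, generated), "wb") as f:
        f.write(b"placeholder")

    # Symlink approach: the name unpack expects points at the generated file.
    # This requires a filesystem that supports symlinks.
    os.symlink(generated, os.path.join(dl_dir, expected))

    link_ok = os.path.exists(os.path.join(dl_dir, expected))
    link_target = os.readlink(os.path.join(dl_dir, expected))
    entries = sorted(os.listdir(dl_dir))

print(link_ok, link_target)
print(entries)
```

With the recipe-url option there would be no link at all: the download step would write the file directly under the expected name, so the directory would hold a single regular file.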

> > Now the question is which solution we should implement. For us, it
> > is
> > the second one (tarball naming after recipe-url). It comes with the
> > downside that the one mentioned test fails and has to be removed.
> > In
> > a
> > real scenario this results in downloading a repository twice and
> > having
> > 2 tarballs with the same content. But I expect this to be unlikely
> > in
> > a
> > real world scenario.
> > 
> > A third solution may be that we add a link between the download and
> > unpack task. But this would be the most intrusive solution for
> > Bitbake.
> 
> I'm more than a little concerned about the symlink comment since the
> fetcher assumes that symlinks work in other places too.

Sorry for worrying you. I don't think it is an issue. We generate the
tarballs and archive them on a system without symlinks. Then we get the
tarballs over http with the help of a premirror rule. We do it as
described in the Bitbake manual chapter "The Download (Fetch)". I
expect this to be a fairly common use case.

Another rationale for the recipe-url solution is that the mirrors are
used when the server from the recipe-url is not available. When we
generate a tarball, it would be strange for its name to depend on
some local conditions (closed ports, local mirror rewrite rules, ...)
rather than on the recipe.
This probably invalidates my argument that the symlink solution is nice
because it uses the same naming method as the git clones. These
are 2 quite different use cases.

> Also, do you have any new test cases to add which illustrate it?

I have a test but it is a bit tricky since there are 2 issues. The one
you see (as mentioned above with Yocto) is actually not the real
problem but an issue that is only seen in this scenario.

Since I am now quite sure that the solution where we use the recipe-url
for the tarball name is the correct one, I would like to rephrase my
question: do you see a reason why it is a bad solution?

If you don't see a problem, I will implement this behavior. The failing
test will be replaced with other tests that capture the new behavior.

> Cheers,
> 
> Richard

Thanks,
Urs




* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-07-17 13:39   ` Urs Fässler
@ 2018-07-17 14:00     ` richard.purdie
  2018-07-18 12:38       ` Urs Fässler
  0 siblings, 1 reply; 9+ messages in thread
From: richard.purdie @ 2018-07-17 14:00 UTC
  To: Urs Fässler, bitbake-devel

On Tue, 2018-07-17 at 15:39 +0200, Urs Fässler wrote:
> You can reproduce the problem with a current (rocko) Yocto. Add the
> following lines in local.conf:
>   BB_GIT_SHALLOW = "1"
>   BB_GENERATE_MIRROR_TARBALLS = "1"
>   PREMIRRORS += "git://git.yoctoproject.org/.* git://git.yoctoproject.org/git/PATH;protocol=https \n"
> 
> and execute
>   bitbake fstests -c unpack
> 
> You should get something like:
>   tar -xzf .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz failed with exit code 2, output:
>   tar (child): .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz: Cannot open: No such file or directory
> 
> What happens is that the download step generates the tarball:
>   gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz
> but the unpack step expects the tarball:
>   gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz
> 
> The difference in the tarball names come from the different url used
> in the recipe and when rewritten according to PREMIRRORS.

Is this just a problem with shallow clones or with git recipes in
general and the above was just a simple/fast example?

> The symlink solution would add a symlink from
> gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz to
> gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz.
>
> The recipe-url solution would name the tarball
> gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz.
>
> > > Now the question is which solution we should implement. For us,
> > > it
> > > is
> > > the second one (tarball naming after recipe-url). It comes with
> > > the
> > > downside that the one mentioned test fails and has to be removed.
> > > In
> > > a
> > > real scenario this results in downloading a repository twice and
> > > having
> > > 2 tarballs with the same content. But I expect this to be
> > > unlikely
> > > in
> > > a
> > > real world scenario.
> > > 
> > > A third solution may be that we add a link between the download
> > > and
> > > unpack task. But this would be the most intrusive solution for
> > > Bitbake.
> > 
> > I'm more than a little concerned about the symlink comment since
> > the
> > fetcher assumes that symlinks work in other places too.
> 
> Sorry for concerning you. I think it is no issue. We generate the
> tarballs and archive them on a system without symlinks. Then we get
> the tarballs over http with the help of a premirror rule. We do it as
> described in the Bitbake manual chapter "The Download (Fetch)". I
> expect this to be a fairly common use case.
> 
> Another rationale for the recipe-url solution is that the mirrors are
> used when the server from the recipe-url is not available. When we
> generate a tarball, it would be strange that the name of it depends
> on some local conditions (closed ports, local mirror rewrite rules,
> ...) rather than the recipe.
> This probably invalidates my argument that the symlink solution is
> nice since it has the same method for naming as the git clone naming.
> This are 2 quite different use cases.
> 
> > Also, do you have any new test cases to add which illustrate it?
> 
> I have a test but it is a bit tricky since there are 2 issues. The
> one you see (as mentioned above with Yocto) is actually not the real
> problem but an issue that is only seen in this scenario.

Could we have separate test cases for the two issues?

> Since I am now quite sure that the solution where we use the recipe-
> url for the tarball name is the correct one, I like to reformulate my
> Question: Do you see a reason why it is a bad solution?
> 
> If you don't see a problem I will implements this behavior. The
> failing test will be replaced with other tests that capture the new
> behavior.

For better/worse, the current fetcher behaviour with mirrors has been to
use symlinks to represent the path it's resolved through. We use them in
a variety of circumstances such as when referencing local file:// mirrors
we can't write to.

My instinct is therefore to have all the fetchers behave the same way
with regard to how DL_DIR works. Having consistency is important and
multiple code paths with different behaviour have caused us problems in
the past.

At the back of my mind there is some thinking that the symlink approach
does protect us against some potential data duplication in the
downloads directory but I'm out of time to work through that thought
right now.

Can you confirm whether the normal git fetcher has the issue or it's just
shallow clones, as that is probably important?

I'd be interested to see your alternative patches too if you have them.

Cheers,

Richard






* Re: Issue with bitbake fetch/unpack when using MIRRORS rewrite
  2018-07-17 14:00     ` richard.purdie
@ 2018-07-18 12:38       ` Urs Fässler
  0 siblings, 0 replies; 9+ messages in thread
From: Urs Fässler @ 2018-07-18 12:38 UTC
  To: richard.purdie, bitbake-devel

On Tue, 2018-07-17 at 15:00 +0100, richard.purdie@linuxfoundation.org
wrote:
> On Tue, 2018-07-17 at 15:39 +0200, Urs Fässler wrote:
> > You can reproduce the problem with a current (rocko) Yocto. Add the
> > following lines in local.conf:
> >   BB_GIT_SHALLOW = "1"
> >   BB_GENERATE_MIRROR_TARBALLS = "1"
> >   PREMIRRORS += "git://git.yoctoproject.org/.* git://git.yoctoproject.org/git/PATH;protocol=https \n"
> > 
> > and execute
> >   bitbake fstests -c unpack
> > 
> > You should get something like:
> >   tar -xzf .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz failed with exit code 2, output:
> >   tar (child): .../build/downloads/gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz: Cannot open: No such file or directory
> > 
> > What happens is that the download step generates the tarball:
> >   gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz
> > but the unpack step expects the tarball:
> >   gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz
> > 
> > The difference in the tarball names come from the different url
> > used
> > in the recipe and when rewritten according to PREMIRRORS.
> 
> Is this just a problem with shallow clones or with git recipes in
> general and the above was just a simple/fast example?

With shallow clones the problem always exists when using PREMIRRORS.

The problem also exists with full clones and MIRRORS, but is difficult to
reproduce. We think that the problem may depend on the circumstances of
the first download (i.e. whether the upstream or mirrored server is
available). It may also depend on the order of execution of the steps.
I will investigate this further.

> > The symlink solution would add a symlink from
> > gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz to
> > gitshallow_git.yoctoproject.org.git.fstests_e5939ff-1_master.tar.gz.
> > 
> > The recipe-url solution would name the tarball
> > gitshallow_git.yoctoproject.org.fstests_e5939ff-1_master.tar.gz.
> > 
> > > > Now the question is which solution we should implement. For us,
> > > > it
> > > > is
> > > > the second one (tarball naming after recipe-url). It comes with
> > > > the
> > > > downside that the one mentioned test fails and has to be
> > > > removed.
> > > > In
> > > > a
> > > > real scenario this results in downloading a repository twice
> > > > and
> > > > having
> > > > 2 tarballs with the same content. But I expect this to be
> > > > unlikely
> > > > in
> > > > a
> > > > real world scenario.
> > > > 
> > > > A third solution may be that we add a link between the download
> > > > and
> > > > unpack task. But this would be the most intrusive solution for
> > > > Bitbake.
> > > 
> > > I'm more than a little concerned about the symlink comment since
> > > the
> > > fetcher assumes that symlinks work in other places too.
> > 
> > Sorry for concerning you. I think it is no issue. We generate the
> > tarballs and archive them on a system without symlinks. Then we get
> > the tarballs over http with the help of a premirror rule. We do it
> > as
> > described in the Bitbake manual chapter "The Download (Fetch)". I
> > expect this to be a fairly common use case.
> > 
> > Another rationale for the recipe-url solution is that the mirrors
> > are
> > used when the server from the recipe-url is not available. When we
> > generate a tarball, it would be strange that the name of it depends
> > on some local conditions (closed ports, local mirror rewrite rules,
> > ...) rather than the recipe.
> > This probably invalidates my argument that the symlink solution is
> > nice since it has the same method for naming as the git clone
> > naming.
> > This are 2 quite different use cases.
> > 
> > > Also, do you have any new test cases to add which illustrate it?
> > 
> > I have a test but it is a bit tricky since there are 2 issues. The
> > one you see (as mentioned above with Yocto) is actually not the
> > real
> > problem but an issue that is only seen in this scenario.
> 
> Could we have separate test cases for the two issues?
>
> > Since I am now quite sure that the solution where we use the
> > recipe-
> > url for the tarball name is the correct one, I like to reformulate
> > my
> > Question: Do you see a reason why it is a bad solution?
> > 
> > If you don't see a problem I will implements this behavior. The
> > failing test will be replaced with other tests that capture the new
> > behavior.
> 
> For better/worse, the current fetcher behaviour with mirrors was to
> use
> symlinks to represent the path its resolved through. We use them in a
> variety of cirumstances such as when referencing local file://
> mirrors
> we can't write to.
> 
> My instinct is therefore to have all the fetchers behave the same way
> with regard to how DL_DIR works. Having consistency is important and
> multiple code paths with different behaviour has caused us problems
> in
> the past.
> 
> At the back of my mind there is some thinking that the symlink
> approach
> does protect us against some potential data duplication in the
> downloads directory but I'm out of time to work through that thought
> right now.

You are right, it may duplicate some tarballs. This happens when you
have different urls in the recipe that end up with the same url by
mirror rewrite rules. I expect this to be a rare case.

> Can you confirm if the normal git fetcher has the issue or its just
> shallow clones as that is probably important?

git shallow tarball: always a problem with rewrite rules, easy to
reproduce
git full tarball: seldom a problem, difficult to reproduce
git repo clone: no problem seen

> I'd be interested to see your alternative patches too if you have
> them.

I am working on the patches for the shallow tarball.

Regards,
Urs




Thread overview: 9+ messages
2018-03-09  9:40 Issue with bitbake fetch/unpack when using MIRRORS rewrite Bach, Pascal
2018-03-11 13:36 ` Richard Purdie
2018-07-02 12:29   ` Urs Fässler
2018-07-03 13:59     ` Urs Fässler
  -- strict thread matches above, loose matches on Subject: below --
2018-07-17 11:15 Urs Fässler
2018-07-17 11:24 ` richard.purdie
2018-07-17 13:39   ` Urs Fässler
2018-07-17 14:00     ` richard.purdie
2018-07-18 12:38       ` Urs Fässler
