Buildroot Archive on lore.kernel.org
* [Buildroot] github helper is broken
@ 2014-11-13 18:23 Yann E. MORIN
  2014-11-13 19:43 ` Arnout Vandecappelle
  2014-11-13 19:54 ` Thomas De Schampheleire
  0 siblings, 2 replies; 5+ messages in thread
From: Yann E. MORIN @ 2014-11-13 18:23 UTC
  To: buildroot

Hello All,

The github helper is now broken, because GitHub changed their download
scheme, again.

The new scheme is:
    https://github.com/USER/REPO/archive/VERSION.tar.gz

What is not-so-obvious in this new scheme, is that it omits the package
name from the final component of the URL, so we can no longer rely on
the default value of the FOO_SOURCE variable to construct the URL.

So we now have to set FOO_SOURCE explicitly, which was not needed so far,
and set it to $(FOO_VERSION).tar.gz

But then, it means we would store tarballs named FOO_VERSION.tar.gz,
which is not so nice in the end.
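
To make the mismatch concrete, here is a minimal sketch, in Python only so
that it can be run standalone (Buildroot does this in make). It assumes the
default source name has the form <name>-<version>.tar.gz; the function names
and the user/foo/1.0 values are hypothetical.

```python
def default_source(pkg_name, version):
    """Default tarball name, assumed to be <name>-<version>.tar.gz."""
    return f"{pkg_name}-{version}.tar.gz"


def new_github_url(user, repo, version):
    """Tarball URL in the new scheme quoted above."""
    return f"https://github.com/{user}/{repo}/archive/{version}.tar.gz"


url = new_github_url("user", "foo", "1.0")
# The file name the server exposes is the last component of the URL.
upstream_name = url.rsplit("/", 1)[-1]

print(default_source("foo", "1.0"))  # foo-1.0.tar.gz
print(upstream_name)                 # 1.0.tar.gz
```

The two names differ, so appending the default source name to the site URL
no longer yields a valid download link.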

We've discussed this on IRC, and came to three main proposals:

1) rely on the server to pass an appropriate content-disposition header
with the filename.

Unfortunately, that's not doable in Buildroot:
  - we need to know the filename prior to doing the download. We could
    use a workaround by having wget do a HEAD fetch to just get the
    filename;
  - but then this would not work for off-line builds.

So this solution is a no-no.
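
As an illustration of why the name is only known after talking to the
server, this sketch extracts the filename from a Content-Disposition value
using Python's standard email parser. The header value is made up for the
example; it is not a captured GitHub response.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Return the filename parameter of a Content-Disposition value, or None."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()


# Hypothetical header value, for illustration only.
print(filename_from_content_disposition('attachment; filename="foo-1.0.tar.gz"'))
```

The function needs the response header as input, which is exactly what an
off-line build does not have.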

2) differentiate upstream tarball name from local tarball name

This introduces a new variable, like FOO_UPSTREAM_SOURCE, which besides
not being nice, would require quite some work in the pkg-download infra,
which is a bit risky that close to the release.

3) no longer do tarball downloads from github, but do a git clone

This would protect us from any future change in the GitHub tarballs
download naming scheme. And after all, GitHub is a git hosting forge,
so let's use it for what it is.

The problems with that solution are twofold:
  - downloads might take more time than a tarball download, since a
    complete repository would probably be bigger than a single tarball;
  - we must use the http:// scheme for the URL (because, proxies), so
    all packages must now specify FOO_SITE_METHOD = git

Although the download time is not much of an issue, the way we are using
the github helper for now does not allow for setting more than one
variable.


In the end, solution 3 seems the most appropriate, and would require
just a few easy modifications:

  - tweaking the github helper to emit the repository URL instead of the
    tarball URL;

  - add FOO_SITE_METHOD to all packages.

The first one is trivial, and the second one is relatively easy with a
bit of 'find' and some clever 'sed' expression.

I have everything set up and ready to implement solution 3, but before I
do, I'd like some feedback on those proposals, and if possible new
proposals that are easy to implement for the release.


Notes: I thought of a few other proposals, like:

  - define the github helper so that it emits the necessary variables:
        $(eval $(call github,user,repo))
    would expand to:
        FOO_SITE = https://github.com/user/repo
        FOO_SITE_METHOD = git

    That would allow us to introduce new types of forges, like gitorious
    or bitbucket.

    But that's not so nice, because so far we always relied on only
    setting variables, not generating Makefile code.

    Also, it implies changing all packages as well.

  - introduce new variables to define a new semantic for the downloads:
        FOO_FORGE = github
        FOO_FORGE_PATH = /user/repo
    which would be interpreted by the pkg-download infra to generate the
    correct URL

    That would also allow us to add more forge-related handlers, like
    gitorious or bitbucket (others?), but is really diverging from the
    current semantics of the variables we already have.

    Also, it implies changing all packages as well.
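
A rough sketch of the forge-variable idea, written in Python rather than
make so it can be run on its own. The lookup table and the function name
are hypothetical; only the github entry and the /user/repo path come from
the proposal above.

```python
# Hypothetical table mapping a forge name to its base URL.
FORGE_BASE_URLS = {
    "github": "https://github.com",
}


def forge_site(forge, forge_path):
    """Build the repository URL from a forge name and a path like /user/repo."""
    try:
        base = FORGE_BASE_URLS[forge]
    except KeyError:
        raise ValueError(f"unknown forge: {forge}") from None
    return base + forge_path


print(forge_site("github", "/user/repo"))  # https://github.com/user/repo
```

Supporting another forge would then mean adding one entry to the table,
at the cost of a second layer of meaning on top of the existing variables.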

So, we could not find a solution that would limit the changes to just the
github helper; all solutions we have for now imply touching all packages
hosted on GitHub.

Comments and suggestions most welcome! ;-)

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] github helper is broken
  2014-11-13 18:23 [Buildroot] github helper is broken Yann E. MORIN
@ 2014-11-13 19:43 ` Arnout Vandecappelle
  2014-11-13 19:54 ` Thomas De Schampheleire
  1 sibling, 0 replies; 5+ messages in thread
From: Arnout Vandecappelle @ 2014-11-13 19:43 UTC
  To: buildroot

On 13/11/14 19:23, Yann E. MORIN wrote:
> Hello All,
> 
> The github helper is now broken, because GitHub changed their download
> scheme, again.
> 
> The new scheme is:
>     https://github.com/USER/REPO/archive/VERSION.tar.gz
> 
> What is not-so-obvious in this new scheme, is that it omits the package
> name from the final component of the URL, so we can no longer rely on
> the default value of the FOO_SOURCE variable to construct the URL.
> 
> So we now have to set FOO_SOURCE explicitly, which was not needed so far,
> and set it to $(FOO_VERSION).tar.gz
> 
> But then, it means we would store tarballs named FOO_VERSION.tar.gz,
> which is not so nice in the end.
> 
> We've discussed this on IRC, and came to three main proposals:
> 
> 1) rely on the server to pass an appropriate content-disposition header
> with the filename.
> 
> Unfortunately, that's not doable in Buildroot:
>   - we need to know the filename prior to doing the download. We could
>     use a workaround by having wget do a HEAD fetch to just get the
>     filename;
>   - but then this would not work for off-line builds.
> 
> So this solution is a no-no.
> 
> 2) differentiate upstream tarball name from local tarball name
> 
> This introduces a new variable, like FOO_UPSTREAM_SOURCE, which besides
> not being nice, would require quite some work in the pkg-download infra,
> which is a bit risky that close to the release.

 I think long-term this is the way to go, but indeed it can't be done this close
to the release.

> 
> 3) no longer do tarball downloads from github, but do a git clone
> 
> This would protect us from any future change in the GitHub tarballs
> download naming scheme. And after all, GitHub is a git hosting forge,
> so let's use it for what it is.
> 
> The problems with that solution are two fold:
>   - downloads might take more time than a tarball download, since a
>     complete repository would probably be bigger than a single tarball;
>   - we must use the http:// scheme for the URL (because, proxies), so
>     all packages must now specify FOO_SITE_METHOD = git
> 
> Although the download time is not much of an issue, the way we are using
> the github helper for now does not allow for setting more than one
> variable.
> 
> 
> In the end, solution 3 seems the most appropriate, and would require
> just a bit of easy modifications:
> 
>   - tweaking the github helper to emit the repository URL instead of the
>     tarball URL;
> 
>   - add FOO_SITE_METHOD to all packages.
> 
> The first one is trivial, and the second one is relatively easy with a
> bit of 'find' and some clever 'sed' expression.
> 
> I have all setup and ready to implement solution 3, but before I do, I'd
> like some feedback on those proposals, and if possible new proposals
> that are easy to implement for the release.
> 
> 
> Notes: I thought of a few other proposals, like:
> 
>   - define the github helper so that it emits the necessary variables:
>         $(eval $(call github,user,repo))
>     would expand to:
>         FOO_SITE = https://github.com/user/repo
>         FOO_SITE_METHOD = git
> 
>     That would allow us to introduce new types of forges, like gitorious
>     or bitbucket.
> 
>     But that's not so nice, because so far we always relied on only
>     setting variables, not generating Makefile code.

 I actually prefer this option. It makes it really easy to convert to option 2
later. Also I don't have a problem with generating more Makefile code.

> 
>     Also, it implies changing all packages as well.

 This we have to do anyway. But taking the eval approach now means we won't
have to do it again in the future.

> 
>   - introduce new variables to define a new semantic for the downloads:
>         FOO_FORGE = github
>         FOO_FORGE_PATH = /user/repo
>     which would be interpreted by the pkg-download infra to generate the
>     correct URL

 That's indeed another option if you want to avoid generating more Makefile
code. But I prefer generating Makefile code, because it reduces the complexity
of the generic infrastructure.


>     That would also allow us to add more forge-related handlers, like
>     gitorious or bitbucket (others?), but is really diverging from the
>     current semantics of the variables we already have.
> 
>     Also, it implies changing all packages as well.
> 
> So, we could not find a solution that would limit the changes to just the
> github helper; all solutions we have for now imply touching all packages
> hosted on GitHub.

 That's why it's important to now make sure we never have to do something like
that - i.e., we have to use one of the options you don't want to take :-)


 Regards,
 Arnout

> 
> Comments and suggestions most welcome! ;-)
> 
> Regards,
> Yann E. MORIN.
> 


-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7CB5 E4CC 6C2E EFD4 6E3D A754 F963 ECAB 2450 2F1F


* [Buildroot] github helper is broken
  2014-11-13 18:23 [Buildroot] github helper is broken Yann E. MORIN
  2014-11-13 19:43 ` Arnout Vandecappelle
@ 2014-11-13 19:54 ` Thomas De Schampheleire
  2014-11-13 21:05   ` Yann E. MORIN
  1 sibling, 1 reply; 5+ messages in thread
From: Thomas De Schampheleire @ 2014-11-13 19:54 UTC
  To: buildroot

Hi Yann,

On Thu, Nov 13, 2014 at 7:23 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> Hello All,
>
> The github helper is now broken, because GitHub changed their download
> scheme, again.
>
> The new scheme is:
>     https://github.com/USER/REPO/archive/VERSION.tar.gz

bummer!

>
> What is not-so-obvious in this new scheme, is that it omits the package
> name from the final component of the URL, so we can no longer rely on
> the default value of the FOO_SOURCE variable to construct the URL.

I wonder what drives people to invent such a scheme.
Shouldn't we (in addition to actions as discussed below) discuss the
issue with Github? They may understand our concerns, which surely are
present for other build systems as well.
Of course, we cannot fight every repo hosting site. Google code also
provides tarballs with scheme version.tar.gz, unfortunately.

>
> So we now have to set FOO_SOURCE explicitly, which was not needed so far,
> and set it to $(FOO_VERSION).tar.gz
>
> But then, it means we would store tarballs named FOO_VERSION.tar.gz,
> which is not so nice in the end.
>
> We've discussed this on IRC, and came to three main proposals:
>
> 1) rely on the server to pass an appropriate content-disposition header
> with the filename.
>
> Unfortunately, that's not doable in Buildroot:
>   - we need to know the filename prior to doing the download. We could
>     use a workaround by having wget do a HEAD fetch to just get the
>     filename;
>   - but then this would not work for off-line builds.
>
> So this solution is a no-no.
>
> 2) differentiate upstream tarball name from local tarball name
>
> This introduces a new variable, like FOO_UPSTREAM_SOURCE, which besides
> not being nice, would require quite some work in the pkg-download infra,
> which is a bit risky that close to the release.
>
> 3) no longer do tarball downloads from github, but do a git clone
>
> This would protect us from any future change in the GitHub tarballs
> download naming scheme. And after all, GitHub is a git hosting forge,
> so let's use it for what it is.
>
> The problems with that solution are two fold:
>   - downloads might take more time than a tarball download, since a
>     complete repository would probably be bigger than a single tarball;
>   - we must use the http:// scheme for the URL (because, proxies), so
>     all packages must now specify FOO_SITE_METHOD = git
>
> Although the download time is not much of an issue, the way we are using
> the github helper for now does not allow for setting more than one
> variable.

How do you conclude that the download time is not an issue?
Some repos may be significantly larger than the size of one revision,
we cannot know this in advance. Moreover, the complete repo is
complete overhead in the Buildroot case.

>
>
> In the end, solution 3 seems the most appropriate, and would require
> just a bit of easy modifications:
>
>   - tweaking the github helper to emit the repository URL instead of the
>     tarball URL;
>
>   - add FOO_SITE_METHOD to all packages.
>
> The first one is trivial, and the second one is relatively easy with a
> bit of 'find' and some clever 'sed' expression.
>
> I have all setup and ready to implement solution 3, but before I do, I'd
> like some feedback on those proposals, and if possible new proposals
> that are easy to implement for the release.
>
>
> Notes: I thought of a few other proposals, like:
>
>   - define the github helper so that it emits the necessary variables:
>         $(eval $(call github,user,repo))
>     would expand to:
>         FOO_SITE = https://github.com/user/repo
>         FOO_SITE_METHOD = git
>
>     That would allow us to introduce new types of forges, like gitorious
>     or bitbucket.
>
>     But that's not so nice, because so far we always relied on only
>     setting variables, not generating Makefile code.
>
>     Also, it implies changing all packages as well.

A similar strategy (for SITE/SOURCE) was discussed earlier in the
context of gitorious, following a patch from Alexandre Belloni that
was proposed earlier.
See http://lists.busybox.net/pipermail/buildroot/2014-January/086309.html
and ThomasP's reply:
http://lists.busybox.net/pipermail/buildroot/2014-January/086495.html

>
>   - introduce new variables to define a new semantic for the downloads:
>         FOO_FORGE = github
>         FOO_FORGE_PATH = /user/repo
>     which would be interpreted by the pkg-download infra to generate the
>     correct URL
>
>     That would also allow us to add more forge-related handlers, like
>     gitorious or bitbucket (others?), but is really diverging from the
>     current semantics of the variables we already have.
>
>     Also, it implies changing all packages as well.
>
> So, we could not find a solution that would limit the changes to just the
> github helper; all solutions we have for now imply touching all packages
> hosted on GitHub.
>
> Comments and suggestions most welcome! ;-)
>

Best regards,
Thomas


* [Buildroot] github helper is broken
  2014-11-13 19:54 ` Thomas De Schampheleire
@ 2014-11-13 21:05   ` Yann E. MORIN
  2014-11-15  8:28     ` Samuel Martin
  0 siblings, 1 reply; 5+ messages in thread
From: Yann E. MORIN @ 2014-11-13 21:05 UTC
  To: buildroot

Thomas, All,

On 2014-11-13 20:54 +0100, Thomas De Schampheleire spake thusly:
> On Thu, Nov 13, 2014 at 7:23 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> > The github helper is now broken, because GitHub changed their download
> > scheme, again.

The issue has been resolved on the GitHub side: the current scheme we use
works again.

Still, we should strive to find a long-term solution that is not
subject to such massive breakage.

[--SNIP--]
> Shouldn't we (in addition to actions as discussed below) discuss the
> issue with Github? They may understand our concerns, which surely are
> present for other build systems as well.

Yes, Samuel has volunteered to contact them, and Maxime named a
name. ;-)

> Of course, we cannot fight every repo hosting site. Google code also
> provides tarballs with scheme version.tar.gz, unfortunately.

Eventually, I'd like us to have support for that, too, because it is a pain
to deal with otherwise.

> > The problems with that solution are two fold:
> >   - downloads might take more time than a tarball download, since a
> >     complete repository would probably be bigger than a single tarball;
> >   - we must use the http:// scheme for the URL (because, proxies), so
> >     all packages must now specify FOO_SITE_METHOD = git
> >
> > Although the download time is not much of an issue, the way we are using
> > the github helper for now does not allow for setting more than one
> > variable.
> 
> How do you conclude that the download time is not an issue?

Not an issue in the face of a quick solution for the release. We're in
feature freeze, now.

> Some repos may be significantly larger than the size of one revision,
> we cannot know this in advance. Moreover, the complete repo is
> complete overhead in the Buildroot case.

Yes, but if we're using a tag, we are doing a shallow clone, which is
not significantly larger than the corresponding tarball.

Only for SHA1s do we need the full clone.

But better a download that works even if it takes some time, rather than
a download that properly does not work. Hence "not much of an issue".
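
The tag-versus-SHA1 distinction can be sketched as follows. This only
builds the git command lines and does not run them, and it is not
Buildroot's actual download script. Treating any 40-hex-digit version as a
SHA1 is an assumption made for the example, and the function names are
hypothetical.

```python
import re


def looks_like_sha1(version):
    """Assumption for this sketch: a full SHA1 is exactly 40 lowercase hex digits."""
    return re.fullmatch(r"[0-9a-f]{40}", version) is not None


def git_fetch_commands(repo_url, version, dest):
    """Return the git commands (as argument lists) needed to fetch one version."""
    if looks_like_sha1(version):
        # A bare SHA1 cannot be named on the clone command line,
        # so clone the full history and then check the commit out.
        return [
            ["git", "clone", repo_url, dest],
            ["git", "-C", dest, "checkout", version],
        ]
    # A tag (or branch) name can be cloned shallowly.
    return [["git", "clone", "--depth", "1", "--branch", version, repo_url, dest]]


for cmd in git_fetch_commands("https://github.com/user/repo", "v1.0", "repo"):
    print(" ".join(cmd))
```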

[--SNIP--]
> >   - define the github helper so that it emits the necessary variables:
> >         $(eval $(call github,user,repo))
> >     would expand to:
> >         FOO_SITE = https://github.com/user/repo
> >         FOO_SITE_METHOD = git
> >
> >     That would allow us to introduce new types of forges, like gitorious
> >     or bitbucket.
> >
> >     But that's not so nice, because so far we always relied on only
> >     setting variables, not generating Makefile code.
> >
> >     Also, it implies changing all packages as well.
> 
> A similar strategy (for SITE/SOURCE) was discussed earlier in the
> context of gitorious, following a patch from Alexandre Belloni that
> was proposed earlier.
> See http://lists.busybox.net/pipermail/buildroot/2014-January/086309.html
> and ThomasP's reply:
> http://lists.busybox.net/pipermail/buildroot/2014-January/086495.html

Yes, that already popped up during the IRC discussion (and is the
reason why I hint at gitorious, too).

Regards,
Yann E. MORIN.

-- 
.-----------------.--------------------.------------------.--------------------.
|  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
| +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
| +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
| http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
'------------------------------^-------^------------------^--------------------'


* [Buildroot] github helper is broken
  2014-11-13 21:05   ` Yann E. MORIN
@ 2014-11-15  8:28     ` Samuel Martin
  0 siblings, 0 replies; 5+ messages in thread
From: Samuel Martin @ 2014-11-15  8:28 UTC
  To: buildroot

Hi all,

On Thu, Nov 13, 2014 at 10:05 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> Thomas, All,
>
> On 2014-11-13 20:54 +0100, Thomas De Schampheleire spake thusly:
>> On Thu, Nov 13, 2014 at 7:23 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
>> > The github helper is now broken, because GitHub changed their download
>> > scheme, again.
>
> The issue has been resolved on the GitHub side: the current scheme we use
> works again.
>
> Still, we should strive at finding a long-term solution that is not
> subject to such massive breakage.
>
> [--SNIP--]
>> Shouldn't we (in addition to actions as discussed below) discuss the
>> issue with Github? They may understand our concerns, which surely are
>> present for other build systems as well.
>
> Yes, Samuel has volunteered into contacting them, and Maxime named a
> name. ;-)
>

I contacted GitHub yesterday and got an answer within 2 hours! (pretty
responsive :-]).

They suggest using the GitHub API to get the download URL.

<quote>
  You can use the repository contents API to get a stable download
link. However, your client will need to be able to follow 302
redirects to use this link.

  https://developer.github.com/v3/repos/contents/#get-archive-link

  Let me know if you have any other questions.
</quote>

I'm not sure the GitHub API matches all our needs, especially WRT the
URL ending with the tarball filename...

Maybe we need to extend our download framework to support such cases;
it might also be useful for other forges.
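
For reference, a small sketch of how such an archive link is formed,
assuming the GET /repos/:owner/:repo/:archive_format/:ref endpoint from the
API page linked above, served from api.github.com. The function name and
the user/repo/v1.0 values are hypothetical, and nothing is downloaded here.

```python
def archive_link(owner, repo, ref, archive_format="tarball"):
    """Build the API URL that redirects (302) to the archive for a given ref."""
    if archive_format not in ("tarball", "zipball"):
        raise ValueError(f"unsupported archive format: {archive_format}")
    return f"https://api.github.com/repos/{owner}/{repo}/{archive_format}/{ref}"


link = archive_link("user", "repo", "v1.0")
print(link)                      # https://api.github.com/repos/user/repo/tarball/v1.0
print(link.rsplit("/", 1)[-1])   # v1.0
```

The last component of the link is the ref, not a tarball filename, so the
local file name would still have to be chosen on our side.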


Regards,

-- 
Samuel
