Buildroot Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
@ 2016-10-21 20:27 Arnout Vandecappelle
  2016-10-22 13:04 ` Thomas Petazzoni
  2016-10-23 12:54 ` Thomas Petazzoni
  0 siblings, 2 replies; 7+ messages in thread
From: Arnout Vandecappelle @ 2016-10-21 20:27 UTC (permalink / raw)
  To: buildroot

Some packages download files (especially patches) with a ? in the
URL. The ? marks the query part of the URL. However, the downloaded
file still contains the ? but from then on it doesn't designate a
query part anymore. Therefore, when fetching from PRIMARY or
SECONDARY site over http, the server will report a 404 Not Found.

To fix, we need to replace the ? with %3F. Obviously, this should
be done only when fetching from PRIMARY or SECONDARY. For fetching
from the real upstream, the ? really does designate the query part.

Fixes #9371.

Reported-by: Johan Derycke <johanderycke@gmail.com>
Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
---
I tested with
make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl
and also checked with a file:// primary site, with secondary site, and
openssl-source-check
---
 package/pkg-download.mk | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/package/pkg-download.mk b/package/pkg-download.mk
index 0f542e6..65e753c 100644
--- a/package/pkg-download.mk
+++ b/package/pkg-download.mk
@@ -210,6 +210,10 @@ endef
 #
 # E.G. use like this:
 # $(call DOWNLOAD,$(FOO_SITE))
+#
+# For PRIMARY and SECONDARY site, any ? in the URL is replaced by %3F. A ? in
+# the URL is used to separate query arguments, but the PRIMARY and SECONDARY
+# sites serve just plain files.
 ################################################################################
 
 define DOWNLOAD
@@ -226,7 +230,7 @@ define DOWNLOAD_INNER
 		case "$(call geturischeme,$(BR2_PRIMARY_SITE))" in \
 			file) $(call $(3)_LOCALFILES,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \
 			scp) $(call $(3)_SCP,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \
-			*) $(call $(3)_WGET,$(BR2_PRIMARY_SITE)/$(2),$(2)) && exit ;; \
+			*) $(call $(3)_WGET,$(BR2_PRIMARY_SITE)/$(subst ?,%3F,$(2)),$(2)) && exit ;; \
 		esac ; \
 	fi ; \
 	if test "$(BR2_PRIMARY_SITE_ONLY)" = "y" ; then \
@@ -245,7 +249,7 @@ define DOWNLOAD_INNER
 		esac ; \
 	fi ; \
 	if test -n "$(call qstrip,$(BR2_BACKUP_SITE))" ; then \
-		$(call $(3)_WGET,$(BR2_BACKUP_SITE)/$(2),$(2)) && exit ; \
+		$(call $(3)_WGET,$(BR2_BACKUP_SITE)/$(subst ?,%3F,$(2)),$(2)) && exit ; \
 	fi ; \
 	exit 1
 endef
-- 
2.9.3

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-21 20:27 [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY Arnout Vandecappelle
@ 2016-10-22 13:04 ` Thomas Petazzoni
  2016-10-23  9:09   ` Peter Korsgaard
  2016-10-23 15:21   ` Arnout Vandecappelle
  2016-10-23 12:54 ` Thomas Petazzoni
  1 sibling, 2 replies; 7+ messages in thread
From: Thomas Petazzoni @ 2016-10-22 13:04 UTC (permalink / raw)
  To: buildroot

Hello,

There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP".
This can be fixed when applying the patch.

On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle
(Essensium/Mind) wrote:
> Some packages download files (especially patches) with a ? in the
> URL. The ? marks the query part of the URL. However, the downloaded
> file still contains the ? but from then on it doesn't designate a
> query part anymore. Therefore, when fetching from PRIMARY or
> SECONDARY site over http, the server will report a 404 Not Found.
> 
> To fix, we need to replace the ? with %3F. Obviously, this should
> be done only when fetching from PRIMARY or SECONDARY. For fetching
> from the real upstream, the ? really does designate the query part.

One thing I'm wondering is if the '?' character is the only one that
needs to be escaped like this, or not.

Thanks,

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-22 13:04 ` Thomas Petazzoni
@ 2016-10-23  9:09   ` Peter Korsgaard
  2016-10-23 15:21   ` Arnout Vandecappelle
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Korsgaard @ 2016-10-23  9:09 UTC (permalink / raw)
  To: buildroot

>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@free-electrons.com> writes:

 > Hello,
 > There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP".
 > This can be fixed when applying the patch.

 > On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle
 > (Essensium/Mind) wrote:
 >> Some packages download files (especially patches) with a ? in the
 >> URL. The ? marks the query part of the URL. However, the downloaded
 >> file still contains the ? but from then on it doesn't designate a
 >> query part anymore. Therefore, when fetching from PRIMARY or
 >> SECONDARY site over http, the server will report a 404 Not Found.
 >> 
 >> To fix, we need to replace the ? with %3F. Obviously, this should
 >> be done only when fetching from PRIMARY or SECONDARY. For fetching
 >> from the real upstream, the ? really does designate the query part.

 > One thing I'm wondering is if the '?' character is the only one that
 > needs to be escaped like this, or not.

I believe it is, as we already quote the URLs correctly so they don't
confuse the shell and wget afaik correctly URL encodes URL before
requesting the location from the web server.

It works here at least with lighttpd:

echo ok > '/var/www/url?with&special!characters in'

wget 'http://127.0.0.1/url?with&special!characters in'
--2016-10-23 11:04:49--  http://127.0.0.1/url?with&special!characters%20in
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 404 Not Found
2016-10-23 11:04:49 ERROR 404: Not Found.

wget 'http://127.0.0.1/url%3fwith&special!characters in'
--2016-10-23 11:04:27--  http://127.0.0.1/url%3fwith&special!characters%20in
Connecting to 127.0.0.1:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3 [application/octet-stream]
Saving to: ?url?with&special!characters in?

2016-10-23 11:04:27 (316 KB/s) - ?url?with&special!characters in? saved [3/3]


We have seen this problem on sources.buildroot.org as well. I have so
far manually fixed it up by renaming the files to drop ?.., so that will
need to be undone when this gets applied.

-- 
Bye, Peter Korsgaard

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-21 20:27 [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY Arnout Vandecappelle
  2016-10-22 13:04 ` Thomas Petazzoni
@ 2016-10-23 12:54 ` Thomas Petazzoni
  2016-10-23 15:23   ` Arnout Vandecappelle
  2016-10-23 18:31   ` Peter Korsgaard
  1 sibling, 2 replies; 7+ messages in thread
From: Thomas Petazzoni @ 2016-10-23 12:54 UTC (permalink / raw)
  To: buildroot

Hello,

On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle
(Essensium/Mind) wrote:
> Some packages download files (especially patches) with a ? in the
> URL. The ? marks the query part of the URL. However, the downloaded
> file still contains the ? but from then on it doesn't designate a
> query part anymore. Therefore, when fetching from PRIMARY or
> SECONDARY site over http, the server will report a 404 Not Found.
> 
> To fix, we need to replace the ? with %3F. Obviously, this should
> be done only when fetching from PRIMARY or SECONDARY. For fetching
> from the real upstream, the ? really does designate the query part.
> 
> Fixes #9371.
> 
> Reported-by: Johan Derycke <johanderycke@gmail.com>
> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
> ---
> I tested with
> make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl
> and also checked with a file:// primary site, with secondary site, and
> openssl-source-check
> ---
>  package/pkg-download.mk | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Applied to master, after s/SECONDARY/BACKUP/.

Peter can you take the appropriate actions in sources.b.o ?

Thanks!

Thomas
-- 
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-22 13:04 ` Thomas Petazzoni
  2016-10-23  9:09   ` Peter Korsgaard
@ 2016-10-23 15:21   ` Arnout Vandecappelle
  1 sibling, 0 replies; 7+ messages in thread
From: Arnout Vandecappelle @ 2016-10-23 15:21 UTC (permalink / raw)
  To: buildroot



On 22-10-16 15:04, Thomas Petazzoni wrote:
> Hello,
> 
> There is nothing named "SECONDARY" in Buildroot, it's called "BACKUP".
> This can be fixed when applying the patch.
> 
> On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle
> (Essensium/Mind) wrote:
>> Some packages download files (especially patches) with a ? in the
>> URL. The ? marks the query part of the URL. However, the downloaded
>> file still contains the ? but from then on it doesn't designate a
>> query part anymore. Therefore, when fetching from PRIMARY or
>> SECONDARY site over http, the server will report a 404 Not Found.
>>
>> To fix, we need to replace the ? with %3F. Obviously, this should
>> be done only when fetching from PRIMARY or SECONDARY. For fetching
>> from the real upstream, the ? really does designate the query part.
> 
> One thing I'm wondering is if the '?' character is the only one that
> needs to be escaped like this, or not.

 From wikipedia, an URL is:

scheme:[//[user:password@]host[:port]][/]path[?query][#fragment]

so after the path, only ? and # are interpreted differently between the normal
download and the PRIMARY/BACKUP download. You might think that & (or ;) is also
special, but those characters are only special when they appear _after_ '?'.
Just to be sure I tested a ?...&... scenario (just adding \&foo after one of the
openssl patches) and with this patch it works correctly both for normal and
PRIMARY downloads. [1]

 As to #fragment, that's very simple: it can never occur because it's
interpreted by the client and doesn't get sent to the server. If a # ends up in
a URL somehow, it already has to be escaped anyway or it won't get passed by wget.

 Note: I haven't tested a PRIMARY that has scp or file download method, but
those are not affected by this patch anyway.

 Regards,
 Arnout

[1]  Well, actually it doesn't work at all, our download handling fails
dramatically when there is a & in the URL. I tried fixing it by adding quoting
here and there, but it's not exactly trivial. Since we don't have URLs with &
and I don't expect any, I'm not continuing work on this. But at least using our
dl-wrapper works for a URL like ...%3F...&...

-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-23 12:54 ` Thomas Petazzoni
@ 2016-10-23 15:23   ` Arnout Vandecappelle
  2016-10-23 18:31   ` Peter Korsgaard
  1 sibling, 0 replies; 7+ messages in thread
From: Arnout Vandecappelle @ 2016-10-23 15:23 UTC (permalink / raw)
  To: buildroot



On 23-10-16 14:54, Thomas Petazzoni wrote:
> Hello,
> 
> On Fri, 21 Oct 2016 22:27:15 +0200, Arnout Vandecappelle
> (Essensium/Mind) wrote:
>> Some packages download files (especially patches) with a ? in the
>> URL. The ? marks the query part of the URL. However, the downloaded
>> file still contains the ? but from then on it doesn't designate a
>> query part anymore. Therefore, when fetching from PRIMARY or
>> SECONDARY site over http, the server will report a 404 Not Found.
>>
>> To fix, we need to replace the ? with %3F. Obviously, this should
>> be done only when fetching from PRIMARY or SECONDARY. For fetching
>> from the real upstream, the ? really does designate the query part.
>>
>> Fixes #9371.
>>
>> Reported-by: Johan Derycke <johanderycke@gmail.com>
>> Signed-off-by: Arnout Vandecappelle (Essensium/Mind) <arnout@mind.be>
>> ---
>> I tested with
>> make BR2_PRIMARY_SITE=http://localhost/dl BR2_PRIMARY_SITE_ONLY=y BR2_DL_DIR=/tmp/dl openssl
>> and also checked with a file:// primary site, with secondary site, and
>> openssl-source-check
>> ---
>>  package/pkg-download.mk | 8 ++++++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Applied to master, after s/SECONDARY/BACKUP/.
> 
> Peter can you take the appropriate actions in sources.b.o ?

 There's nothing to do AFAICS. I did test a download of the openssl patches from
sources.buildroot.net and that worked.

 Regards,
 Arnout

-- 
Arnout Vandecappelle                          arnout at mind be
Senior Embedded Software Architect            +32-16-286500
Essensium/Mind                                http://www.mind.be
G.Geenslaan 9, 3001 Leuven, Belgium           BE 872 984 063 RPR Leuven
LinkedIn profile: http://www.linkedin.com/in/arnoutvandecappelle
GPG fingerprint:  7493 020B C7E3 8618 8DEC 222C 82EB F404 F9AC 0DDF

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY
  2016-10-23 12:54 ` Thomas Petazzoni
  2016-10-23 15:23   ` Arnout Vandecappelle
@ 2016-10-23 18:31   ` Peter Korsgaard
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Korsgaard @ 2016-10-23 18:31 UTC (permalink / raw)
  To: buildroot

>>>>> "Thomas" == Thomas Petazzoni <thomas.petazzoni@free-electrons.com> writes:

Hi,

 >> package/pkg-download.mk | 8 ++++++--
 >> 1 file changed, 6 insertions(+), 2 deletions(-)

 > Applied to master, after s/SECONDARY/BACKUP/.

 > Peter can you take the appropriate actions in sources.b.o ?

Yes. From a quick look it seems I copied files instead of renaming them,
so everything should be good (I'll leave the copies for older BR
versions).

-- 
Venlig hilsen,
Peter Korsgaard 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2016-10-23 18:31 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-21 20:27 [Buildroot] [PATCH] pkg-download: fix fetching URLs with ? from PRIMARY/SECONDARY Arnout Vandecappelle
2016-10-22 13:04 ` Thomas Petazzoni
2016-10-23  9:09   ` Peter Korsgaard
2016-10-23 15:21   ` Arnout Vandecappelle
2016-10-23 12:54 ` Thomas Petazzoni
2016-10-23 15:23   ` Arnout Vandecappelle
2016-10-23 18:31   ` Peter Korsgaard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox