* [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit()
@ 2014-01-10 16:28 Phil Blundell
  2014-01-16 14:21 ` Martin Jansa
  2014-01-16 14:45 ` Olof Johansson
  0 siblings, 2 replies; 5+ messages in thread
From: Phil Blundell @ 2014-01-10 16:28 UTC (permalink / raw)
  To: bitbake-devel

This means that it now understands "standard" URI syntax as well as
the slightly odd legacy bitbake variant.

There are other places in bitbake (e.g. Local.urldata_init) that also 
need fixing, but this is a start.

Signed-off-by: Phil Blundell <pb@pbcl.net>
---
 lib/bb/fetch2/__init__.py | 60 ++++++++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 27 deletions(-)

diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
index 260fb37..4886dae 100644
--- a/lib/bb/fetch2/__init__.py
+++ b/lib/bb/fetch2/__init__.py
@@ -329,40 +329,46 @@ def decodeurl(url):
     user, password, parameters).
     """
 
-    m = re.compile('(?P<type>[^:]*)://((?P<user>.+)@)?(?P<location>[^;]+)(;(?P<parm>.*))?').match(url)
-    if not m:
+    if url.startswith("file://"):
+        # This is an old-style bitbake URL.  Fix it up.
+        url = "file:" + url[7:]
+
+    import urlparse
+    d = urlparse.urlsplit(url)
+    if not d.scheme:
         raise MalformedUrl(url)
 
-    type = m.group('type')
-    location = m.group('location')
-    if not location:
+    netloc = d.netloc
+    path = d.path
+
+    if not path:
         raise MalformedUrl(url)
-    user = m.group('user')
-    parm = m.group('parm')
 
-    locidx = location.find('/')
-    if locidx != -1 and type.lower() != 'file':
-        host = location[:locidx]
-        path = location[locidx:]
-    else:
-        host = ""
-        path = location
-    if user:
-        m = re.compile('(?P<user>[^:]+)(:?(?P<pswd>.*))').match(user)
-        if m:
-            user = m.group('user')
-            pswd = m.group('pswd')
-    else:
-        user = ''
-        pswd = ''
+    user = ''
+    pswd = ''
+    host = ''
+
+    if netloc:
+        m = re.compile('((?P<user>[^:@]+)(:(?P<pswd>[^@]+))?@)?(?P<host>.+)').match(netloc)
+        if not m:
+            raise MalformedUrl(url)
+
+        user = m.group('user')
+        pswd = m.group('pswd')
+        host = m.group('host')
 
     p = {}
-    if parm:
-        for s in parm.split(';'):
-            s1, s2 = s.split('=')
-            p[s1] = s2
+    sep = path.find(";")
+    if sep != -1:
+        for s in path[sep+1:].split(';'):
+            try:
+                s1, s2 = s.split('=')
+                p[s1] = s2
+            except ValueError:
+                raise MalformedUrl(url)
+        path = path[:sep]
 
-    return type, host, urllib.unquote(path), user, pswd, p
+    return d.scheme, host, urllib.unquote(path), user, pswd, p
 
 def encodeurl(decoded):
     """Encodes a URL from tokens (scheme, network location, path,
-- 
1.8.5
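
For readers on Python 3, the parsing approach in the patch above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: urllib.parse.urlsplit is the Python 3 home of urlparse.urlsplit, split_bitbake_url and the example URLs are hypothetical names chosen for the sketch, and it raises ValueError where the patch raises MalformedUrl. It also leaves the user/password handling out.

```python
from urllib.parse import urlsplit  # Python 3 location of urlparse.urlsplit


def split_bitbake_url(url):
    """Hypothetical helper: return (scheme, netloc, path, params)."""
    if url.startswith("file://"):
        # Legacy bitbake form: everything after file:// is a path, so
        # rewrite it to the standard "file:<path>" form before splitting.
        url = "file:" + url[7:]

    d = urlsplit(url)
    if not d.scheme or not d.path:
        raise ValueError("malformed URL: %r" % url)

    # urlsplit() does not split off ";key=value" parameters, so they are
    # still attached to the path at this point.
    path, _, rest = d.path.partition(";")
    params = {}
    if rest:
        for item in rest.split(";"):
            # Unlike the patch, this tolerates '=' inside the value.
            key, sep, value = item.partition("=")
            if not sep:
                raise ValueError("malformed parameter in %r" % url)
            params[key] = value
    return d.scheme, d.netloc, path, params


if __name__ == "__main__":
    print(split_bitbake_url("file://defconfig;apply=yes"))
    print(split_bitbake_url("git://git.example.com/repo.git;protocol=https"))
```

Note that urlsplit() also separates any "?query" and "#fragment" parts from the path; the sketch ignores them, as the patch does.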






* Re: [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit()
  2014-01-10 16:28 [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit() Phil Blundell
@ 2014-01-16 14:21 ` Martin Jansa
  2014-01-16 14:44   ` Martin Jansa
  2014-01-16 14:45 ` Olof Johansson
  1 sibling, 1 reply; 5+ messages in thread
From: Martin Jansa @ 2014-01-16 14:21 UTC (permalink / raw)
  To: Phil Blundell; +Cc: bitbake-devel


On Fri, Jan 10, 2014 at 04:28:43PM +0000, Phil Blundell wrote:
> This means that it now understands "standard" URI syntax as well as
> the slightly odd legacy bitbake variant.
> 
> There are other places in bitbake (e.g. Local.urldata_init) that also 
> need fixing, but this is a start.

I agree it's a good start. I was trying to test this together with
http://lists.openembedded.org/pipermail/bitbake-devel/2014-January/004327.html

and bitbake-selftest shows a failure on a different URL. Did it pass for you?
- ('http', 'www.google.com', '/index.html', None, None, {})
+ ('http', 'www.google.com', '/index.html', '', '', {})

plus a few errors before that, like:
File "/usr/lib64/python2.7/re.py", line 238, in _compile
    raise TypeError, "first argument must be string or compiled pattern"
  TypeError: first argument must be string or compiled pattern
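
A small sketch of what may be going on, written for Python 3 and using the netloc pattern from the patch. The helper names netloc_groups and compile_component are hypothetical, and the second helper is only a guess at the mechanism: it stands in for any caller that hands a decoded URL component to the re module.

```python
import re

# The netloc pattern from the patch. The user/password part is optional.
NETLOC_RE = re.compile(
    r'((?P<user>[^:@]+)(:(?P<pswd>[^@]+))?@)?(?P<host>.+)')


def netloc_groups(netloc):
    """Hypothetical helper: return the (user, pswd, host) groups."""
    m = NETLOC_RE.match(netloc)
    # Named groups that did not take part in the match come back as None,
    # not as an empty string.
    return m.group('user'), m.group('pswd'), m.group('host')


def compile_component(component):
    """Hypothetical stand-in for code that treats a component as a regex."""
    # re.compile() rejects None with a TypeError.
    return re.compile(component)


if __name__ == "__main__":
    print(netloc_groups("www.google.com"))
    print(netloc_groups("alice:secret@example.com"))
    try:
        compile_component(None)
    except TypeError as exc:
        print("TypeError:", exc)
```

If this reading is right, the None values for user and password explain both the changed tuple in the selftest and the TypeError.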

> Signed-off-by: Phil Blundell <pb@pbcl.net>
> ---
>  lib/bb/fetch2/__init__.py | 60 ++++++++++++++++++++++++++---------------------
>  1 file changed, 33 insertions(+), 27 deletions(-)
> 
> diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
> index 260fb37..4886dae 100644
> --- a/lib/bb/fetch2/__init__.py
> +++ b/lib/bb/fetch2/__init__.py
> @@ -329,40 +329,46 @@ def decodeurl(url):
>      user, password, parameters).
>      """
>  
> -    m = re.compile('(?P<type>[^:]*)://((?P<user>.+)@)?(?P<location>[^;]+)(;(?P<parm>.*))?').match(url)
> -    if not m:
> +    if url.startswith("file://"):
> +        # This is an old-style bitbake URL.  Fix it up.
> +        url = "file:" + url[7:]
> +
> +    import urlparse
> +    d = urlparse.urlsplit(url)
> +    if not d.scheme:
>          raise MalformedUrl(url)
>  
> -    type = m.group('type')
> -    location = m.group('location')
> -    if not location:
> +    netloc = d.netloc
> +    path = d.path
> +
> +    if not path:
>          raise MalformedUrl(url)
> -    user = m.group('user')
> -    parm = m.group('parm')
>  
> -    locidx = location.find('/')
> -    if locidx != -1 and type.lower() != 'file':
> -        host = location[:locidx]
> -        path = location[locidx:]
> -    else:
> -        host = ""
> -        path = location
> -    if user:
> -        m = re.compile('(?P<user>[^:]+)(:?(?P<pswd>.*))').match(user)
> -        if m:
> -            user = m.group('user')
> -            pswd = m.group('pswd')
> -    else:
> -        user = ''
> -        pswd = ''
> +    user = ''
> +    pswd = ''
> +    host = ''
> +
> +    if netloc:
> +        m = re.compile('((?P<user>[^:@]+)(:(?P<pswd>[^@]+))?@)?(?P<host>.+)').match(netloc)
> +        if not m:
> +            raise MalformedUrl(url)
> +
> +        user = m.group('user')
> +        pswd = m.group('pswd')
> +        host = m.group('host')
>  
>      p = {}
> -    if parm:
> -        for s in parm.split(';'):
> -            s1, s2 = s.split('=')
> -            p[s1] = s2
> +    sep = path.find(";")
> +    if sep != -1:
> +        for s in path[sep+1:].split(';'):
> +            try:
> +                s1, s2 = s.split('=')
> +                p[s1] = s2
> +            except ValueError:
> +                raise MalformedUrl(url)
> +        path = path[:sep]
>  
> -    return type, host, urllib.unquote(path), user, pswd, p
> +    return d.scheme, host, urllib.unquote(path), user, pswd, p
>  
>  def encodeurl(decoded):
>      """Encodes a URL from tokens (scheme, network location, path,
> -- 
> 1.8.5
> 
> 
> 

-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa@gmail.com



* Re: [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit()
  2014-01-16 14:21 ` Martin Jansa
@ 2014-01-16 14:44   ` Martin Jansa
  0 siblings, 0 replies; 5+ messages in thread
From: Martin Jansa @ 2014-01-16 14:44 UTC (permalink / raw)
  To: Phil Blundell; +Cc: bitbake-devel


On Thu, Jan 16, 2014 at 03:21:39PM +0100, Martin Jansa wrote:
> On Fri, Jan 10, 2014 at 04:28:43PM +0000, Phil Blundell wrote:
> > This means that it now understands "standard" URI syntax as well as
> > the slightly odd legacy bitbake variant.
> > 
> > There are other places in bitbake (e.g. Local.urldata_init) that also 
> > need fixing, but this is a start.
> 
> I agree it's a good start. I was trying to test this together with
> http://lists.openembedded.org/pipermail/bitbake-devel/2014-January/004327.html
> 
> and bitbake-selftest shows a failure on a different URL. Did it pass for you?
> - ('http', 'www.google.com', '/index.html', None, None, {})
> + ('http', 'www.google.com', '/index.html', '', '', {})
> 
> plus a few errors before that, like:
> File "/usr/lib64/python2.7/re.py", line 238, in _compile
>     raise TypeError, "first argument must be string or compiled pattern"
>   TypeError: first argument must be string or compiled pattern

Returning an empty string instead of None for user/pass seems to fix all
the fetcher tests we currently have (including mine with '@') and also the
TypeErrors from uri_replace.

Here is what I did. I am sending it inline because the better way might be
to fix uri_replace (and possibly other places) to work correctly with None.

diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
index 1bbe0e7..da69500 100644
--- a/lib/bb/fetch2/__init__.py
+++ b/lib/bb/fetch2/__init__.py
@@ -344,18 +344,18 @@ def decodeurl(url):
     if not path:
         raise MalformedUrl(url)
 
-    user = ''
-    pswd = ''
-    host = ''
+    user = d.username or ''
+    pswd = d.password or ''
+    host = d.hostname or ''
 
     if netloc:
         m = re.compile('((?P<user>[^:@]+)(:(?P<pswd>[^@]+))?@)?(?P<host>.+)').match(netloc)
         if not m:
             raise MalformedUrl(url)
 
-        user = m.group('user')
-        pswd = m.group('pswd')
-        host = m.group('host')
+        user = m.group('user') or ''
+        pswd = m.group('pswd') or ''
+        host = m.group('host') or ''
 
     p = {}
     sep = path.find(";")
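
The "or ''" normalisation in the diff above can be tried in isolation. The following is a minimal Python 3 sketch, assuming urllib.parse.urlsplit in place of urlparse.urlsplit; the function name credentials and the example URLs are hypothetical. The username, password and hostname attributes of the split result are None when the URL lacks them, so "or ''" turns each missing part into an empty string.

```python
from urllib.parse import urlsplit  # Python 3 location of urlparse.urlsplit


def credentials(url):
    """Hypothetical helper: return (user, pswd, host) as plain strings."""
    d = urlsplit(url)
    # Each attribute is None when absent; "or ''" maps None to ''.
    user = d.username or ''
    pswd = d.password or ''
    # Caveat: hostname is lower-cased and has any port stripped, so it is
    # not always identical to the host part of d.netloc.
    host = d.hostname or ''
    return user, pswd, host


if __name__ == "__main__":
    print(credentials("http://www.google.com/index.html"))
    print(credentials("ftp://alice:secret@example.com/pub/file.tar.gz"))
```

Callers that concatenate these values or pass them to string functions then no longer need a special case for None.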

> > Signed-off-by: Phil Blundell <pb@pbcl.net>
> > ---
> >  lib/bb/fetch2/__init__.py | 60 ++++++++++++++++++++++++++---------------------
> >  1 file changed, 33 insertions(+), 27 deletions(-)
> > 
> > diff --git a/lib/bb/fetch2/__init__.py b/lib/bb/fetch2/__init__.py
> > index 260fb37..4886dae 100644
> > --- a/lib/bb/fetch2/__init__.py
> > +++ b/lib/bb/fetch2/__init__.py
> > @@ -329,40 +329,46 @@ def decodeurl(url):
> >      user, password, parameters).
> >      """
> >  
> > -    m = re.compile('(?P<type>[^:]*)://((?P<user>.+)@)?(?P<location>[^;]+)(;(?P<parm>.*))?').match(url)
> > -    if not m:
> > +    if url.startswith("file://"):
> > +        # This is an old-style bitbake URL.  Fix it up.
> > +        url = "file:" + url[7:]
> > +
> > +    import urlparse
> > +    d = urlparse.urlsplit(url)
> > +    if not d.scheme:
> >          raise MalformedUrl(url)
> >  
> > -    type = m.group('type')
> > -    location = m.group('location')
> > -    if not location:
> > +    netloc = d.netloc
> > +    path = d.path
> > +
> > +    if not path:
> >          raise MalformedUrl(url)
> > -    user = m.group('user')
> > -    parm = m.group('parm')
> >  
> > -    locidx = location.find('/')
> > -    if locidx != -1 and type.lower() != 'file':
> > -        host = location[:locidx]
> > -        path = location[locidx:]
> > -    else:
> > -        host = ""
> > -        path = location
> > -    if user:
> > -        m = re.compile('(?P<user>[^:]+)(:?(?P<pswd>.*))').match(user)
> > -        if m:
> > -            user = m.group('user')
> > -            pswd = m.group('pswd')
> > -    else:
> > -        user = ''
> > -        pswd = ''
> > +    user = ''
> > +    pswd = ''
> > +    host = ''
> > +
> > +    if netloc:
> > +        m = re.compile('((?P<user>[^:@]+)(:(?P<pswd>[^@]+))?@)?(?P<host>.+)').match(netloc)
> > +        if not m:
> > +            raise MalformedUrl(url)
> > +
> > +        user = m.group('user')
> > +        pswd = m.group('pswd')
> > +        host = m.group('host')
> >  
> >      p = {}
> > -    if parm:
> > -        for s in parm.split(';'):
> > -            s1, s2 = s.split('=')
> > -            p[s1] = s2
> > +    sep = path.find(";")
> > +    if sep != -1:
> > +        for s in path[sep+1:].split(';'):
> > +            try:
> > +                s1, s2 = s.split('=')
> > +                p[s1] = s2
> > +            except ValueError:
> > +                raise MalformedUrl(url)
> > +        path = path[:sep]
> >  
> > -    return type, host, urllib.unquote(path), user, pswd, p
> > +    return d.scheme, host, urllib.unquote(path), user, pswd, p
> >  
> >  def encodeurl(decoded):
> >      """Encodes a URL from tokens (scheme, network location, path,
> > -- 
> > 1.8.5
> > 
> > 
> > 
> 
> -- 
> Martin 'JaMa' Jansa     jabber: Martin.Jansa@gmail.com



-- 
Martin 'JaMa' Jansa     jabber: Martin.Jansa@gmail.com



* Re: [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit()
  2014-01-10 16:28 [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit() Phil Blundell
  2014-01-16 14:21 ` Martin Jansa
@ 2014-01-16 14:45 ` Olof Johansson
  2014-01-17 12:33   ` Richard Purdie
  1 sibling, 1 reply; 5+ messages in thread
From: Olof Johansson @ 2014-01-16 14:45 UTC (permalink / raw)
  To: Phil Blundell; +Cc: bitbake-devel@lists.openembedded.org

On 14-01-10 17:28 +0100, Phil Blundell wrote:
> This means that it now understands "standard" URI syntax as well as
> the slightly odd legacy bitbake variant.
> 
> There are other places in bitbake (e.g. Local.urldata_init) that also 
> need fixing, but this is a start.

I wrote a URI class last year that got integrated into bitbake's
fetch2, but the commit that actually made decode/encodeurl a
wrapper around it was reverted because I missed adding support
for query params (oops :-)). The class itself is still intact,
though (it's just above decodeurl).

I did send fixes for that (adding support for query params), but
they haven't been merged. Perhaps I should resend?

-- 
olofjn



* Re: [RFC PATCH] bitbake: Rewrite fetch2.decodeurl() to use urlparse.urlsplit()
  2014-01-16 14:45 ` Olof Johansson
@ 2014-01-17 12:33   ` Richard Purdie
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Purdie @ 2014-01-17 12:33 UTC (permalink / raw)
  To: Olof Johansson; +Cc: bitbake-devel@lists.openembedded.org, Phil Blundell

On Thu, 2014-01-16 at 15:45 +0100, Olof Johansson wrote:
> On 14-01-10 17:28 +0100, Phil Blundell wrote:
> > This means that it now understands "standard" URI syntax as well as
> > the slightly odd legacy bitbake variant.
> > 
> > There are other places in bitbake (e.g. Local.urldata_init) that also 
> > need fixing, but this is a start.
> 
> I wrote a URI class last year that got integrated into bitbake's
> fetch2, but the commit that actually made decode/encodeurl a
> wrapper around it was reverted because I missed adding support
> for query params (oops :-)). The class itself is still intact,
> though (it's just above decodeurl).
> 
> I did send fixes for that (adding support for query params), but
> they haven't been merged. Perhaps I should resend?

Please do, they've fallen off the radar...

Cheers,

Richard




