All of lore.kernel.org
 help / color / mirror / Atom feed
* [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
@ 2023-08-04 15:54 Joshua Watt
  2023-08-09  1:52 ` Alexandre Belloni
  2023-08-24 19:53 ` [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a " Joshua Watt
  0 siblings, 2 replies; 12+ messages in thread
From: Joshua Watt @ 2023-08-04 15:54 UTC (permalink / raw)
  To: bitbake-devel; +Cc: Joshua Watt

If the clone target directory exists, but isn't the valid top level of a
bare git repo, it needs to be erased and re-cloned. One example of how
this can happen is if a clone creates the directory, but then fails to
actual clone and make it a git repository. This left-over directory can
be particularly problematic if the download directory is a descent of
some top layer git repo (e.g. the default with poky), as the commands
that operate on the remote later will then mangle the layers git
repository instead of the download git repo.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
index 2a3c06fe4e9..112bed294f0 100644
--- a/bitbake/lib/bb/fetch2/git.py
+++ b/bitbake/lib/bb/fetch2/git.py
@@ -65,6 +65,7 @@ import fnmatch
 import os
 import re
 import shlex
+import shutil
 import subprocess
 import tempfile
 import bb
@@ -365,8 +366,29 @@ class Git(FetchMethod):
                 runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
         repourl = self._get_repo_url(ud)
 
+        needs_clone = False
+        if os.path.exists(ud.clonedir):
+            # The directory may exist, but not be the top level of a bare git
+            # repository in which case it needs to be deleted and re-cloned.
+            try:
+                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
+                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
+            except bb.fetch2.FetchError as e:
+                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
+                needs_clone = True
+            else:
+                toplevel = output.rstrip()
+                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
+                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
+                    needs_clone = True
+
+            if needs_clone:
+                shutil.rmtree(ud.clonedir)
+        else:
+            needs_clone = True
+
         # If the repo still doesn't exist, fallback to cloning it
-        if not os.path.exists(ud.clonedir):
+        if needs_clone:
             # We do this since git will use a "-l" option automatically for local urls where possible,
             # but it doesn't work when git/objects is a symlink, only works when it is a directory.
             if repourl.startswith("file://"):
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
  2023-08-04 15:54 [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo Joshua Watt
@ 2023-08-09  1:52 ` Alexandre Belloni
  2023-08-17 19:19   ` Joshua Watt
  2023-08-24 19:53 ` [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a " Joshua Watt
  1 sibling, 1 reply; 12+ messages in thread
From: Alexandre Belloni @ 2023-08-09  1:52 UTC (permalink / raw)
  To: Joshua Watt; +Cc: bitbake-devel

Hello Joshua,

I'm fairly sure this causes the following bitbake-selftest failure:

https://autobuilder.yoctoproject.org/typhoon/#/builders/56/builds/2307/steps/11/logs/stdio

On 04/08/2023 09:54:24-0600, Joshua Watt wrote:
> If the clone target directory exists, but isn't the valid top level of a
> bare git repo, it needs to be erased and re-cloned. One example of how
> this can happen is if a clone creates the directory, but then fails to
> actual clone and make it a git repository. This left-over directory can
> be particularly problematic if the download directory is a descent of
> some top layer git repo (e.g. the default with poky), as the commands
> that operate on the remote later will then mangle the layers git
> repository instead of the download git repo.
> 
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>  bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index 2a3c06fe4e9..112bed294f0 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -65,6 +65,7 @@ import fnmatch
>  import os
>  import re
>  import shlex
> +import shutil
>  import subprocess
>  import tempfile
>  import bb
> @@ -365,8 +366,29 @@ class Git(FetchMethod):
>                  runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
>          repourl = self._get_repo_url(ud)
>  
> +        needs_clone = False
> +        if os.path.exists(ud.clonedir):
> +            # The directory may exist, but not be the top level of a bare git
> +            # repository in which case it needs to be deleted and re-cloned.
> +            try:
> +                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
> +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> +            except bb.fetch2.FetchError as e:
> +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> +                needs_clone = True
> +            else:
> +                toplevel = output.rstrip()
> +                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
> +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> +                    needs_clone = True
> +
> +            if needs_clone:
> +                shutil.rmtree(ud.clonedir)
> +        else:
> +            needs_clone = True
> +
>          # If the repo still doesn't exist, fallback to cloning it
> -        if not os.path.exists(ud.clonedir):
> +        if needs_clone:
>              # We do this since git will use a "-l" option automatically for local urls where possible,
>              # but it doesn't work when git/objects is a symlink, only works when it is a directory.
>              if repourl.startswith("file://"):
> -- 
> 2.33.0
> 

> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#14915): https://lists.openembedded.org/g/bitbake-devel/message/14915
> Mute This Topic: https://lists.openembedded.org/mt/100548945/3617179
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alexandre.belloni@bootlin.com]
> -=-=-=-=-=-=-=-=-=-=-=-
> 


-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
  2023-08-09  1:52 ` Alexandre Belloni
@ 2023-08-17 19:19   ` Joshua Watt
  2023-08-17 20:14     ` Alexandre Belloni
       [not found]     ` <177C44DDA5F18A0D.21730@lists.openembedded.org>
  0 siblings, 2 replies; 12+ messages in thread
From: Joshua Watt @ 2023-08-17 19:19 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: bitbake-devel

Alexandre,

I'm not sure that's quite correct; I was able to run the test fine
locally, and that failure looks like a 502 error from the server,
which really shouldn't have anything to do with this patch?

On Tue, Aug 8, 2023 at 7:52 PM Alexandre Belloni
<alexandre.belloni@bootlin.com> wrote:
>
> Hello Joshua,
>
> I'm fairly sure this causes the following bitbake-selftest failure:
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/56/builds/2307/steps/11/logs/stdio
>
> On 04/08/2023 09:54:24-0600, Joshua Watt wrote:
> > If the clone target directory exists, but isn't the valid top level of a
> > bare git repo, it needs to be erased and re-cloned. One example of how
> > this can happen is if a clone creates the directory, but then fails to
> > actual clone and make it a git repository. This left-over directory can
> > be particularly problematic if the download directory is a descent of
> > some top layer git repo (e.g. the default with poky), as the commands
> > that operate on the remote later will then mangle the layers git
> > repository instead of the download git repo.
> >
> > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > ---
> >  bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
> >  1 file changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > index 2a3c06fe4e9..112bed294f0 100644
> > --- a/bitbake/lib/bb/fetch2/git.py
> > +++ b/bitbake/lib/bb/fetch2/git.py
> > @@ -65,6 +65,7 @@ import fnmatch
> >  import os
> >  import re
> >  import shlex
> > +import shutil
> >  import subprocess
> >  import tempfile
> >  import bb
> > @@ -365,8 +366,29 @@ class Git(FetchMethod):
> >                  runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> >          repourl = self._get_repo_url(ud)
> >
> > +        needs_clone = False
> > +        if os.path.exists(ud.clonedir):
> > +            # The directory may exist, but not be the top level of a bare git
> > +            # repository in which case it needs to be deleted and re-cloned.
> > +            try:
> > +                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
> > +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> > +            except bb.fetch2.FetchError as e:
> > +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> > +                needs_clone = True
> > +            else:
> > +                toplevel = output.rstrip()
> > +                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
> > +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> > +                    needs_clone = True
> > +
> > +            if needs_clone:
> > +                shutil.rmtree(ud.clonedir)
> > +        else:
> > +            needs_clone = True
> > +
> >          # If the repo still doesn't exist, fallback to cloning it
> > -        if not os.path.exists(ud.clonedir):
> > +        if needs_clone:
> >              # We do this since git will use a "-l" option automatically for local urls where possible,
> >              # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> >              if repourl.startswith("file://"):
> > --
> > 2.33.0
> >
>
> >
> > -=-=-=-=-=-=-=-=-=-=-=-
> > Links: You receive all messages sent to this group.
> > View/Reply Online (#14915): https://lists.openembedded.org/g/bitbake-devel/message/14915
> > Mute This Topic: https://lists.openembedded.org/mt/100548945/3617179
> > Group Owner: bitbake-devel+owner@lists.openembedded.org
> > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alexandre.belloni@bootlin.com]
> > -=-=-=-=-=-=-=-=-=-=-=-
> >
>
>
> --
> Alexandre Belloni, co-owner and COO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
  2023-08-17 19:19   ` Joshua Watt
@ 2023-08-17 20:14     ` Alexandre Belloni
       [not found]     ` <177C44DDA5F18A0D.21730@lists.openembedded.org>
  1 sibling, 0 replies; 12+ messages in thread
From: Alexandre Belloni @ 2023-08-17 20:14 UTC (permalink / raw)
  To: Joshua Watt; +Cc: bitbake-devel

On 17/08/2023 13:19:56-0600, Joshua Watt wrote:
> Alexandre,
> 
> I'm not sure that's quite correct; I was able to run the test fine
> locally, and that failure looks like a 502 error from the server,
> which really shouldn't have anything to do with this patch?
> 

I'm pretty sure you are right...

> On Tue, Aug 8, 2023 at 7:52 PM Alexandre Belloni
> <alexandre.belloni@bootlin.com> wrote:
> >
> > Hello Joshua,
> >
> > I'm fairly sure this causes the following bitbake-selftest failure:
> >
> > https://autobuilder.yoctoproject.org/typhoon/#/builders/56/builds/2307/steps/11/logs/stdio
> >
> > On 04/08/2023 09:54:24-0600, Joshua Watt wrote:
> > > If the clone target directory exists, but isn't the valid top level of a
> > > bare git repo, it needs to be erased and re-cloned. One example of how
> > > this can happen is if a clone creates the directory, but then fails to
> > > actual clone and make it a git repository. This left-over directory can
> > > be particularly problematic if the download directory is a descent of
> > > some top layer git repo (e.g. the default with poky), as the commands
> > > that operate on the remote later will then mangle the layers git
> > > repository instead of the download git repo.
> > >
> > > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > > ---
> > >  bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
> > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > > index 2a3c06fe4e9..112bed294f0 100644
> > > --- a/bitbake/lib/bb/fetch2/git.py
> > > +++ b/bitbake/lib/bb/fetch2/git.py
> > > @@ -65,6 +65,7 @@ import fnmatch
> > >  import os
> > >  import re
> > >  import shlex
> > > +import shutil
> > >  import subprocess
> > >  import tempfile
> > >  import bb
> > > @@ -365,8 +366,29 @@ class Git(FetchMethod):
> > >                  runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> > >          repourl = self._get_repo_url(ud)
> > >
> > > +        needs_clone = False
> > > +        if os.path.exists(ud.clonedir):
> > > +            # The directory may exist, but not be the top level of a bare git
> > > +            # repository in which case it needs to be deleted and re-cloned.
> > > +            try:
> > > +                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
> > > +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> > > +            except bb.fetch2.FetchError as e:
> > > +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> > > +                needs_clone = True
> > > +            else:
> > > +                toplevel = output.rstrip()
> > > +                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
> > > +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> > > +                    needs_clone = True
> > > +
> > > +            if needs_clone:
> > > +                shutil.rmtree(ud.clonedir)
> > > +        else:
> > > +            needs_clone = True
> > > +
> > >          # If the repo still doesn't exist, fallback to cloning it
> > > -        if not os.path.exists(ud.clonedir):
> > > +        if needs_clone:
> > >              # We do this since git will use a "-l" option automatically for local urls where possible,
> > >              # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> > >              if repourl.startswith("file://"):
> > > --
> > > 2.33.0
> > >
> >
> > >
> > > -=-=-=-=-=-=-=-=-=-=-=-
> > > Links: You receive all messages sent to this group.
> > > View/Reply Online (#14915): https://lists.openembedded.org/g/bitbake-devel/message/14915
> > > Mute This Topic: https://lists.openembedded.org/mt/100548945/3617179
> > > Group Owner: bitbake-devel+owner@lists.openembedded.org
> > > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alexandre.belloni@bootlin.com]
> > > -=-=-=-=-=-=-=-=-=-=-=-
> > >
> >
> >
> > --
> > Alexandre Belloni, co-owner and COO, Bootlin
> > Embedded Linux and Kernel engineering
> > https://bootlin.com

-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
       [not found]     ` <177C44DDA5F18A0D.21730@lists.openembedded.org>
@ 2023-08-19 12:21       ` Alexandre Belloni
  2023-08-24 19:54         ` Joshua Watt
  0 siblings, 1 reply; 12+ messages in thread
From: Alexandre Belloni @ 2023-08-19 12:21 UTC (permalink / raw)
  To: Joshua Watt, bitbake-devel

This was because I didn't point to the correct build error:

======================================================================
ERROR: test_mirror_commit_exists (bb.tests.fetch.FetchPremirroronlyLocalTest.test_mirror_commit_exists)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/tests/fetch.py", line 3075, in test_mirror_commit_exists
    fetcher.download()
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1746, in download
    done = m.try_mirrors(self, ud, self.d, mirrors)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1612, in try_mirrors
    return bool(try_mirrors(fetch, d, urldata, mirrors, check))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1136, in try_mirrors
    ret = try_mirror_url(fetch, origud, uds[index], ld, check)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1065, in try_mirror_url
    origud.method.download(origud, ld)
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/git.py", line 401, in download
    bb.fetch2.check_network_access(d, clone_cmd, ud.url)
  File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 944, in check_network_access
    raise NetworkAccess(url, info)
bb.fetch2.NetworkAccess: Network access disabled through BB_NO_NETWORK (or set indirectly due to use of BB_FETCH_PREMIRRORONLY) but access requested with command LANG=C git -c gc.autoDetach=false -c core.pager=cat clone --bare --mirror https://git.fake.repo/bitbake /tmp/bitbake-fetch-xeon98ff/download/git2/git.fake.repo.bitbake --progress (for url git://git.fake.repo/bitbake;branch=master;protocol=https)


https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/5619/steps/11/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5577/steps/11/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/127/builds/1935/steps/11/logs/stdio
https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/5653/steps/11/logs/stdio


On 17/08/2023 22:14:34+0200, Alexandre Belloni via lists.openembedded.org wrote:
> On 17/08/2023 13:19:56-0600, Joshua Watt wrote:
> > Alexandre,
> > 
> > I'm not sure that's quite correct; I was able to run the test fine
> > locally, and that failure looks like a 502 error from the server,
> > which really shouldn't have anything to do with this patch?
> > 
> 
> I'm pretty sure you are right...
> 
> > On Tue, Aug 8, 2023 at 7:52 PM Alexandre Belloni
> > <alexandre.belloni@bootlin.com> wrote:
> > >
> > > Hello Joshua,
> > >
> > > I'm fairly sure this causes the following bitbake-selftest failure:
> > >
> > > https://autobuilder.yoctoproject.org/typhoon/#/builders/56/builds/2307/steps/11/logs/stdio
> > >
> > > On 04/08/2023 09:54:24-0600, Joshua Watt wrote:
> > > > If the clone target directory exists, but isn't the valid top level of a
> > > > bare git repo, it needs to be erased and re-cloned. One example of how
> > > > this can happen is if a clone creates the directory, but then fails to
> > > > actual clone and make it a git repository. This left-over directory can
> > > > be particularly problematic if the download directory is a descent of
> > > > some top layer git repo (e.g. the default with poky), as the commands
> > > > that operate on the remote later will then mangle the layers git
> > > > repository instead of the download git repo.
> > > >
> > > > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > > > ---
> > > >  bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
> > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > > > index 2a3c06fe4e9..112bed294f0 100644
> > > > --- a/bitbake/lib/bb/fetch2/git.py
> > > > +++ b/bitbake/lib/bb/fetch2/git.py
> > > > @@ -65,6 +65,7 @@ import fnmatch
> > > >  import os
> > > >  import re
> > > >  import shlex
> > > > +import shutil
> > > >  import subprocess
> > > >  import tempfile
> > > >  import bb
> > > > @@ -365,8 +366,29 @@ class Git(FetchMethod):
> > > >                  runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> > > >          repourl = self._get_repo_url(ud)
> > > >
> > > > +        needs_clone = False
> > > > +        if os.path.exists(ud.clonedir):
> > > > +            # The directory may exist, but not be the top level of a bare git
> > > > +            # repository in which case it needs to be deleted and re-cloned.
> > > > +            try:
> > > > +                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
> > > > +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> > > > +            except bb.fetch2.FetchError as e:
> > > > +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> > > > +                needs_clone = True
> > > > +            else:
> > > > +                toplevel = output.rstrip()
> > > > +                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
> > > > +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> > > > +                    needs_clone = True
> > > > +
> > > > +            if needs_clone:
> > > > +                shutil.rmtree(ud.clonedir)
> > > > +        else:
> > > > +            needs_clone = True
> > > > +
> > > >          # If the repo still doesn't exist, fallback to cloning it
> > > > -        if not os.path.exists(ud.clonedir):
> > > > +        if needs_clone:
> > > >              # We do this since git will use a "-l" option automatically for local urls where possible,
> > > >              # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> > > >              if repourl.startswith("file://"):
> > > > --
> > > > 2.33.0
> > > >
> > >
> > > >
> > > > 
> > > >
> > >
> > >
> > > --
> > > Alexandre Belloni, co-owner and COO, Bootlin
> > > Embedded Linux and Kernel engineering
> > > https://bootlin.com
> 
> -- 
> Alexandre Belloni, co-owner and COO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com

> 
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#14952): https://lists.openembedded.org/g/bitbake-devel/message/14952
> Mute This Topic: https://lists.openembedded.org/mt/100548945/3617179
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alexandre.belloni@bootlin.com]
> -=-=-=-=-=-=-=-=-=-=-=-
> 


-- 
Alexandre Belloni, co-owner and COO, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
  2023-08-04 15:54 [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo Joshua Watt
  2023-08-09  1:52 ` Alexandre Belloni
@ 2023-08-24 19:53 ` Joshua Watt
  2023-08-25  9:18   ` Paulo Neves
       [not found]   ` <177E95B06B95EDA3.23833@lists.openembedded.org>
  1 sibling, 2 replies; 12+ messages in thread
From: Joshua Watt @ 2023-08-24 19:53 UTC (permalink / raw)
  To: bitbake-devel; +Cc: Joshua Watt

If the clone target directory exists and is a valid git repo, but the
git directory is not a child, it needs to be erased and re-cloned. One
example of how this can happen is if a clone creates the directory, but
then fails to actual clone and make it a git repository. This left-over
directory can be particularly problematic if the download directory is a
descent of some top layer git repo (e.g. the default with poky), as the
commands that operate on the remote later will then mangle the layers
git repository instead of the download git repo.

Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
---
 bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
index 2a3c06fe4e9..33895e09b29 100644
--- a/bitbake/lib/bb/fetch2/git.py
+++ b/bitbake/lib/bb/fetch2/git.py
@@ -65,6 +65,7 @@ import fnmatch
 import os
 import re
 import shlex
+import shutil
 import subprocess
 import tempfile
 import bb
@@ -365,8 +366,35 @@ class Git(FetchMethod):
                 runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
         repourl = self._get_repo_url(ud)
 
+        needs_clone = False
+        if os.path.exists(ud.clonedir):
+            # The directory may exist, but not be the top level of a bare git
+            # repository in which case it needs to be deleted and re-cloned.
+            try:
+                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
+                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
+            except bb.fetch2.FetchError as e:
+                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
+                needs_clone = True
+            else:
+                toplevel = os.path.abspath(output.rstrip())
+                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
+                # The top level Git directory must either be the clone directory
+                # or a child of the clone directory. Any ancestor directory of
+                # the clone directory is not valid as the Git directory (and
+                # probably belongs to some other unrelated repository), so a
+                # clone is required
+                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
+                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
+                    needs_clone = True
+
+            if needs_clone:
+                shutil.rmtree(ud.clonedir)
+        else:
+            needs_clone = True
+
         # If the repo still doesn't exist, fallback to cloning it
-        if not os.path.exists(ud.clonedir):
+        if needs_clone:
             # We do this since git will use a "-l" option automatically for local urls where possible,
             # but it doesn't work when git/objects is a symlink, only works when it is a directory.
             if repourl.startswith("file://"):
-- 
2.33.0



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo
  2023-08-19 12:21       ` Alexandre Belloni
@ 2023-08-24 19:54         ` Joshua Watt
  0 siblings, 0 replies; 12+ messages in thread
From: Joshua Watt @ 2023-08-24 19:54 UTC (permalink / raw)
  To: Alexandre Belloni; +Cc: bitbake-devel

On Sat, Aug 19, 2023 at 6:21 AM Alexandre Belloni
<alexandre.belloni@bootlin.com> wrote:
>
> This was because I didn't point to the correct build error:

Yep, fixed in V2 (and I validated that the tests pass locally also)

>
> ======================================================================
> ERROR: test_mirror_commit_exists (bb.tests.fetch.FetchPremirroronlyLocalTest.test_mirror_commit_exists)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/tests/fetch.py", line 3075, in test_mirror_commit_exists
>     fetcher.download()
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1746, in download
>     done = m.try_mirrors(self, ud, self.d, mirrors)
>            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1612, in try_mirrors
>     return bool(try_mirrors(fetch, d, urldata, mirrors, check))
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1136, in try_mirrors
>     ret = try_mirror_url(fetch, origud, uds[index], ld, check)
>           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 1065, in try_mirror_url
>     origud.method.download(origud, ld)
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/git.py", line 401, in download
>     bb.fetch2.check_network_access(d, clone_cmd, ud.url)
>   File "/home/pokybuild/yocto-worker/oe-selftest-fedora/build/bitbake/lib/bb/fetch2/__init__.py", line 944, in check_network_access
>     raise NetworkAccess(url, info)
> bb.fetch2.NetworkAccess: Network access disabled through BB_NO_NETWORK (or set indirectly due to use of BB_FETCH_PREMIRRORONLY) but access requested with command LANG=C git -c gc.autoDetach=false -c core.pager=cat clone --bare --mirror https://git.fake.repo/bitbake /tmp/bitbake-fetch-xeon98ff/download/git2/git.fake.repo.bitbake --progress (for url git://git.fake.repo/bitbake;branch=master;protocol=https)
>
>
> https://autobuilder.yoctoproject.org/typhoon/#/builders/86/builds/5619/steps/11/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/80/builds/5577/steps/11/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/127/builds/1935/steps/11/logs/stdio
> https://autobuilder.yoctoproject.org/typhoon/#/builders/87/builds/5653/steps/11/logs/stdio
>
>
> On 17/08/2023 22:14:34+0200, Alexandre Belloni via lists.openembedded.org wrote:
> > On 17/08/2023 13:19:56-0600, Joshua Watt wrote:
> > > Alexandre,
> > >
> > > I'm not sure that's quite correct; I was able to run the test fine
> > > locally, and that failure looks like a 502 error from the server,
> > > which really shouldn't have anything to do with this patch?
> > >
> >
> > I'm pretty sure you are right...
> >
> > > On Tue, Aug 8, 2023 at 7:52 PM Alexandre Belloni
> > > <alexandre.belloni@bootlin.com> wrote:
> > > >
> > > > Hello Joshua,
> > > >
> > > > I'm fairly sure this causes the following bitbake-selftest failure:
> > > >
> > > > https://autobuilder.yoctoproject.org/typhoon/#/builders/56/builds/2307/steps/11/logs/stdio
> > > >
> > > > On 04/08/2023 09:54:24-0600, Joshua Watt wrote:
> > > > > If the clone target directory exists, but isn't the valid top level of a
> > > > > bare git repo, it needs to be erased and re-cloned. One example of how
> > > > > this can happen is if a clone creates the directory, but then fails to
> > > > > actual clone and make it a git repository. This left-over directory can
> > > > > be particularly problematic if the download directory is a descent of
> > > > > some top layer git repo (e.g. the default with poky), as the commands
> > > > > that operate on the remote later will then mangle the layers git
> > > > > repository instead of the download git repo.
> > > > >
> > > > > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > > > > ---
> > > > >  bitbake/lib/bb/fetch2/git.py | 24 +++++++++++++++++++++++-
> > > > >  1 file changed, 23 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > > > > index 2a3c06fe4e9..112bed294f0 100644
> > > > > --- a/bitbake/lib/bb/fetch2/git.py
> > > > > +++ b/bitbake/lib/bb/fetch2/git.py
> > > > > @@ -65,6 +65,7 @@ import fnmatch
> > > > >  import os
> > > > >  import re
> > > > >  import shlex
> > > > > +import shutil
> > > > >  import subprocess
> > > > >  import tempfile
> > > > >  import bb
> > > > > @@ -365,8 +366,29 @@ class Git(FetchMethod):
> > > > >                  runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> > > > >          repourl = self._get_repo_url(ud)
> > > > >
> > > > > +        needs_clone = False
> > > > > +        if os.path.exists(ud.clonedir):
> > > > > +            # The directory may exist, but not be the top level of a bare git
> > > > > +            # repository in which case it needs to be deleted and re-cloned.
> > > > > +            try:
> > > > > +                # Since clones are bare, use --absolute-git-dir instead of --show-toplevel
> > > > > +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> > > > > +            except bb.fetch2.FetchError as e:
> > > > > +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> > > > > +                needs_clone = True
> > > > > +            else:
> > > > > +                toplevel = output.rstrip()
> > > > > +                if os.path.abspath(toplevel) != os.path.abspath(ud.clonedir):
> > > > > +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> > > > > +                    needs_clone = True
> > > > > +
> > > > > +            if needs_clone:
> > > > > +                shutil.rmtree(ud.clonedir)
> > > > > +        else:
> > > > > +            needs_clone = True
> > > > > +
> > > > >          # If the repo still doesn't exist, fallback to cloning it
> > > > > -        if not os.path.exists(ud.clonedir):
> > > > > +        if needs_clone:
> > > > >              # We do this since git will use a "-l" option automatically for local urls where possible,
> > > > >              # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> > > > >              if repourl.startswith("file://"):
> > > > > --
> > > > > 2.33.0
> > > > >
> > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Alexandre Belloni, co-owner and COO, Bootlin
> > > > Embedded Linux and Kernel engineering
> > > > https://bootlin.com
> >
> > --
> > Alexandre Belloni, co-owner and COO, Bootlin
> > Embedded Linux and Kernel engineering
> > https://bootlin.com
>
> >
> > -=-=-=-=-=-=-=-=-=-=-=-
> > Links: You receive all messages sent to this group.
> > View/Reply Online (#14952): https://lists.openembedded.org/g/bitbake-devel/message/14952
> > Mute This Topic: https://lists.openembedded.org/mt/100548945/3617179
> > Group Owner: bitbake-devel+owner@lists.openembedded.org
> > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [alexandre.belloni@bootlin.com]
> > -=-=-=-=-=-=-=-=-=-=-=-
> >
>
>
> --
> Alexandre Belloni, co-owner and COO, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
  2023-08-24 19:53 ` [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a " Joshua Watt
@ 2023-08-25  9:18   ` Paulo Neves
  2023-08-25 14:19     ` Joshua Watt
       [not found]   ` <177E95B06B95EDA3.23833@lists.openembedded.org>
  1 sibling, 1 reply; 12+ messages in thread
From: Paulo Neves @ 2023-08-25  9:18 UTC (permalink / raw)
  To: Joshua Watt, bitbake-devel

On 24/08/2023 21:53, Joshua Watt wrote:
> If the clone target directory exists and is a valid git repo, but the
> git directory is not a child, it needs to be erased and re-cloned. One
> example of how this can happen is if a clone creates the directory, but
> then fails to actual clone and make it a git repository. This left-over
> directory can be particularly problematic if the download directory is a
> descent of some top layer git repo (e.g. the default with poky), as the
> commands that operate on the remote later will then mangle the layers
> git repository instead of the download git repo.
>
> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> ---
>   bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
>   1 file changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> index 2a3c06fe4e9..33895e09b29 100644
> --- a/bitbake/lib/bb/fetch2/git.py
> +++ b/bitbake/lib/bb/fetch2/git.py
> @@ -65,6 +65,7 @@ import fnmatch
>   import os
>   import re
>   import shlex
> +import shutil
>   import subprocess
>   import tempfile
>   import bb
> @@ -365,8 +366,35 @@ class Git(FetchMethod):
>                   runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
>           repourl = self._get_repo_url(ud)
>
> +        needs_clone = False
> +        if os.path.exists(ud.clonedir):
> +            # The directory may exist, but not be the top level of a bare git
> +            # repository in which case it needs to be deleted and re-cloned.
> +            try:
> +                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
> +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> +            except bb.fetch2.FetchError as e:
> +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> +                needs_clone = True
> +            else:
> +                toplevel = os.path.abspath(output.rstrip())
> +                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
> +                # The top level Git directory must either be the clone directory
> +                # or a child of the clone directory. Any ancestor directory of
> +                # the clone directory is not valid as the Git directory (and
> +                # probably belongs to some other unrelated repository), so a
> +                # clone is required
> +                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
> +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> +                    needs_clone = True
> +
> +            if needs_clone:
> +                shutil.rmtree(ud.clonedir)
> +        else:
> +            needs_clone = True
> +
>           # If the repo still doesn't exist, fallback to cloning it
> -        if not os.path.exists(ud.clonedir):
> +        if needs_clone:
>               # We do this since git will use a "-l" option automatically for local urls where possible,
>               # but it doesn't work when git/objects is a symlink, only works when it is a directory.
>               if repourl.startswith("file://"):
> --
> 2.33.0
>
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#14979): https://lists.openembedded.org/g/bitbake-devel/message/14979
> Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>
Is it possible to factor the try/except else to a function? Then we 
would not need the else, and would just return immediately and have a 
self contained code, which in my opinion is easier to read. I also think 
f-string would be more readable for runfetchcmd. f-string are available 
since python 3.6.

Can you share the rationale for the else clause? I do not see anything 
inside the else clause that might raise a FetchError, thus the else 
clause is redundant.

Another change I suggest is to move the rmtree inside the usual 
need_clone and ignore the error if the path does not exist. Removes 
another conditional.

I also think using needs_clone to assert whether the rmtree should be 
done is confusing A different variable should be used or use a 
function's return value. All the code inside if 
os.path.exists(ud.clonedir): deals with the need for removal and as a 
side effect will trigger the clone in another part of the code.





^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
       [not found]   ` <177E95B06B95EDA3.23833@lists.openembedded.org>
@ 2023-08-25  9:33     ` Paulo Neves
  0 siblings, 0 replies; 12+ messages in thread
From: Paulo Neves @ 2023-08-25  9:33 UTC (permalink / raw)
  To: Joshua Watt, bitbake-devel

[-- Attachment #1: Type: text/plain, Size: 5057 bytes --]

On 25/08/2023 11:18, Paulo Neves wrote:

> On 24/08/2023 21:53, Joshua Watt wrote:
>
>> If the clone target directory exists and is a valid git repo, but the
>> git directory is not a child, it needs to be erased and re-cloned. One
>> example of how this can happen is if a clone creates the directory, but
>> then fails to actual clone and make it a git repository. This left-over
>> directory can be particularly problematic if the download directory is a
>> descent of some top layer git repo (e.g. the default with poky), as the
>> commands that operate on the remote later will then mangle the layers
>> git repository instead of the download git repo.
>>
>> Signed-off-by: Joshua Watt
>> [<JPEWhacker@gmail.com>](mailto:JPEWhacker@gmail.com)
>> ---
>>   bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
>>   1 file changed, 29 insertions(+), 1 deletion(-)
>>
>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>> index 2a3c06fe4e9..33895e09b29 100644
>> --- a/bitbake/lib/bb/fetch2/git.py
>> +++ b/bitbake/lib/bb/fetch2/git.py
>> @@ -65,6 +65,7 @@ import fnmatch
>>   import os
>>   import re
>>   import shlex
>> +import shutil
>>   import subprocess
>>   import tempfile
>>   import bb
>> @@ -365,8 +366,35 @@ class Git(FetchMethod):
>>                   runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
>>           repourl = self._get_repo_url(ud)
>>
>> +        needs_clone = False
>> +        if os.path.exists(ud.clonedir):
>> +            # The directory may exist, but not be the top level of a bare git
>> +            # repository in which case it needs to be deleted and re-cloned.
>> +            try:
>> +                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
>> +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
>> +            except bb.fetch2.FetchError as e:
>> +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
>> +                needs_clone = True
>> +            else:
>> +                toplevel = os.path.abspath(output.rstrip())
>> +                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
>> +                # The top level Git directory must either be the clone directory
>> +                # or a child of the clone directory. Any ancestor directory of
>> +                # the clone directory is not valid as the Git directory (and
>> +                # probably belongs to some other unrelated repository), so a
>> +                # clone is required
>> +                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
>> +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
>> +                    needs_clone = True
>> +
>> +            if needs_clone:
>> +                shutil.rmtree(ud.clonedir)
>> +        else:
>> +            needs_clone = True
>> +
>>           # If the repo still doesn't exist, fallback to cloning it
>> -        if not os.path.exists(ud.clonedir):
>> +        if needs_clone:
>>               # We do this since git will use a "-l" option automatically for local urls where possible,
>>               # but it doesn't work when git/objects is a symlink, only works when it is a directory.
>>               if repourl.startswith(
>> ["file://"](file://)
>> ):
>> --
>> 2.33.0
>
> Is it possible to factor the try/except else to a function? Then we
> would not need the else, and would just return immediately and have a
> self contained code, which in my opinion is easier to read. I also think
> f-string would be more readable for runfetchcmd. f-string are available
> since python 3.6.
>
> Can you share the rationale for the else clause? I do not see anything
> inside the else clause that might raise a FetchError, thus the else
> clause is redundant.
>
> Another change I suggest is to move the rmtree inside the usual
> need_clone and ignore the error if the path does not exist. Removes
> another conditional.
>
> I also think using needs_clone to assert whether the rmtree should be
> done is confusing A different variable should be used or use a
> function's return value. All the code inside if
> os.path.exists(ud.clonedir): deals with the need for removal and as a
> side effect will trigger the clone in another part of the code.
>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#14982):
> https://lists.openembedded.org/g/bitbake-devel/message/14982
> Mute This Topic:
> https://lists.openembedded.org/mt/100942876/4454782
> Group Owner:
> bitbake-devel+owner@lists.openembedded.org
> Unsubscribe:
> https://lists.openembedded.org/g/bitbake-devel/unsub
> [
> paulo@myneves.com
> ]
> -=-=-=-=-=-=-=-=-=-=-=-

> Another change I suggest is to move the rmtree inside the usual
need_clone and ignore the error if the path does not exist. Removes
another conditional.

Disregard this comment. It is incorrect.

[-- Attachment #2: Type: text/html, Size: 6038 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
  2023-08-25  9:18   ` Paulo Neves
@ 2023-08-25 14:19     ` Joshua Watt
  2023-08-31 12:21       ` Paulo Neves
  0 siblings, 1 reply; 12+ messages in thread
From: Joshua Watt @ 2023-08-25 14:19 UTC (permalink / raw)
  To: Paulo Neves; +Cc: bitbake-devel

On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote:
>
> On 24/08/2023 21:53, Joshua Watt wrote:
> > If the clone target directory exists and is a valid git repo, but the
> > git directory is not a child, it needs to be erased and re-cloned. One
> > example of how this can happen is if a clone creates the directory, but
> > then fails to actual clone and make it a git repository. This left-over
> > directory can be particularly problematic if the download directory is a
> > descent of some top layer git repo (e.g. the default with poky), as the
> > commands that operate on the remote later will then mangle the layers
> > git repository instead of the download git repo.
> >
> > Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> > ---
> >   bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
> >   1 file changed, 29 insertions(+), 1 deletion(-)
> >
> > diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> > index 2a3c06fe4e9..33895e09b29 100644
> > --- a/bitbake/lib/bb/fetch2/git.py
> > +++ b/bitbake/lib/bb/fetch2/git.py
> > @@ -65,6 +65,7 @@ import fnmatch
> >   import os
> >   import re
> >   import shlex
> > +import shutil
> >   import subprocess
> >   import tempfile
> >   import bb
> > @@ -365,8 +366,35 @@ class Git(FetchMethod):
> >                   runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> >           repourl = self._get_repo_url(ud)
> >
> > +        needs_clone = False
> > +        if os.path.exists(ud.clonedir):
> > +            # The directory may exist, but not be the top level of a bare git
> > +            # repository in which case it needs to be deleted and re-cloned.
> > +            try:
> > +                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
> > +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> > +            except bb.fetch2.FetchError as e:
> > +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> > +                needs_clone = True
> > +            else:
> > +                toplevel = os.path.abspath(output.rstrip())
> > +                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
> > +                # The top level Git directory must either be the clone directory
> > +                # or a child of the clone directory. Any ancestor directory of
> > +                # the clone directory is not valid as the Git directory (and
> > +                # probably belongs to some other unrelated repository), so a
> > +                # clone is required
> > +                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
> > +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> > +                    needs_clone = True
> > +
> > +            if needs_clone:
> > +                shutil.rmtree(ud.clonedir)
> > +        else:
> > +            needs_clone = True
> > +
> >           # If the repo still doesn't exist, fallback to cloning it
> > -        if not os.path.exists(ud.clonedir):
> > +        if needs_clone:
> >               # We do this since git will use a "-l" option automatically for local urls where possible,
> >               # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> >               if repourl.startswith("file://"):
> > --
> > 2.33.0
> >
> >
> > -=-=-=-=-=-=-=-=-=-=-=-
> > Links: You receive all messages sent to this group.
> > View/Reply Online (#14979): https://lists.openembedded.org/g/bitbake-devel/message/14979
> > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782
> > Group Owner: bitbake-devel+owner@lists.openembedded.org
> > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com]
> > -=-=-=-=-=-=-=-=-=-=-=-
> >
> Is it possible to factor the try/except else to a function? Then we
> would not need the else, and would just return immediately and have a
> self contained code, which in my opinion is easier to read. I also think
> f-string would be more readable for runfetchcmd. f-string are available
> since python 3.6.
>
> Can you share the rationale for the else clause? I do not see anything
> inside the else clause that might raise a FetchError, thus the else
> clause is redundant.

The else clause is pretty much the entire reason for this patch in the
first place :)

The problem is that the fetcher assumes that the simple existence of
the download directory means it can can add a remote and clone there,
but that is not true. If your clone directory is a subdirectory of
another git repository, but not a git repository itself, bitbake will
(incorrectly) add the remote the higher level git repos and fetch
there, leaving it a mess.

More visually, if your layout is like this:

 poky <-- This is a git repository
   build
      downloads
          git2
            foo <-- This is NOT a git repository

The old code would see that the foo directory existed, see that it was
part of a git repo (poky!) add the remotes and start cloning objects
into the poky git repo, which is not at all what we want.

The else clause corrects this as it makes the fetcher check if "foo"
is not only a git repo, but that the it is the toplevel of a git repo
(or at least that it's --git-dir is not a parent of that directory).
If it's not, it deletes it because it's not a valid place to clone
things into anymore.

>
> Another change I suggest is to move the rmtree inside the usual
> need_clone and ignore the error if the path does not exist. Removes
> another conditional.
>
> I also think using needs_clone to assert whether the rmtree should be
> done is confusing A different variable should be used or use a
> function's return value. All the code inside if
> os.path.exists(ud.clonedir): deals with the need for removal and as a
> side effect will trigger the clone in another part of the code.
>
>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
  2023-08-25 14:19     ` Joshua Watt
@ 2023-08-31 12:21       ` Paulo Neves
  2023-08-31 14:27         ` Joshua Watt
  0 siblings, 1 reply; 12+ messages in thread
From: Paulo Neves @ 2023-08-31 12:21 UTC (permalink / raw)
  To: Joshua Watt; +Cc: bitbake-devel

On 25/08/2023 16:19, Joshua Watt wrote:
> On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote:
>> On 24/08/2023 21:53, Joshua Watt wrote:
>>> If the clone target directory exists and is a valid git repo, but the
>>> git directory is not a child, it needs to be erased and re-cloned. One
>>> example of how this can happen is if a clone creates the directory, but
>>> then fails to actual clone and make it a git repository. This left-over
>>> directory can be particularly problematic if the download directory is a
>>> descent of some top layer git repo (e.g. the default with poky), as the
>>> commands that operate on the remote later will then mangle the layers
>>> git repository instead of the download git repo.
>>>
>>> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
>>> ---
>>>    bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
>>>    1 file changed, 29 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
>>> index 2a3c06fe4e9..33895e09b29 100644
>>> --- a/bitbake/lib/bb/fetch2/git.py
>>> +++ b/bitbake/lib/bb/fetch2/git.py
>>> @@ -65,6 +65,7 @@ import fnmatch
>>>    import os
>>>    import re
>>>    import shlex
>>> +import shutil
>>>    import subprocess
>>>    import tempfile
>>>    import bb
>>> @@ -365,8 +366,35 @@ class Git(FetchMethod):
>>>                    runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
>>>            repourl = self._get_repo_url(ud)
>>>
>>> +        needs_clone = False
>>> +        if os.path.exists(ud.clonedir):
>>> +            # The directory may exist, but not be the top level of a bare git
>>> +            # repository in which case it needs to be deleted and re-cloned.
>>> +            try:
>>> +                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
>>> +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
>>> +            except bb.fetch2.FetchError as e:
>>> +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
>>> +                needs_clone = True
>>> +            else:
>>> +                toplevel = os.path.abspath(output.rstrip())
>>> +                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
>>> +                # The top level Git directory must either be the clone directory
>>> +                # or a child of the clone directory. Any ancestor directory of
>>> +                # the clone directory is not valid as the Git directory (and
>>> +                # probably belongs to some other unrelated repository), so a
>>> +                # clone is required
>>> +                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
>>> +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
>>> +                    needs_clone = True
>>> +
>>> +            if needs_clone:
>>> +                shutil.rmtree(ud.clonedir)
>>> +        else:
>>> +            needs_clone = True
>>> +
>>>            # If the repo still doesn't exist, fallback to cloning it
>>> -        if not os.path.exists(ud.clonedir):
>>> +        if needs_clone:
>>>                # We do this since git will use a "-l" option automatically for local urls where possible,
>>>                # but it doesn't work when git/objects is a symlink, only works when it is a directory.
>>>                if repourl.startswith("file://"):
>>> --
>>> 2.33.0
>>>
>>>
>>>
>>>
>> Is it possible to factor the try/except else to a function? Then we
>> would not need the else, and would just return immediately and have a
>> self contained code, which in my opinion is easier to read. I also think
>> f-string would be more readable for runfetchcmd. f-string are available
>> since python 3.6.
>>
>> Can you share the rationale for the else clause? I do not see anything
>> inside the else clause that might raise a FetchError, thus the else
>> clause is redundant.
> The else clause is pretty much the entire reason for this patch in the
> first place :)
>
> The problem is that the fetcher assumes that the simple existence of
> the download directory means it can can add a remote and clone there,
> but that is not true. If your clone directory is a subdirectory of
> another git repository, but not a git repository itself, bitbake will
> (incorrectly) add the remote the higher level git repos and fetch
> there, leaving it a mess.
>
> More visually, if your layout is like this:
>
>   poky <-- This is a git repository
>     build
>        downloads
>            git2
>              foo <-- This is NOT a git repository
>
> The old code would see that the foo directory existed, see that it was
> part of a git repo (poky!) add the remotes and start cloning objects
> into the poky git repo, which is not at all what we want.
>
> The else clause corrects this as it makes the fetcher check if "foo"
> is not only a git repo, but that the it is the toplevel of a git repo
> (or at least that it's --git-dir is not a parent of that directory).
> If it's not, it deletes it because it's not a valid place to clone
> things into anymore.

I understand the conceptual point of the patch and I think it is valid. 
What i mean is that you do not need the else clause. The code under the 
else clause can continue inside the try clause and not change the logic. 
The try-else idiom is very unusual and represents a meaning which is not 
needed here: That the code inside the else should not trigger the except 
clause.

>> Another change I suggest is to move the rmtree inside the usual
>> need_clone and ignore the error if the path does not exist. Removes
>> another conditional.
>>
>> I also think using needs_clone to assert whether the rmtree should be
>> done is confusing A different variable should be used or use a
>> function's return value. All the code inside if
>> os.path.exists(ud.clonedir): deals with the need for removal and as a
>> side effect will trigger the clone in another part of the code.
>>
>>
>>
> -=-=-=-=-=-=-=-=-=-=-=-
> Links: You receive all messages sent to this group.
> View/Reply Online (#14984): https://lists.openembedded.org/g/bitbake-devel/message/14984
> Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782
> Group Owner: bitbake-devel+owner@lists.openembedded.org
> Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com]
> -=-=-=-=-=-=-=-=-=-=-=-
>




^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a git repo
  2023-08-31 12:21       ` Paulo Neves
@ 2023-08-31 14:27         ` Joshua Watt
  0 siblings, 0 replies; 12+ messages in thread
From: Joshua Watt @ 2023-08-31 14:27 UTC (permalink / raw)
  To: Paulo Neves; +Cc: bitbake-devel

On Thu, Aug 31, 2023 at 6:21 AM Paulo Neves <paulo@myneves.com> wrote:
>
> On 25/08/2023 16:19, Joshua Watt wrote:
> > On Fri, Aug 25, 2023 at 3:18 AM Paulo Neves <paulo@myneves.com> wrote:
> >> On 24/08/2023 21:53, Joshua Watt wrote:
> >>> If the clone target directory exists and is a valid git repo, but the
> >>> git directory is not a child, it needs to be erased and re-cloned. One
> >>> example of how this can happen is if a clone creates the directory, but
> >>> then fails to actual clone and make it a git repository. This left-over
> >>> directory can be particularly problematic if the download directory is a
> >>> descent of some top layer git repo (e.g. the default with poky), as the
> >>> commands that operate on the remote later will then mangle the layers
> >>> git repository instead of the download git repo.
> >>>
> >>> Signed-off-by: Joshua Watt <JPEWhacker@gmail.com>
> >>> ---
> >>>    bitbake/lib/bb/fetch2/git.py | 30 +++++++++++++++++++++++++++++-
> >>>    1 file changed, 29 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/bitbake/lib/bb/fetch2/git.py b/bitbake/lib/bb/fetch2/git.py
> >>> index 2a3c06fe4e9..33895e09b29 100644
> >>> --- a/bitbake/lib/bb/fetch2/git.py
> >>> +++ b/bitbake/lib/bb/fetch2/git.py
> >>> @@ -65,6 +65,7 @@ import fnmatch
> >>>    import os
> >>>    import re
> >>>    import shlex
> >>> +import shutil
> >>>    import subprocess
> >>>    import tempfile
> >>>    import bb
> >>> @@ -365,8 +366,35 @@ class Git(FetchMethod):
> >>>                    runfetchcmd(fetch_cmd, d, workdir=ud.clonedir)
> >>>            repourl = self._get_repo_url(ud)
> >>>
> >>> +        needs_clone = False
> >>> +        if os.path.exists(ud.clonedir):
> >>> +            # The directory may exist, but not be the top level of a bare git
> >>> +            # repository in which case it needs to be deleted and re-cloned.
> >>> +            try:
> >>> +                # Since clones can be bare, use --absolute-git-dir instead of --show-toplevel
> >>> +                output = runfetchcmd("LANG=C %s rev-parse --absolute-git-dir" % ud.basecmd, d, workdir=ud.clonedir)
> >>> +            except bb.fetch2.FetchError as e:
> >>> +                logger.warning("Unable to get top level for %s (not a git directory?): %s", ud.clonedir, e)
> >>> +                needs_clone = True
> >>> +            else:
> >>> +                toplevel = os.path.abspath(output.rstrip())
> >>> +                abs_clonedir = os.path.abspath(ud.clonedir).rstrip('/')
> >>> +                # The top level Git directory must either be the clone directory
> >>> +                # or a child of the clone directory. Any ancestor directory of
> >>> +                # the clone directory is not valid as the Git directory (and
> >>> +                # probably belongs to some other unrelated repository), so a
> >>> +                # clone is required
> >>> +                if os.path.commonprefix([abs_clonedir, toplevel]) != abs_clonedir:
> >>> +                    logger.warning("Top level directory '%s' doesn't match expected '%s'. Re-cloning", toplevel, ud.clonedir)
> >>> +                    needs_clone = True
> >>> +
> >>> +            if needs_clone:
> >>> +                shutil.rmtree(ud.clonedir)
> >>> +        else:
> >>> +            needs_clone = True
> >>> +
> >>>            # If the repo still doesn't exist, fallback to cloning it
> >>> -        if not os.path.exists(ud.clonedir):
> >>> +        if needs_clone:
> >>>                # We do this since git will use a "-l" option automatically for local urls where possible,
> >>>                # but it doesn't work when git/objects is a symlink, only works when it is a directory.
> >>>                if repourl.startswith("file://"):
> >>> --
> >>> 2.33.0
> >>>
> >>>
> >>>
> >>>
> >> Is it possible to factor the try/except else to a function? Then we
> >> would not need the else, and would just return immediately and have a
> >> self contained code, which in my opinion is easier to read. I also think
> >> f-string would be more readable for runfetchcmd. f-string are available
> >> since python 3.6.
> >>
> >> Can you share the rationale for the else clause? I do not see anything
> >> inside the else clause that might raise a FetchError, thus the else
> >> clause is redundant.
> > The else clause is pretty much the entire reason for this patch in the
> > first place :)
> >
> > The problem is that the fetcher assumes that the simple existence of
> > the download directory means it can can add a remote and clone there,
> > but that is not true. If your clone directory is a subdirectory of
> > another git repository, but not a git repository itself, bitbake will
> > (incorrectly) add the remote the higher level git repos and fetch
> > there, leaving it a mess.
> >
> > More visually, if your layout is like this:
> >
> >   poky <-- This is a git repository
> >     build
> >        downloads
> >            git2
> >              foo <-- This is NOT a git repository
> >
> > The old code would see that the foo directory existed, see that it was
> > part of a git repo (poky!) add the remotes and start cloning objects
> > into the poky git repo, which is not at all what we want.
> >
> > The else clause corrects this as it makes the fetcher check if "foo"
> > is not only a git repo, but that the it is the toplevel of a git repo
> > (or at least that it's --git-dir is not a parent of that directory).
> > If it's not, it deletes it because it's not a valid place to clone
> > things into anymore.
>
> I understand the conceptual point of the patch and I think it is valid.
> What i mean is that you do not need the else clause. The code under the
> else clause can continue inside the try clause and not change the logic.
> The try-else idiom is very unusual and represents a meaning which is not
> needed here: That the code inside the else should not trigger the except
> clause.

Ah, yes. I see what you are saying now

>
> >> Another change I suggest is to move the rmtree inside the usual
> >> need_clone and ignore the error if the path does not exist. Removes
> >> another conditional.
> >>
> >> I also think using needs_clone to assert whether the rmtree should be
> >> done is confusing A different variable should be used or use a
> >> function's return value. All the code inside if
> >> os.path.exists(ud.clonedir): deals with the need for removal and as a
> >> side effect will trigger the clone in another part of the code.
> >>
> >>
> >>
> > -=-=-=-=-=-=-=-=-=-=-=-
> > Links: You receive all messages sent to this group.
> > View/Reply Online (#14984): https://lists.openembedded.org/g/bitbake-devel/message/14984
> > Mute This Topic: https://lists.openembedded.org/mt/100942876/4454782
> > Group Owner: bitbake-devel+owner@lists.openembedded.org
> > Unsubscribe: https://lists.openembedded.org/g/bitbake-devel/unsub [paulo@myneves.com]
> > -=-=-=-=-=-=-=-=-=-=-=-
> >
>
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2023-08-31 14:28 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-04 15:54 [bitbake-devel][PATCH] fetch2: git: Check if directory is a bare git repo Joshua Watt
2023-08-09  1:52 ` Alexandre Belloni
2023-08-17 19:19   ` Joshua Watt
2023-08-17 20:14     ` Alexandre Belloni
     [not found]     ` <177C44DDA5F18A0D.21730@lists.openembedded.org>
2023-08-19 12:21       ` Alexandre Belloni
2023-08-24 19:54         ` Joshua Watt
2023-08-24 19:53 ` [bitbake-devel][PATCH v2] fetch2: git: Check if clone directory is a " Joshua Watt
2023-08-25  9:18   ` Paulo Neves
2023-08-25 14:19     ` Joshua Watt
2023-08-31 12:21       ` Paulo Neves
2023-08-31 14:27         ` Joshua Watt
     [not found]   ` <177E95B06B95EDA3.23833@lists.openembedded.org>
2023-08-25  9:33     ` Paulo Neves

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.