git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-02 16:48   ` Daniel Barkalow
@ 2008-03-03  9:04     ` Johan Herland
  2008-03-03 16:36       ` Daniel Barkalow
  2008-03-03 18:21       ` Daniel Barkalow
  0 siblings, 2 replies; 9+ messages in thread
From: Johan Herland @ 2008-03-03  9:04 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar

The first test in this series tests "git clone -l -s --reference B A C",
where repo B is a superset of repo A (A has one commit, B has the same
commit plus another). In this case, all objects to be cloned are already
present in B.

However, we should also test the case where the "--reference" repo is a
_subset_ of the source repo (e.g. "git clone -l -s --reference A B C"),
i.e. some objects are not available in the "--reference" repo, and will
have to be found in the source repo.

Signed-off-by: Johan Herland <johan@herland.net>
---

On Sunday 02 March 2008, Daniel Barkalow wrote:
> In any case, I've got my current version at
> 
> git://iabervon.org/~barkalow/git.git builtin-clone

Thanks, it already looks much better than the initial version. :)

However, this added test currently fails for me with the following output:

repo is /home/johan/git/git/t/trash/B/.git
dir is E
Initialize E/.git
Initialized empty Git repository in E/.git/
Okay
Wrote /home/johan/git/git/t/trash/A/.git/objects
 to E/.git/objects/info/alternates
Wrote /home/johan/git/git/t/trash/B/.git/objects
 to E/.git/objects/info/alternates
Get for /home/johan/git/git/t/trash/B/.git
error: Trying to write ref refs/remotes/origin/master with nonexistant object 276cf9e94287a7c4e6f79b2724460e9650fa4871
fatal: Cannot update the ref 'refs/remotes/origin/master'.
Remove junk E/.git
Remove junk E

The same test work well with git-clone.sh.
Not sure what's going on here, yet, but I thought I'd give you a heads up.

...Johan


 t/t5700-clone-reference.sh |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/t/t5700-clone-reference.sh b/t/t5700-clone-reference.sh
index b6a5486..d318780 100755
--- a/t/t5700-clone-reference.sh
+++ b/t/t5700-clone-reference.sh
@@ -113,4 +113,9 @@ diff expected current'
 
 cd "$base_dir"
 
+test_expect_success 'cloning with reference being subset of source (-l -s)' \
+'git clone -l -s --reference A B E'
+
+cd "$base_dir"
+
 test_done
-- 
1.5.4.3.328.gcaed


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-03  9:04     ` [PATCH] Add test for cloning with "--reference" repo being a subset of source repo Johan Herland
@ 2008-03-03 16:36       ` Daniel Barkalow
  2008-03-03 18:21       ` Daniel Barkalow
  1 sibling, 0 replies; 9+ messages in thread
From: Daniel Barkalow @ 2008-03-03 16:36 UTC (permalink / raw)
  To: Johan Herland
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar

On Mon, 3 Mar 2008, Johan Herland wrote:

> However, we should also test the case where the "--reference" repo is a
> _subset_ of the source repo (e.g. "git clone -l -s --reference A B C"),
> i.e. some objects are not available in the "--reference" repo, and will
> have to be found in the source repo.
> 
> However, this added test currently fails for me with the following output:
> 
> repo is /home/johan/git/git/t/trash/B/.git
> dir is E
> Initialize E/.git
> Initialized empty Git repository in E/.git/
> Okay
> Wrote /home/johan/git/git/t/trash/A/.git/objects
>  to E/.git/objects/info/alternates
> Wrote /home/johan/git/git/t/trash/B/.git/objects
>  to E/.git/objects/info/alternates
> Get for /home/johan/git/git/t/trash/B/.git
> error: Trying to write ref refs/remotes/origin/master with nonexistant object 276cf9e94287a7c4e6f79b2724460e9650fa4871
> fatal: Cannot update the ref 'refs/remotes/origin/master'.
> Remove junk E/.git
> Remove junk E
> 
> The same test work well with git-clone.sh.
> Not sure what's going on here, yet, but I thought I'd give you a heads up.

Thanks for the report; I haven't really gone through the local clone 
stuff, and I've altered the use of chdir at various points, so it's quite 
possible that it's not right at all for some cases.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-03  9:04     ` [PATCH] Add test for cloning with "--reference" repo being a subset of source repo Johan Herland
  2008-03-03 16:36       ` Daniel Barkalow
@ 2008-03-03 18:21       ` Daniel Barkalow
  2008-03-04  3:02         ` Johan Herland
  1 sibling, 1 reply; 9+ messages in thread
From: Daniel Barkalow @ 2008-03-03 18:21 UTC (permalink / raw)
  To: Johan Herland
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar

On Mon, 3 Mar 2008, Johan Herland wrote:

> Not sure what's going on here, yet, but I thought I'd give you a heads up.

I figured it out, and pushed out a fix; it was doing everything correctly, 
but it wrote to the alternates files after the library had read that file, 
so it then didn't notice that it actually had the objects that are in the 
second alternate repository.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-03 18:21       ` Daniel Barkalow
@ 2008-03-04  3:02         ` Johan Herland
  2008-03-04 23:10           ` Daniel Barkalow
  0 siblings, 1 reply; 9+ messages in thread
From: Johan Herland @ 2008-03-04  3:02 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar, Junio C Hamano, Linus Torvalds

On Monday 03 March 2008, Daniel Barkalow wrote:
> On Mon, 3 Mar 2008, Johan Herland wrote:
> 
> > Not sure what's going on here, yet, but I thought I'd give you a heads up.
> 
> I figured it out, and pushed out a fix; it was doing everything correctly, 
> but it wrote to the alternates files after the library had read that file, 
> so it then didn't notice that it actually had the objects that are in the 
> second alternate repository.

Thanks. After looking a bit more at the original test repo where I found
this issue, I discovered another, similar bug. This one seems ugly; brace
yourself:

In some cases (I'm not exactly sure of all the preconditions) when
cloning with "--reference", it seems git tries to access a loose object
in the "--reference" repo instead of in the cloned repo, even if that
object is already present in the cloned repo and _missing_ in the
"--reference" repo. The symptom is this error message:
    error: Trying to write ref $ref with nonexistant object $sha1

After playing around with this in gdb, it seems the problem is all the
way down in sha1_file_name() (sha1_file.c). This function is responsible
for generating the loose object filename for a given $sha1. It keeps a
static char *base which is initially set to the object directory name,
and then calls fill_sha1_path() to copy the rest of the object filename
into the following bytes. On subsequent calls, only the fill_sha1_path()
part is done, thereby reusing the base from the previous invocation.

What I observe is that this base is not reset after accessing loose
objects in the "--reference" repo. Thus, later when accessing objects in
the cloned repo, sha1_file_name() generates incorrect filenames (pointing
to the "--reference" repo instead of the cloned repo).

Of course, this often goes undetected since the "--reference" repo often
have the same loose objects as the clone.

Unfortunately (from a builtin git-clone's POV) this seems to be
symptomatic of a deeper problem in this part of the code: Using
function-static variables as caches only works as far as the cache
is in sync with reality. Especially when switching between multiple
repositories within the same process, it seems that several of these
variables are left with invalid data in them. This needs to be fixed,
if not only for now, then at least as part of the libification effort.

I'm not sure what is the best way of fixing this issue; my initial guess
is to move these function-static variables out to file-level, and make
sure they're properly reset whenever the appropriate context is changed
(typically when set_git_dir() is called, I guess).

Here are the function-static variables I immediately found in sha1_file.c
(there may be more, both in sha1_file.c and in other files):
- sha1_file_name(): static char *base
- sha1_pack_name(): static char *base
- sha1_pack_index_name(): static char *base
- find_pack_entry(): static struct packed_git *last_found
  (not sure about this one)

I will follow up this email with two patches, one adding the failing test,
and one providing a simple fix for that specific test (although very much
insufficient as a fix for the actual issue described above).


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-04  3:02         ` Johan Herland
@ 2008-03-04 23:10           ` Daniel Barkalow
  2008-03-05  0:24             ` Daniel Barkalow
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Barkalow @ 2008-03-04 23:10 UTC (permalink / raw)
  To: Johan Herland
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar, Junio C Hamano, Linus Torvalds

On Tue, 4 Mar 2008, Johan Herland wrote:

> On Monday 03 March 2008, Daniel Barkalow wrote:
> > On Mon, 3 Mar 2008, Johan Herland wrote:
> > 
> > > Not sure what's going on here, yet, but I thought I'd give you a heads up.
> > 
> > I figured it out, and pushed out a fix; it was doing everything correctly, 
> > but it wrote to the alternates files after the library had read that file, 
> > so it then didn't notice that it actually had the objects that are in the 
> > second alternate repository.
> 
> Thanks. After looking a bit more at the original test repo where I found
> this issue, I discovered another, similar bug. This one seems ugly; brace
> yourself:
> 
> In some cases (I'm not exactly sure of all the preconditions) when
> cloning with "--reference", it seems git tries to access a loose object
> in the "--reference" repo instead of in the cloned repo, even if that
> object is already present in the cloned repo and _missing_ in the
> "--reference" repo. The symptom is this error message:
>     error: Trying to write ref $ref with nonexistant object $sha1
> 
> After playing around with this in gdb, it seems the problem is all the
> way down in sha1_file_name() (sha1_file.c). This function is responsible
> for generating the loose object filename for a given $sha1. It keeps a
> static char *base which is initially set to the object directory name,
> and then calls fill_sha1_path() to copy the rest of the object filename
> into the following bytes. On subsequent calls, only the fill_sha1_path()
> part is done, thereby reusing the base from the previous invocation.
>
> What I observe is that this base is not reset after accessing loose
> objects in the "--reference" repo. Thus, later when accessing objects in
> the cloned repo, sha1_file_name() generates incorrect filenames (pointing
> to the "--reference" repo instead of the cloned repo).
> 
> Of course, this often goes undetected since the "--reference" repo often
> have the same loose objects as the clone.
> 
> Unfortunately (from a builtin git-clone's POV) this seems to be
> symptomatic of a deeper problem in this part of the code: Using
> function-static variables as caches only works as far as the cache
> is in sync with reality. Especially when switching between multiple
> repositories within the same process, it seems that several of these
> variables are left with invalid data in them. This needs to be fixed,
> if not only for now, then at least as part of the libification effort.
> 
> I'm not sure what is the best way of fixing this issue; my initial guess
> is to move these function-static variables out to file-level, and make
> sure they're properly reset whenever the appropriate context is changed
> (typically when set_git_dir() is called, I guess).

I think we should be able to avoid setting git_dir to anything other than 
the repo we're creating, which would avoid this problem for the present, 
although, as you say, it would be good to be able to switch around as 
instructed for libification purposes eventually.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-04 23:10           ` Daniel Barkalow
@ 2008-03-05  0:24             ` Daniel Barkalow
  2008-03-05 23:56               ` Johan Herland
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Barkalow @ 2008-03-05  0:24 UTC (permalink / raw)
  To: Johan Herland
  Cc: git, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar, Junio C Hamano, Linus Torvalds

On Tue, 4 Mar 2008, Daniel Barkalow wrote:

> I think we should be able to avoid setting git_dir to anything other than 
> the repo we're creating, which would avoid this problem for the present, 
> although, as you say, it would be good to be able to switch around as 
> instructed for libification purposes eventually.

Okay, stuff pushed out to not use git_dir to access the reference repo, 
and an additional test that requires that we actually note that we have 
the refs in the reference repository (because otherwise we could pass all 
the tests by just making --reference useless, but that's obviously no 
good).

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-03-05  0:24             ` Daniel Barkalow
@ 2008-03-05 23:56               ` Johan Herland
  0 siblings, 0 replies; 9+ messages in thread
From: Johan Herland @ 2008-03-05 23:56 UTC (permalink / raw)
  To: git
  Cc: Daniel Barkalow, Johannes Schindelin, Kristian Høgsberg,
	Santi Béjar, Junio C Hamano, Linus Torvalds

On Wednesday 05 March 2008, Daniel Barkalow wrote:
> On Tue, 4 Mar 2008, Daniel Barkalow wrote:
> > I think we should be able to avoid setting git_dir to anything other than 
> > the repo we're creating, which would avoid this problem for the present, 
> > although, as you say, it would be good to be able to switch around as 
> > instructed for libification purposes eventually.
> 
> Okay, stuff pushed out to not use git_dir to access the reference repo, 
> and an additional test that requires that we actually note that we have 
> the refs in the reference repository (because otherwise we could pass all 
> the tests by just making --reference useless, but that's obviously no 
> good).

Thanks. Nice work. The testsuite passes, and it all looks good from here. :)


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
@ 2008-05-22 22:03 Daniel Barkalow
  2008-05-22 22:31 ` Johan Herland
  0 siblings, 1 reply; 9+ messages in thread
From: Daniel Barkalow @ 2008-05-22 22:03 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

From: Johan Herland <johan@herland.net>

The first test in this series tests "git clone -l -s --reference B A C",
where repo B is a superset of repo A (A has one commit, B has the same
commit plus another). In this case, all objects to be cloned are already
present in B.

However, we should also test the case where the "--reference" repo is a
_subset_ of the source repo (e.g. "git clone -l -s --reference A B C"),
i.e. some objects are not available in the "--reference" repo, and will
have to be found in the source repo.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---
 t/t5700-clone-reference.sh |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/t/t5700-clone-reference.sh b/t/t5700-clone-reference.sh
index b6a5486..d318780 100755
--- a/t/t5700-clone-reference.sh
+++ b/t/t5700-clone-reference.sh
@@ -113,4 +113,9 @@ diff expected current'
 
 cd "$base_dir"
 
+test_expect_success 'cloning with reference being subset of source (-l -s)' \
+'git clone -l -s --reference A B E'
+
+cd "$base_dir"
+
 test_done
-- 
1.5.3.7

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add test for cloning with "--reference" repo being a subset of source repo
  2008-05-22 22:03 [PATCH] Add test for cloning with "--reference" repo being a subset of source repo Daniel Barkalow
@ 2008-05-22 22:31 ` Johan Herland
  0 siblings, 0 replies; 9+ messages in thread
From: Johan Herland @ 2008-05-22 22:31 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Junio C Hamano

On Friday 23 May 2008, Daniel Barkalow wrote:
> From: Johan Herland <johan@herland.net>
> 
> The first test in this series tests "git clone -l -s --reference B A C",
> where repo B is a superset of repo A (A has one commit, B has the same
> commit plus another). In this case, all objects to be cloned are already
> present in B.
> 
> However, we should also test the case where the "--reference" repo is a
> _subset_ of the source repo (e.g. "git clone -l -s --reference A B C"),
> i.e. some objects are not available in the "--reference" repo, and will
> have to be found in the source repo.
> 
> Signed-off-by: Johan Herland <johan@herland.net>
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>

Thanks for the resend.

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2008-05-22 22:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-22 22:03 [PATCH] Add test for cloning with "--reference" repo being a subset of source repo Daniel Barkalow
2008-05-22 22:31 ` Johan Herland
  -- strict thread matches above, loose matches on Subject: below --
2008-02-25 21:12 [RFC] Build in clone Daniel Barkalow
2008-03-02  7:46 ` [PATCH] builtin clone: support bundles Johannes Schindelin
2008-03-02 16:48   ` Daniel Barkalow
2008-03-03  9:04     ` [PATCH] Add test for cloning with "--reference" repo being a subset of source repo Johan Herland
2008-03-03 16:36       ` Daniel Barkalow
2008-03-03 18:21       ` Daniel Barkalow
2008-03-04  3:02         ` Johan Herland
2008-03-04 23:10           ` Daniel Barkalow
2008-03-05  0:24             ` Daniel Barkalow
2008-03-05 23:56               ` Johan Herland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).