* [Bug] data loss with cyclic alternates
@ 2014-07-11 9:37 Ephrim Khong
2014-07-11 16:01 ` Junio C Hamano
0 siblings, 1 reply; 16+ messages in thread
From: Ephrim Khong @ 2014-07-11 9:37 UTC (permalink / raw)
To: GIT Mailing-list
Hi,
git seems to have issues with alternates when cycles are present (repo A
has B/objects as alternates, B has A/objects as alternates). In such
cases, gc and repack might delete objects that are present in only one
of the alternates, leading to data loss.
I understand that this is no big use case, and requires manual editing
of objects/info/alternates. But for the sake of preventing unneccesary
data loss, and since I found no warning regarding alternate cycles in
the documentation, it might make sense to handle such cases properly
(maybe it's a simple "after finding all alternates directories, remove
duplicates"?).
Here is a small test case that adds the object storage of a repository
as alternate to itsself. After git repack -adl, all objects are deleted.
---
rm -rf test_repo &&
mkdir test_repo &&
cd test_repo &&
git init &&
touch a &&
git add a &&
git commit -m "c1" &&
git repack -adl &&
echo $PWD/.git/objects > .git/objects/info/alternates &&
echo ">> re-packing..." &&
git repack -adl &&
echo ">> git fsck..." &&
git fsck
---
Output:
---
Initialized empty Git repository in /somewhere/test_repo/.git/
[master (root-commit) ab9e123] c1
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 a
Counting objects: 3, done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 0 (delta 0)
>> re-packing...
Nothing new to pack.
error: refs/heads/master does not point to a valid object!
>> git fsck...
Checking object directories: 100% (256/256), done.
Checking object directories: 100% (256/256), done.
error: HEAD: invalid sha1 pointer 1494ec24356cbbbd66e19f22cef762dd83de7f54
error: refs/heads/master does not point to a valid object!
notice: No default references
missing blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
---
Thanks
- Eph
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Bug] data loss with cyclic alternates
2014-07-11 9:37 [Bug] data loss with cyclic alternates Ephrim Khong
@ 2014-07-11 16:01 ` Junio C Hamano
2014-07-11 18:01 ` Keller, Jacob E
2014-07-13 10:44 ` Ephrim Khong
0 siblings, 2 replies; 16+ messages in thread
From: Junio C Hamano @ 2014-07-11 16:01 UTC (permalink / raw)
To: Ephrim Khong; +Cc: GIT Mailing-list
Ephrim Khong <dr.khong@gmail.com> writes:
> git seems to have issues with alternates when cycles are present (repo
> A has B/objects as alternates, B has A/objects as alternates).
Yeah, don't do that. A thinks "eh, the other guy must have it" and
B thinks the same. In general, do not prune or gc a repository
other repositories borrow from, even if there is no cycle, because
the borrowee does not know anythning about objects that it itself no
longer needs but are still needed by its borrowers.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Bug] data loss with cyclic alternates
2014-07-11 16:01 ` Junio C Hamano
@ 2014-07-11 18:01 ` Keller, Jacob E
2014-07-12 5:57 ` Jeff King
2014-07-13 10:44 ` Ephrim Khong
1 sibling, 1 reply; 16+ messages in thread
From: Keller, Jacob E @ 2014-07-11 18:01 UTC (permalink / raw)
To: gitster@pobox.com; +Cc: dr.khong@gmail.com, git@vger.kernel.org
On Fri, 2014-07-11 at 09:01 -0700, Junio C Hamano wrote:
> Ephrim Khong <dr.khong@gmail.com> writes:
>
> > git seems to have issues with alternates when cycles are present (repo
> > A has B/objects as alternates, B has A/objects as alternates).
>
> Yeah, don't do that. A thinks "eh, the other guy must have it" and
> B thinks the same. In general, do not prune or gc a repository
> other repositories borrow from, even if there is no cycle, because
> the borrowee does not know anythning about objects that it itself no
> longer needs but are still needed by its borrowers.
>
Doesn't gc get run automatically at some points? Is the automatic gc run
setup to avoid this problem?
Thanks,
Jake
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Bug] data loss with cyclic alternates
2014-07-11 18:01 ` Keller, Jacob E
@ 2014-07-12 5:57 ` Jeff King
2014-07-14 22:05 ` Keller, Jacob E
0 siblings, 1 reply; 16+ messages in thread
From: Jeff King @ 2014-07-12 5:57 UTC (permalink / raw)
To: Keller, Jacob E
Cc: gitster@pobox.com, dr.khong@gmail.com, git@vger.kernel.org
On Fri, Jul 11, 2014 at 06:01:46PM +0000, Keller, Jacob E wrote:
> > Yeah, don't do that. A thinks "eh, the other guy must have it" and
> > B thinks the same. In general, do not prune or gc a repository
> > other repositories borrow from, even if there is no cycle, because
> > the borrowee does not know anythning about objects that it itself no
> > longer needs but are still needed by its borrowers.
> >
>
> Doesn't gc get run automatically at some points? Is the automatic gc run
> setup to avoid this problem?
No, the automatic gc doesn't avoid this. It can't in the general case,
as the parent repository does not know how many or which children are
pointed to it as an alternate. And the borrowing repository does not
even need to have write permission to the parent, so it cannot write a
backpointer.
If people are using alternates, they should probably turn off gc.auto in
the borrowee (it doesn't seem unreasonable to me to do so automatically
via "clone -s" in cases where we can write to the alternates repo, and
to issue a warning otherwise).
-Peff
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [Bug] data loss with cyclic alternates
2014-07-11 16:01 ` Junio C Hamano
2014-07-11 18:01 ` Keller, Jacob E
@ 2014-07-13 10:44 ` Ephrim Khong
2014-07-14 9:02 ` [PATCH] sha1_file: do not add own object directory as alternate Ephrim Khong
1 sibling, 1 reply; 16+ messages in thread
From: Ephrim Khong @ 2014-07-13 10:44 UTC (permalink / raw)
To: Junio C Hamano; +Cc: GIT Mailing-list
Am 11.07.14 18:01, schrieb Junio C Hamano:
> Ephrim Khong <dr.khong@gmail.com> writes:
>
>> git seems to have issues with alternates when cycles are present (repo
>> A has B/objects as alternates, B has A/objects as alternates).
>
> Yeah, don't do that. A thinks "eh, the other guy must have it" and
> B thinks the same. In general, do not prune or gc a repository
> other repositories borrow from, even if there is no cycle, because
> the borrowee does not know anythning about objects that it itself no
> longer needs but are still needed by its borrowers.
It seems that there is a safeguard against this in sha1_file.c,
link_alt_odb_entry(), that doesn't work as intended:
if (!strcmp(ent->base, objdir)) {
free(ent);
return -1;
}
However, printf-debugging tells me that ent->base is absolute and objdir
is relative (".git/objects") at this point, so the strings are different
even though the files are the same.
I never submitted a patch to git. Do you think someone can fix this
hickup, otherwise I'll give it a shot next week.
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH] sha1_file: do not add own object directory as alternate
2014-07-13 10:44 ` Ephrim Khong
@ 2014-07-14 9:02 ` Ephrim Khong
2014-07-14 10:47 ` Duy Nguyen
0 siblings, 1 reply; 16+ messages in thread
From: Ephrim Khong @ 2014-07-14 9:02 UTC (permalink / raw)
To: GIT Mailing-list; +Cc: Junio C Hamano
When adding alternate object directories, we try not to add the
directory of the current repository to avoid cycles. Unfortunately,
that test was broken, since it compared an absolute with a relative
path.
Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
---
My first patch, so be harsh. I'm not sure about the filename of the
test, the behavior is tested with repack, but it affects gc and others
as well.
sha1_file.c | 2 +-
t/t7702-repack-cyclic-alternate.sh | 24 ++++++++++++++++++++++++
2 files changed, 25 insertions(+), 1 deletion(-)
create mode 100755 t/t7702-repack-cyclic-alternate.sh
diff --git a/sha1_file.c b/sha1_file.c
index 34d527f..7e98e9e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -320,7 +320,7 @@ static int link_alt_odb_entry(const char *entry,
const char *relative_base, int
return -1;
}
}
- if (!strcmp(ent->base, objdir)) {
+ if (!strcmp(ent->base, absolute_path(objdir))) {
free(ent);
return -1;
}
diff --git a/t/t7702-repack-cyclic-alternate.sh
b/t/t7702-repack-cyclic-alternate.sh
new file mode 100755
index 0000000..9a22d98
--- /dev/null
+++ b/t/t7702-repack-cyclic-alternate.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# Copyright (c) 2014 Ephrim Khong
+#
+
+test_description='repack involving cyclic alternate'
+. ./test-lib.sh
+
+test_expect_success setup '
+ git init repo &&
+ cd repo &&
+ touch a &&
+ git add a &&
+ git commit -m 1 &&
+ git repack -adl &&
+ echo $PWD/.git/objects > .git/objects/info/alternates
+'
+
+test_expect_success 're-packing repository with itsself as alternate' '
+ git repack -adl &&
+ git fsck
+'
+
+test_done
--
1.8.4.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH] sha1_file: do not add own object directory as alternate
2014-07-14 9:02 ` [PATCH] sha1_file: do not add own object directory as alternate Ephrim Khong
@ 2014-07-14 10:47 ` Duy Nguyen
2014-07-14 12:53 ` [PATCH v2] " Ephrim Khong
0 siblings, 1 reply; 16+ messages in thread
From: Duy Nguyen @ 2014-07-14 10:47 UTC (permalink / raw)
To: Ephrim Khong; +Cc: GIT Mailing-list, Junio C Hamano
On Mon, Jul 14, 2014 at 4:02 PM, Ephrim Khong <dr.khong@gmail.com> wrote:
> When adding alternate object directories, we try not to add the
> directory of the current repository to avoid cycles. Unfortunately,
> that test was broken, since it compared an absolute with a relative
> path.
Not blaming anyone. I'm simply interested in code archeology. The
first introduction of this strcmp is from 1494e03 (sha1_file.c: make
sure packs in an alternate odb is named properly. - 2005-12-04). The
intention is good as described in the commit message. But I think it's
broken even back then because "objdir" (or get_object_directory())
will always be relative (unless someone sets it explicitly) and
ent->base back then is already absolute (but not normalized).
>
> Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
> ---
> My first patch, so be harsh. I'm not sure about the filename of the test,
> the behavior is tested with repack, but it affects gc and others as well.
>
> sha1_file.c | 2 +-
> t/t7702-repack-cyclic-alternate.sh | 24 ++++++++++++++++++++++++
> 2 files changed, 25 insertions(+), 1 deletion(-)
> create mode 100755 t/t7702-repack-cyclic-alternate.sh
>
> diff --git a/sha1_file.c b/sha1_file.c
> index 34d527f..7e98e9e 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -320,7 +320,7 @@ static int link_alt_odb_entry(const char *entry, const
> char *relative_base, int
> return -1;
> }
> }
> - if (!strcmp(ent->base, objdir)) {
> + if (!strcmp(ent->base, absolute_path(objdir))) {
I think you want to normalize objdir in addition to making it
absolute, in case someone sets GIT_OBJECT_DIR to foo///bar. ent->base
is normalized already. Oh and maybe use strcmp_icase to be nice on
case-insensitive filesystems..
Kinda nit picking, but perhaps this path
absolute-ization/normalization should be done by the caller
link_alt_odb_entries so you don't have to redo it for every entry in
alternates file. I know there's a loop of memcmp() right above this
that may be more expensive than this micro-optimization when we have
lots of alternate entries.
--
Duy
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] sha1_file: do not add own object directory as alternate
2014-07-14 10:47 ` Duy Nguyen
@ 2014-07-14 12:53 ` Ephrim Khong
2014-07-15 5:44 ` Johannes Sixt
0 siblings, 1 reply; 16+ messages in thread
From: Ephrim Khong @ 2014-07-14 12:53 UTC (permalink / raw)
To: Duy Nguyen, GIT Mailing-list; +Cc: Junio C Hamano
When adding alternate object directories, we try not to add the
directory of the current repository to avoid cycles. Unfortunately,
that test was broken, since it compared an absolute with a relative
path.
Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
---
As proposed by Duy, v2 of the patch normalizes the object directory,
uses strcmp_icase for case-insensitive file systems, and moves the
path normalization to link_alt_odb_entries to save some cycles.
Additionally, the test is now more compact by using the already
prepared git repository instead of a new one, and tests if the
path normalization for both objectdir and alternate works.
sha1_file.c | 13 +++++++++----
t/t7702-repack-cyclic-alternate.sh | 24 ++++++++++++++++++++++++
2 files changed, 33 insertions(+), 4 deletions(-)
create mode 100755 t/t7702-repack-cyclic-alternate.sh
diff --git a/sha1_file.c b/sha1_file.c
index 34d527f..88fd168 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -268,9 +268,9 @@ static struct alternate_object_database **alt_odb_tail;
* SHA1, an extra slash for the first level indirection, and the
* terminating NUL.
*/
-static int link_alt_odb_entry(const char *entry, const char *relative_base, int depth)
+static int link_alt_odb_entry(const char *entry, const char *relative_base,
+ int depth, const char *normalized_objdir)
{
- const char *objdir = get_object_directory();
struct alternate_object_database *ent;
struct alternate_object_database *alt;
int pfxlen, entlen;
@@ -320,7 +320,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int
return -1;
}
}
- if (!strcmp(ent->base, objdir)) {
+ if (!strcmp_icase(ent->base, normalized_objdir)) {
free(ent);
return -1;
}
@@ -344,6 +344,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
struct string_list entries = STRING_LIST_INIT_NODUP;
char *alt_copy;
int i;
+ struct strbuf objdirbuf = STRBUF_INIT;
if (depth > 5) {
error("%s: ignoring alternate object stores, nesting too deep.",
@@ -351,6 +352,9 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
return;
}
+ strbuf_addstr(&objdirbuf, absolute_path(get_object_directory()));
+ normalize_path_copy(objdirbuf.buf, objdirbuf.buf);
+
alt_copy = xmemdupz(alt, len);
string_list_split_in_place(&entries, alt_copy, sep, -1);
for (i = 0; i < entries.nr; i++) {
@@ -361,11 +365,12 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
error("%s: ignoring relative alternate object store %s",
relative_base, entry);
} else {
- link_alt_odb_entry(entry, relative_base, depth);
+ link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf);
}
}
string_list_clear(&entries, 0);
free(alt_copy);
+ strbuf_release(&objdirbuf);
}
void read_info_alternates(const char * relative_base, int depth)
diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
new file mode 100755
index 0000000..3771613
--- /dev/null
+++ b/t/t7702-repack-cyclic-alternate.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# Copyright (c) 2014 Ephrim Khong
+#
+
+test_description='repack involving cyclic alternate'
+. ./test-lib.sh
+
+test_expect_success setup '
+ GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
+ export GIT_OBJECT_DIRECTORY &&
+ touch a &&
+ git add a &&
+ git commit -m 1 &&
+ git repack -adl &&
+ echo $PWD/.git/objects/../objects > .git/objects/info/alternates
+'
+
+test_expect_success 're-packing repository with itsself as alternate' '
+ git repack -adl &&
+ git fsck
+'
+
+test_done
--
1.8.4.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* RE: [Bug] data loss with cyclic alternates
2014-07-12 5:57 ` Jeff King
@ 2014-07-14 22:05 ` Keller, Jacob E
0 siblings, 0 replies; 16+ messages in thread
From: Keller, Jacob E @ 2014-07-14 22:05 UTC (permalink / raw)
To: Jeff King; +Cc: gitster@pobox.com, dr.khong@gmail.com, git@vger.kernel.org
> -----Original Message-----
> From: Jeff King [mailto:peff@peff.net]
> Sent: Friday, July 11, 2014 10:57 PM
> To: Keller, Jacob E
> Cc: gitster@pobox.com; dr.khong@gmail.com; git@vger.kernel.org
> Subject: Re: [Bug] data loss with cyclic alternates
>
> On Fri, Jul 11, 2014 at 06:01:46PM +0000, Keller, Jacob E wrote:
>
> > > Yeah, don't do that. A thinks "eh, the other guy must have it" and
> > > B thinks the same. In general, do not prune or gc a repository
> > > other repositories borrow from, even if there is no cycle, because
> > > the borrowee does not know anythning about objects that it itself no
> > > longer needs but are still needed by its borrowers.
> > >
> >
> > Doesn't gc get run automatically at some points? Is the automatic gc run
> > setup to avoid this problem?
>
> No, the automatic gc doesn't avoid this. It can't in the general case,
> as the parent repository does not know how many or which children are
> pointed to it as an alternate. And the borrowing repository does not
> even need to have write permission to the parent, so it cannot write a
> backpointer.
>
> If people are using alternates, they should probably turn off gc.auto in
> the borrowee (it doesn't seem unreasonable to me to do so automatically
> via "clone -s" in cases where we can write to the alternates repo, and
> to issue a warning otherwise).
>
> -Peff
Yes, if this is a known problem and we can avoid it by disabling garbage collection, that makes sense to me to do so..
Thanks,
Jake
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v2] sha1_file: do not add own object directory as alternate
2014-07-14 12:53 ` [PATCH v2] " Ephrim Khong
@ 2014-07-15 5:44 ` Johannes Sixt
2014-07-15 11:29 ` [PATCH v3] " Ephrim Khong
0 siblings, 1 reply; 16+ messages in thread
From: Johannes Sixt @ 2014-07-15 5:44 UTC (permalink / raw)
To: Ephrim Khong, Duy Nguyen, GIT Mailing-list; +Cc: Junio C Hamano
Am 14.07.2014 14:53, schrieb Ephrim Khong:
> diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
> new file mode 100755
> index 0000000..3771613
> --- /dev/null
> +++ b/t/t7702-repack-cyclic-alternate.sh
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2014 Ephrim Khong
> +#
> +
> +test_description='repack involving cyclic alternate'
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
> + export GIT_OBJECT_DIRECTORY &&
> + touch a &&
> + git add a &&
> + git commit -m 1 &&
> + git repack -adl &&
> + echo $PWD/.git/objects/../objects > .git/objects/info/alternates
We need a Windows-style path in the file, double-quotes to protect
special characters, and a small style fix. Therefore:
echo "$(pwd)"/.git/objects/../objects >.git/objects/info/alternates
> +'
> +
> +test_expect_success 're-packing repository with itsself as alternate' '
> + git repack -adl &&
> + git fsck
> +'
> +
> +test_done
>
^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 5:44 ` Johannes Sixt
@ 2014-07-15 11:29 ` Ephrim Khong
2014-07-15 19:26 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 16+ messages in thread
From: Ephrim Khong @ 2014-07-15 11:29 UTC (permalink / raw)
To: Johannes Sixt, Duy Nguyen, GIT Mailing-list; +Cc: Junio C Hamano
When adding alternate object directories, we try not to add the
directory of the current repository to avoid cycles. Unfortunately,
that test was broken, since it compared an absolute with a relative
path.
Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
---
Since v2: Added Johannes' comments.
sha1_file.c | 13 +++++++++----
t/t7702-repack-cyclic-alternate.sh | 24 ++++++++++++++++++++++++
2 files changed, 33 insertions(+), 4 deletions(-)
create mode 100755 t/t7702-repack-cyclic-alternate.sh
diff --git a/sha1_file.c b/sha1_file.c
index 34d527f..88fd168 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -268,9 +268,9 @@ static struct alternate_object_database **alt_odb_tail;
* SHA1, an extra slash for the first level indirection, and the
* terminating NUL.
*/
-static int link_alt_odb_entry(const char *entry, const char *relative_base, int depth)
+static int link_alt_odb_entry(const char *entry, const char *relative_base,
+ int depth, const char *normalized_objdir)
{
- const char *objdir = get_object_directory();
struct alternate_object_database *ent;
struct alternate_object_database *alt;
int pfxlen, entlen;
@@ -320,7 +320,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int
return -1;
}
}
- if (!strcmp(ent->base, objdir)) {
+ if (!strcmp_icase(ent->base, normalized_objdir)) {
free(ent);
return -1;
}
@@ -344,6 +344,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
struct string_list entries = STRING_LIST_INIT_NODUP;
char *alt_copy;
int i;
+ struct strbuf objdirbuf = STRBUF_INIT;
if (depth > 5) {
error("%s: ignoring alternate object stores, nesting too deep.",
@@ -351,6 +352,9 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
return;
}
+ strbuf_addstr(&objdirbuf, absolute_path(get_object_directory()));
+ normalize_path_copy(objdirbuf.buf, objdirbuf.buf);
+
alt_copy = xmemdupz(alt, len);
string_list_split_in_place(&entries, alt_copy, sep, -1);
for (i = 0; i < entries.nr; i++) {
@@ -361,11 +365,12 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
error("%s: ignoring relative alternate object store %s",
relative_base, entry);
} else {
- link_alt_odb_entry(entry, relative_base, depth);
+ link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf);
}
}
string_list_clear(&entries, 0);
free(alt_copy);
+ strbuf_release(&objdirbuf);
}
void read_info_alternates(const char * relative_base, int depth)
diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
new file mode 100755
index 0000000..8341d46
--- /dev/null
+++ b/t/t7702-repack-cyclic-alternate.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+#
+# Copyright (c) 2014 Ephrim Khong
+#
+
+test_description='repack involving cyclic alternate'
+. ./test-lib.sh
+
+test_expect_success setup '
+ GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
+ export GIT_OBJECT_DIRECTORY &&
+ touch a &&
+ git add a &&
+ git commit -m 1 &&
+ git repack -adl &&
+ echo "$(pwd)"/.git/objects/../objects >.git/objects/info/alternates
+'
+
+test_expect_success 're-packing repository with itsself as alternate' '
+ git repack -adl &&
+ git fsck
+'
+
+test_done
--
1.8.4.3
^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 11:29 ` [PATCH v3] " Ephrim Khong
@ 2014-07-15 19:26 ` Junio C Hamano
2014-07-16 6:42 ` Ephrim Khong
2014-07-15 19:33 ` Eric Sunshine
2014-07-15 19:48 ` Junio C Hamano
2 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2014-07-15 19:26 UTC (permalink / raw)
To: Ephrim Khong; +Cc: Johannes Sixt, Duy Nguyen, GIT Mailing-list
Ephrim Khong <dr.khong@gmail.com> writes:
> @@ -320,7 +320,7 @@ static int link_alt_odb_entry(const char *entry, const char *relative_base, int
> return -1;
> }
> }
> - if (!strcmp(ent->base, objdir)) {
> + if (!strcmp_icase(ent->base, normalized_objdir)) {
Not a problem with your patch, but we should rethink the name of
this function when the code base is more quiet. It always makes me
wonder if it is something similar to strcasecmp(), but in fact it is
not. It is meant to be used *only* for pathnames; pathname_cmp() or
something that has "path" in its name would be appropriate, but it
is wrong to call it "str"-anything.
> @@ -344,6 +344,7 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
> struct string_list entries = STRING_LIST_INIT_NODUP;
> char *alt_copy;
> int i;
> + struct strbuf objdirbuf = STRBUF_INIT;
>
> if (depth > 5) {
> error("%s: ignoring alternate object stores, nesting too deep.",
> @@ -351,6 +352,9 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
> return;
> }
>
> + strbuf_addstr(&objdirbuf, absolute_path(get_object_directory()));
> + normalize_path_copy(objdirbuf.buf, objdirbuf.buf);
This is somewhat a strange usage of a strbuf.
- it relies on that normalize_path_copy() only shrinks, never
lengthens, which is not too bad;
- if the operation ever shrinks, objdirbuf.len becomes
meaningless. The allocated length is objdirbuf.alloc, length
of the string is strlen(objdirbuf.buf).
- abspath.c::absolute_path() is still restricted to PATH_MAX, so
you are not gaining much by using strbuf here.
But at least this patch is not making things any worse, so....
> @@ -361,11 +365,12 @@ static void link_alt_odb_entries(const char *alt, int len, int sep,
> error("%s: ignoring relative alternate object store %s",
> relative_base, entry);
> } else {
> - link_alt_odb_entry(entry, relative_base, depth);
> + link_alt_odb_entry(entry, relative_base, depth, objdirbuf.buf);
> }
> }
> string_list_clear(&entries, 0);
> free(alt_copy);
> + strbuf_release(&objdirbuf);
> }
>
> void read_info_alternates(const char * relative_base, int depth)
> diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
> new file mode 100755
> index 0000000..8341d46
> --- /dev/null
> +++ b/t/t7702-repack-cyclic-alternate.sh
Do we really have to waste a new test file only for this test?
Don't we have any test that already uses alternate that these two
new test pieces can be added to?
$ git grep info/alternates t/
seems to show a few existing ones, including 1450 (fsck) and 7700
(repack) that look very relevant (I didn't check what the tests in
them are about, though).
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 11:29 ` [PATCH v3] " Ephrim Khong
2014-07-15 19:26 ` Junio C Hamano
@ 2014-07-15 19:33 ` Eric Sunshine
2014-07-15 19:48 ` Junio C Hamano
2 siblings, 0 replies; 16+ messages in thread
From: Eric Sunshine @ 2014-07-15 19:33 UTC (permalink / raw)
To: Ephrim Khong; +Cc: Johannes Sixt, Duy Nguyen, GIT Mailing-list, Junio C Hamano
On Tue, Jul 15, 2014 at 7:29 AM, Ephrim Khong <dr.khong@gmail.com> wrote:
> When adding alternate object directories, we try not to add the
> directory of the current repository to avoid cycles. Unfortunately,
> that test was broken, since it compared an absolute with a relative
> path.
>
> Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
> ---
> diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
> new file mode 100755
> index 0000000..8341d46
> --- /dev/null
> +++ b/t/t7702-repack-cyclic-alternate.sh
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2014 Ephrim Khong
> +#
> +
> +test_description='repack involving cyclic alternate'
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
> + export GIT_OBJECT_DIRECTORY &&
> + touch a &&
Since the existence of 'a' is significant here, not its timestamp, it
would be clearer to create the file with:
>a &&
> + git add a &&
> + git commit -m 1 &&
> + git repack -adl &&
> + echo "$(pwd)"/.git/objects/../objects >.git/objects/info/alternates
> +'
> +
> +test_expect_success 're-packing repository with itsself as alternate' '
> + git repack -adl &&
> + git fsck
> +'
> +
> +test_done
> --
> 1.8.4.3
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 11:29 ` [PATCH v3] " Ephrim Khong
2014-07-15 19:26 ` Junio C Hamano
2014-07-15 19:33 ` Eric Sunshine
@ 2014-07-15 19:48 ` Junio C Hamano
2014-07-16 6:47 ` Ephrim Khong
2 siblings, 1 reply; 16+ messages in thread
From: Junio C Hamano @ 2014-07-15 19:48 UTC (permalink / raw)
To: Ephrim Khong; +Cc: Johannes Sixt, Duy Nguyen, GIT Mailing-list
Ephrim Khong <dr.khong@gmail.com> writes:
> diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
> new file mode 100755
> index 0000000..8341d46
> --- /dev/null
> +++ b/t/t7702-repack-cyclic-alternate.sh
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2014 Ephrim Khong
> +#
> +
> +test_description='repack involving cyclic alternate'
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> + GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
> + export GIT_OBJECT_DIRECTORY &&
Do you need this artificially strange environment settings for the
problem to manifest itself, or is it sufficient to have a
non-canonical pathname in the info/alternates file?
Exporting an environment early in the test and having later tests in
the file depend on it makes it harder to debug when things go wrong,
than leaving an info/alternates file in the repository, primarily
because the latter can be inspected more easily in the trash
directory after "t7702-*.sh -i" dies, hence the above question.
> + touch a &&
Don't use 'touch' if you are not interested in timestams. Write this as
>a &&
because what you care about here in this test is that an empty file
'a' exists, so that you can run "git add" on it.
> + git add a &&
> + git commit -m 1 &&
> + git repack -adl &&
> + echo "$(pwd)"/.git/objects/../objects >.git/objects/info/alternates
> +'
> +
> +test_expect_success 're-packing repository with itsself as alternate' '
> + git repack -adl &&
> + git fsck
> +'
> +
> +test_done
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 19:26 ` Junio C Hamano
@ 2014-07-16 6:42 ` Ephrim Khong
0 siblings, 0 replies; 16+ messages in thread
From: Ephrim Khong @ 2014-07-16 6:42 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Johannes Sixt, Duy Nguyen, GIT Mailing-list
On 15.07.2014 21:26, Junio C Hamano wrote:
>> + strbuf_addstr(&objdirbuf, absolute_path(get_object_directory()));
>> + normalize_path_copy(objdirbuf.buf, objdirbuf.buf);
>
> This is somewhat a strange usage of a strbuf.
There might be a more elegant way, but I tried to mimic the local coding
style and simply copied how the alternate paths were normalized. Let me
know if you want this re-rolled without strbuf, otherwise I'll leave it
as-is.
>> diff --git a/t/t7702-repack-cyclic-alternate.sh b/t/t7702-repack-cyclic-alternate.sh
>> new file mode 100755
>> index 0000000..8341d46
>> --- /dev/null
>> +++ b/t/t7702-repack-cyclic-alternate.sh
>
> Do we really have to waste a new test file only for this test?
Absolutely not. I'll add it to 7700, which seems appropriate.
^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH v3] sha1_file: do not add own object directory as alternate
2014-07-15 19:48 ` Junio C Hamano
@ 2014-07-16 6:47 ` Ephrim Khong
0 siblings, 0 replies; 16+ messages in thread
From: Ephrim Khong @ 2014-07-16 6:47 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Johannes Sixt, Duy Nguyen, GIT Mailing-list
On 15.07.2014 21:48, Junio C Hamano wrote:
> Ephrim Khong <dr.khong@gmail.com> writes:
>> +test_expect_success setup '
>> + GIT_OBJECT_DIRECTORY=.git//../.git/objects &&
>> + export GIT_OBJECT_DIRECTORY &&
>
> Do you need this artificially strange environment settings for the
> problem to manifest itself, or is it sufficient to have a
> non-canonical pathname in the info/alternates file?
The test should check the normalization of both the paths in
info/alternates and of GIT_OBJECT_DIRECTORY, so I'd prefer to leave it
in. It is not necessary to reproduce the original problem, though.
> Exporting an environment early in the test and having later tests in
> the file depend on it makes it harder to debug when things go wrong,
> than leaving an info/alternates file in the repository, primarily
> because the latter can be inspected more easily in the trash
> directory after "t7702-*.sh -i" dies, hence the above question.
That sounds reasonable. The variable is not required at the
initialization anyway, so I'll set it right before the actual test and
wrap it into a subshell to make debugging easier.
^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2014-07-16 6:48 UTC | newest]
Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-07-11 9:37 [Bug] data loss with cyclic alternates Ephrim Khong
2014-07-11 16:01 ` Junio C Hamano
2014-07-11 18:01 ` Keller, Jacob E
2014-07-12 5:57 ` Jeff King
2014-07-14 22:05 ` Keller, Jacob E
2014-07-13 10:44 ` Ephrim Khong
2014-07-14 9:02 ` [PATCH] sha1_file: do not add own object directory as alternate Ephrim Khong
2014-07-14 10:47 ` Duy Nguyen
2014-07-14 12:53 ` [PATCH v2] " Ephrim Khong
2014-07-15 5:44 ` Johannes Sixt
2014-07-15 11:29 ` [PATCH v3] " Ephrim Khong
2014-07-15 19:26 ` Junio C Hamano
2014-07-16 6:42 ` Ephrim Khong
2014-07-15 19:33 ` Eric Sunshine
2014-07-15 19:48 ` Junio C Hamano
2014-07-16 6:47 ` Ephrim Khong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).