git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* getting list of objects for packing
@ 2008-10-31 19:32 Brandon Casey
  2008-10-31 20:40 ` Nicolas Pitre
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-10-31 19:32 UTC (permalink / raw)
  To: Git Mailing List


I'm trying to write a script that will repack large binary or compressed
objects into their own non-compressed, non-delta'ed pack file.

To make the decision about whether an object should go into this special
pack file or not, I want the output from 'git cat-file --batch-check'.
I get it with something similar to:

   git rev-list --objects --all |
      sed -e 's/^\([0-9a-f]\{40\}\).*/\1/' |
      git cat-file --batch-check

First question: Is the rev-list call correct?
  -If I am understanding things right, then the list of objects produced
   by rev-list will be in the right order for piping to pack-objects. 
  -The sed statement is stripping off anything after the sha1. Any way to
   get rev-list to print out just the sha1 so that sed is not necessary?

Then I want to parse the output from cat-file and use an external program
to detect the file format. Here is a simplified version:

  | while read sha1 type size; do

       if [ $type = "blob" ]; then
           if ! ( git cat-file blob "$sha1" | file -b - | grep text ) &&
              [ $size -ge $threshhold ]; then
               # pack into special pack
           else
               # pack normally into normal pack
           fi
       fi
  done

All of this has actually been rewritten into a perl script, so ignore any
syntax mistakes.

I have successfully created two of the pack files that I have been trying to
make. Where the definition of successful means that after removing the existing
packs and objects, and putting in place the two pack files that I generated,
'git fsck --full' prints no errors and exits successfully.

These two packs will be placed into a central repository.

ISSUE TWO:

I have placed these two packs into my own personal repo, and I have unpacked all
of the other objects so that they are loose.

I thought I could use a similar sequence of commands to pack those loose objects
into a normal and special pack. I added the --unpacked option to my rev-list
command, but it still lists many more objects than exist loosely in the repository.

   git rev-list --objects --unpacked --all

The man page says:

   --objects
          Print  the  object  IDs  of any object referenced by the listed
          commits. --objects foo ^bar thus means "send me all object  IDs
          which  I  need to download if I have the commit object bar, but
          not foo".

   --unpacked
          Only useful with --objects; print the object IDs that  are  not
          in packs.

Is this the correct behavior for rev-list --unpacked?
Am I mis-reading the --unpacked text, or should it be changed?

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 19:32 getting list of objects for packing Brandon Casey
@ 2008-10-31 20:40 ` Nicolas Pitre
  2008-10-31 20:48   ` Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Nicolas Pitre @ 2008-10-31 20:40 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Git Mailing List

On Fri, 31 Oct 2008, Brandon Casey wrote:

> 
> I'm trying to write a script that will repack large binary or compressed
> objects into their own non-compressed, non-delta'ed pack file.
> 
> To make the decision about whether an object should go into this special
> pack file or not, I want the output from 'git cat-file --batch-check'.
> I get it with something similar to:
> 
>    git rev-list --objects --all |
>       sed -e 's/^\([0-9a-f]\{40\}\).*/\1/' |
>       git cat-file --batch-check
> 
> First question: Is the rev-list call correct?

yes.

>   -If I am understanding things right, then the list of objects produced
>    by rev-list will be in the right order for piping to pack-objects. 
>   -The sed statement is stripping off anything after the sha1. Any way to
>    get rev-list to print out just the sha1 so that sed is not necessary?

If you strip the data after the SHA1 when pipping into pack-objects then 
you'll have horrible delta compression results.  The path names after 
each SHA1 is used to sort objects when trying to find best matches for 
delta compression. So you should preserve those and feed it back 
especially with those packs that you still want delta compression for.

> ISSUE TWO:
> 
> I have placed these two packs into my own personal repo, and I have unpacked all
> of the other objects so that they are loose.
> 
> I thought I could use a similar sequence of commands to pack those loose objects
> into a normal and special pack. I added the --unpacked option to my rev-list
> command, but it still lists many more objects than exist loosely in the repository.
> 
>    git rev-list --objects --unpacked --all
> 
> The man page says:
> 
>    --objects
>           Print  the  object  IDs  of any object referenced by the listed
>           commits. --objects foo ^bar thus means "send me all object  IDs
>           which  I  need to download if I have the commit object bar, but
>           not foo".
> 
>    --unpacked
>           Only useful with --objects; print the object IDs that  are  not
>           in packs.
> 
> Is this the correct behavior for rev-list --unpacked?

Well, don't think so.  Although I have zero unpacked objects in my git 
repository, the listing still gives me a hundred objects or so, and they 
all appear to be tag objects.  There is probably a bug here.


Nicolas

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 20:40 ` Nicolas Pitre
@ 2008-10-31 20:48   ` Brandon Casey
  2008-10-31 21:30     ` Junio C Hamano
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-10-31 20:48 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Git Mailing List

Nicolas Pitre wrote:
> On Fri, 31 Oct 2008, Brandon Casey wrote:

>>   -The sed statement is stripping off anything after the sha1. Any way to
>>    get rev-list to print out just the sha1 so that sed is not necessary?
> 
> If you strip the data after the SHA1 when pipping into pack-objects then 
> you'll have horrible delta compression results.  The path names after 
> each SHA1 is used to sort objects when trying to find best matches for 
> delta compression. So you should preserve those and feed it back 
> especially with those packs that you still want delta compression for.

Ah, I'll have to rethink my script then. Thanks!

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 20:48   ` Brandon Casey
@ 2008-10-31 21:30     ` Junio C Hamano
  2008-10-31 21:40       ` Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2008-10-31 21:30 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Nicolas Pitre, Git Mailing List

Brandon Casey <casey@nrlssc.navy.mil> writes:

> Nicolas Pitre wrote:
>> On Fri, 31 Oct 2008, Brandon Casey wrote:
>
>>>   -The sed statement is stripping off anything after the sha1. Any way to
>>>    get rev-list to print out just the sha1 so that sed is not necessary?
>> 
>> If you strip the data after the SHA1 when pipping into pack-objects then 
>> you'll have horrible delta compression results.  The path names after 
>> each SHA1 is used to sort objects when trying to find best matches for 
>> delta compression. So you should preserve those and feed it back 
>> especially with those packs that you still want delta compression for.
>
> Ah, I'll have to rethink my script then. Thanks!

Yeah, but wasn't the purpose of your whole exercise to list objects that
do not delta nor compress well with each other, in which case the delta
compression order (aka name hash) would not matter, no?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 21:30     ` Junio C Hamano
@ 2008-10-31 21:40       ` Brandon Casey
  2008-10-31 22:23         ` Jakub Narebski
  2008-11-01  0:00         ` Brandon Casey
  0 siblings, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-10-31 21:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Git Mailing List

Junio C Hamano wrote:
> Brandon Casey <casey@nrlssc.navy.mil> writes:
> 
>> Nicolas Pitre wrote:
>>> On Fri, 31 Oct 2008, Brandon Casey wrote:
>>>>   -The sed statement is stripping off anything after the sha1. Any way to
>>>>    get rev-list to print out just the sha1 so that sed is not necessary?
>>> If you strip the data after the SHA1 when pipping into pack-objects then 
>>> you'll have horrible delta compression results.  The path names after 
>>> each SHA1 is used to sort objects when trying to find best matches for 
>>> delta compression. So you should preserve those and feed it back 
>>> especially with those packs that you still want delta compression for.
>> Ah, I'll have to rethink my script then. Thanks!
> 
> Yeah, but wasn't the purpose of your whole exercise to list objects that
> do not delta nor compress well with each other, in which case the delta
> compression order (aka name hash) would not matter, no?

The script I wrote actually starts up two pack-objects instances and I was
writing the objects I wanted to pack _normally_ to one, and the ones that I
did not want compressed/deltafied to the other (which was started with
--no-reuse-object --window=0 --depth=0 --compression=0).

I didn't mentioned that fact in my first email, but I'm very glad Nico
made his point.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 21:40       ` Brandon Casey
@ 2008-10-31 22:23         ` Jakub Narebski
  2008-11-01  0:00         ` Brandon Casey
  1 sibling, 0 replies; 61+ messages in thread
From: Jakub Narebski @ 2008-10-31 22:23 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Junio C Hamano, Nicolas Pitre, Git Mailing List

Brandon Casey <casey@nrlssc.navy.mil> writes:

> Junio C Hamano wrote:

> > Yeah, but wasn't the purpose of your whole exercise to list objects that
> > do not delta nor compress well with each other, in which case the delta
> > compression order (aka name hash) would not matter, no?
> 
> The script I wrote actually starts up two pack-objects instances and I was
> writing the objects I wanted to pack _normally_ to one, and the ones that I
> did not want compressed/deltafied to the other (which was started with
> --no-reuse-object --window=0 --depth=0 --compression=0).
> 
> I didn't mentioned that fact in my first email, but I'm very glad Nico
> made his point.

Wasn't there some gitattribute which prohibited deltification of some
files (`delta` or something)?  Or wasn't this patch accepted, as I
cannot find such attribute in documentation (gitattributes(5))...
... err, it was added in commit a74db82e15cd8a2c53a4a83e9a36dc7bf7a4c750
(Teach "delta" attribute to pack-objects.) by Junio C Hamano in May
19, _without_ documentation.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: getting list of objects for packing
  2008-10-31 21:40       ` Brandon Casey
  2008-10-31 22:23         ` Jakub Narebski
@ 2008-11-01  0:00         ` Brandon Casey
  2008-11-02  3:35           ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file drafnel
  1 sibling, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-01  0:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Git Mailing List

Brandon Casey wrote:
> Junio C Hamano wrote:
>> Brandon Casey <casey@nrlssc.navy.mil> writes:
>>
>>> Nicolas Pitre wrote:
>>>> On Fri, 31 Oct 2008, Brandon Casey wrote:
>>>>>   -The sed statement is stripping off anything after the sha1. Any way to
>>>>>    get rev-list to print out just the sha1 so that sed is not necessary?
>>>> If you strip the data after the SHA1 when pipping into pack-objects then 
>>>> you'll have horrible delta compression results.  The path names after 
>>>> each SHA1 is used to sort objects when trying to find best matches for 
>>>> delta compression. So you should preserve those and feed it back 
>>>> especially with those packs that you still want delta compression for.
>>> Ah, I'll have to rethink my script then. Thanks!
>> Yeah, but wasn't the purpose of your whole exercise to list objects that
>> do not delta nor compress well with each other, in which case the delta
>> compression order (aka name hash) would not matter, no?
> 
> The script I wrote actually starts up two pack-objects instances and I was
> writing the objects I wanted to pack _normally_ to one, and the ones that I
> did not want compressed/deltafied to the other (which was started with
> --no-reuse-object --window=0 --depth=0 --compression=0).

So, my script created two pack files: one packed normally, and one packed without
compression or delta. I removed my original packs, and put these two new ones in
my pack directory and ran 'git fsck --full' and it completed successfully. There
are no loose objects in the repo.

I added a .keep file for each pack.

Since my script removed the extra info from rev-parse's output, I removed the
.keep file from the appropriate pack and ran 'git gc --aggressive'.

The 1.7GB pack that had the .keep file removed has been replaced with a +3GB
pack file. The other pack file which still has the .keep file is 2.3GB.

In another repo with 3 packs marked .keep, and one 388KB pack with ~300
objects in it, and 3 loose dangling objects, the 388KB pack was replaced with
a 3.5GB pack.

It appears that the entire repository is being packed into the new pack file
even though there are existing pack files with .keep files.

If I compare the output from 'git verify-pack -v' I can see that many of the
objects in the packs marked with a .keep file are indeed in the new pack file.
But not all of them.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-01  0:00         ` Brandon Casey
@ 2008-11-02  3:35           ` drafnel
  2008-11-02 16:31             ` [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask drafnel
                               ` (2 more replies)
  0 siblings, 3 replies; 61+ messages in thread
From: drafnel @ 2008-11-02  3:35 UTC (permalink / raw)
  To: git; +Cc: gitster, nico, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Objects residing in pack files that have an associated .keep file are not
supposed to be repacked into new pack files, but they are.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 t/t7700-repack.sh |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)
 create mode 100755 t/t7700-repack.sh

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
new file mode 100755
index 0000000..1489e68
--- /dev/null
+++ b/t/t7700-repack.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git repack works correctly'
+
+. ./test-lib.sh
+
+test_expect_failure 'objects in packs marked .keep are not repacked' '
+	echo content1 > file1 &&
+	echo content2 > file2 &&
+	git add . &&
+	git commit -m initial_commit &&
+	# Create two packs 
+	# The first pack will contain all of the objects except one
+	git rev-list --objects --all | head -n -1 |
+		git pack-objects pack > /dev/null &&
+	# The second pack will contain the excluded object
+	packsha1=$(git rev-list --objects --all | tail -n 1 |
+		git pack-objects pack) &&
+	touch -r pack-$packsha1.pack pack-$packsha1.keep &&
+	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
+		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
+	mv pack-* .git/objects/pack/ &&
+	git repack -A -d -l &&
+	git prune-packed &&
+	for p in .git/objects/pack/*.idx; do
+		idx=$(basename $p)
+		test "pack-$packsha1.idx" = "$idx" && continue
+		if git verify-pack -v $p | egrep "^$objsha1"; then
+			found_duplicate_object=1
+			echo "DUPLICATE OBJECT FOUND"
+			break
+		fi
+	done &&
+	test -z "$found_duplicate_object"
+'
+
+test_done
+
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask
  2008-11-02  3:35           ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file drafnel
@ 2008-11-02 16:31             ` drafnel
  2008-11-03 16:12               ` Shawn O. Pearce
       [not found]             ` <1225643477-32319-1-git-send-email-foo@foo.com>
  2008-11-03 10:35             ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file Andreas Ericsson
  2 siblings, 1 reply; 61+ messages in thread
From: drafnel @ 2008-11-02 16:31 UTC (permalink / raw)
  To: git; +Cc: gitster, nico, spearce, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

This converts the pack_local flag of the packed_git structure into a generic
bit mask and introduces a PACK_LOCAL mask and an ispacklocal() access macro.

So instead of this:

   if (p->pack_local)
      do_something

you would do this:

   if (ispacklocal(p))
      do_something

This is in preparation for adding a flag indicating whether a .keep file is
present.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 builtin-count-objects.c |    2 +-
 builtin-gc.c            |    2 +-
 builtin-pack-objects.c  |    2 +-
 cache.h                 |    5 ++++-
 pack-redundant.c        |    4 ++--
 server-info.c           |    4 ++--
 sha1_file.c             |    5 +++--
 7 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/builtin-count-objects.c b/builtin-count-objects.c
index ab35b65..3f981d6 100644
--- a/builtin-count-objects.c
+++ b/builtin-count-objects.c
@@ -108,7 +108,7 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix)
 		if (!packed_git)
 			prepare_packed_git();
 		for (p = packed_git; p; p = p->next) {
-			if (!p->pack_local)
+			if (!ispacklocal(p))
 				continue;
 			if (open_pack_index(p))
 				continue;
diff --git a/builtin-gc.c b/builtin-gc.c
index 7af65bb..0473158 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -135,7 +135,7 @@ static int too_many_packs(void)
 		size_t len;
 		int keep;
 
-		if (!p->pack_local)
+		if (!ispacklocal(p))
 			continue;
 		len = strlen(p->pack_name);
 		if (PATH_MAX <= len + 1)
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 59c30d1..6a8b9bf 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 				break;
 			if (incremental)
 				return 0;
-			if (local && !p->pack_local)
+			if (local && !ispacklocal(p))
 				return 0;
 		}
 	}
diff --git a/cache.h b/cache.h
index b0edbf9..0cb9350 100644
--- a/cache.h
+++ b/cache.h
@@ -679,12 +679,15 @@ extern struct packed_git {
 	int index_version;
 	time_t mtime;
 	int pack_fd;
-	int pack_local;
+	unsigned int flags;
 	unsigned char sha1[20];
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
 } *packed_git;
 
+#define PACK_LOCAL	1
+#define ispacklocal(p) ((p)->flags & PACK_LOCAL)
+
 struct pack_entry {
 	off_t offset;
 	unsigned char sha1[20];
diff --git a/pack-redundant.c b/pack-redundant.c
index 25b81a4..964f18f 100644
--- a/pack-redundant.c
+++ b/pack-redundant.c
@@ -544,7 +544,7 @@ static struct pack_list * add_pack(struct packed_git *p)
 	unsigned long off = 0, step;
 	const unsigned char *base;
 
-	if (!p->pack_local && !(alt_odb || verbose))
+	if (!ispacklocal(p) && !(alt_odb || verbose))
 		return NULL;
 
 	l.pack = p;
@@ -562,7 +562,7 @@ static struct pack_list * add_pack(struct packed_git *p)
 	}
 	/* this list will be pruned in cmp_two_packs later */
 	l.unique_objects = llist_copy(l.all_objects);
-	if (p->pack_local)
+	if (ispacklocal(p))
 		return pack_list_insert(&local_packs, &l);
 	else
 		return pack_list_insert(&altodb_packs, &l);
diff --git a/server-info.c b/server-info.c
index c1c073b..2eb20f5 100644
--- a/server-info.c
+++ b/server-info.c
@@ -168,14 +168,14 @@ static void init_pack_info(const char *infofile, int force)
 		/* we ignore things on alternate path since they are
 		 * not available to the pullers in general.
 		 */
-		if (!p->pack_local)
+		if (!ispacklocal(p))
 			continue;
 		i++;
 	}
 	num_pack = i;
 	info = xcalloc(num_pack, sizeof(struct pack_info *));
 	for (i = 0, p = packed_git; p; p = p->next) {
-		if (!p->pack_local)
+		if (!ispacklocal(p))
 			continue;
 		info[i] = xcalloc(1, sizeof(struct pack_info));
 		info[i]->p = p;
diff --git a/sha1_file.c b/sha1_file.c
index ab2b520..e4141c9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -851,7 +851,8 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 	 * actually mapping the pack file.
 	 */
 	p->pack_size = st.st_size;
-	p->pack_local = local;
+	if (local)
+		p->flags |= PACK_LOCAL;
 	p->mtime = st.st_mtime;
 	if (path_len < 40 || get_sha1_hex(path + path_len - 40, p->sha1))
 		hashclr(p->sha1);
@@ -941,7 +942,7 @@ static int sort_pack(const void *a_, const void *b_)
 	 * remote ones could be on a network mounted filesystem.
 	 * Favor local ones for these reasons.
 	 */
-	st = a->pack_local - b->pack_local;
+	st = ispacklocal(a) - ispacklocal(b);
 	if (st)
 		return -st;
 
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/3] packed_git: add new PACK_KEEP flag and haspackkeep() access macro
       [not found]             ` <1225643477-32319-1-git-send-email-foo@foo.com>
@ 2008-11-02 16:31               ` drafnel
       [not found]               ` <1225643477-32319-2-git-send-email-foo@foo.com>
  1 sibling, 0 replies; 61+ messages in thread
From: drafnel @ 2008-11-02 16:31 UTC (permalink / raw)
  To: git; +Cc: gitster, nico, spearce, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

If you want to tell whether a pack has an associated ".keep" file you
would do:

   if (haspackkeep(p))
      do_something

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 cache.h     |    2 ++
 sha1_file.c |    5 +++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/cache.h b/cache.h
index 0cb9350..48cd366 100644
--- a/cache.h
+++ b/cache.h
@@ -686,7 +686,9 @@ extern struct packed_git {
 } *packed_git;
 
 #define PACK_LOCAL	1
+#define PACK_KEEP	2
 #define ispacklocal(p) ((p)->flags & PACK_LOCAL)
+#define haspackkeep(p) ((p)->flags & PACK_KEEP)
 
 struct pack_entry {
 	off_t offset;
diff --git a/sha1_file.c b/sha1_file.c
index e4141c9..8a027e9 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -841,6 +841,11 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 		return NULL;
 	}
 	memcpy(p->pack_name, path, path_len);
+
+	strcpy(p->pack_name + path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->flags |= PACK_KEEP;
+
 	strcpy(p->pack_name + path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 3/3] pack-objects: honor '.keep' files
       [not found]               ` <1225643477-32319-2-git-send-email-foo@foo.com>
@ 2008-11-02 16:31                 ` drafnel
  2008-11-03 16:17                   ` Shawn O. Pearce
  0 siblings, 1 reply; 61+ messages in thread
From: drafnel @ 2008-11-02 16:31 UTC (permalink / raw)
  To: git; +Cc: gitster, nico, spearce, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

By default, pack-objects creates a pack file with every object specified by
the user. There are two options which can be used to exclude objects which
are accessible by the repository.

   1) --incremental
     This excludes any object which already exists in an accessible pack.

   2) --local
     This excludes any object which exists in a non-local pack.

With this patch, both arguments also cause objects which exist in packs
marked with a .keep file to be excluded. Only the --local option requires
an explicit check for the .keep file. If the user doesn't want the objects
in a pack marked with .keep to be exclude, then the .keep file should be
removed.

Additionally, this fixes the repack bug which allowed porcelain repack to
create packs which contained objects already contained in existing packs
marked with a .keep file.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---


This seems to be the correct fix.

I thought about two alternatives:

  1) New option to pack-objects which causes it to honor .keep files

     I didn't think this was necessary since that is why we have .keep
     files in the first place.

     So, now there are three variants:

     Neither --incremental or --local is used)
        Ignore repo packs, pack is created using all objects specified by
        user.
     --incremental is used)
        Look at repo packs, exclude already packed objects (including those
        marked with .keep file)
     --local is used)
        Look at repo packs, exclude objects in non-local packs (including local
        objects marked with .keep file)

  2) Always honor .keep files, even when --incremental or --local is
     not specified. 

     I think this may be counter-intuitive, since if you specify all of
     the objects explicitly, without using any other options, you expect
     all of those objects to be placed into the created pack file. If
     .keep is always honored, then a new option would be required to
     indicate ignoring .keep files, or we would adopt the platform that
     "the user must just remove the .keep file".


So, honoring .keep files only with --incremental (which is a side effect) or
 --local seems to be the most appropriate.

Comments welcome.

-brandon


 builtin-pack-objects.c |    2 +-
 t/t7700-repack.sh      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 6a8b9bf..5199e54 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 				break;
 			if (incremental)
 				return 0;
-			if (local && !ispacklocal(p))
+			if (local && (!ispacklocal(p) || haspackkeep(p)))
 				return 0;
 		}
 	}
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 1489e68..b6b02da 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -4,7 +4,7 @@ test_description='git repack works correctly'
 
 . ./test-lib.sh
 
-test_expect_failure 'objects in packs marked .keep are not repacked' '
+test_expect_success 'objects in packs marked .keep are not repacked' '
 	echo content1 > file1 &&
 	echo content2 > file2 &&
 	git add . &&
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-02  3:35           ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file drafnel
  2008-11-02 16:31             ` [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask drafnel
       [not found]             ` <1225643477-32319-1-git-send-email-foo@foo.com>
@ 2008-11-03 10:35             ` Andreas Ericsson
  2008-11-03 18:20               ` Brandon Casey
  2 siblings, 1 reply; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-03 10:35 UTC (permalink / raw)
  To: drafnel; +Cc: git, gitster, nico

drafnel@gmail.com wrote:
> From: Brandon Casey <drafnel@gmail.com>
> 
> Objects residing in pack files that have an associated .keep file are not
> supposed to be repacked into new pack files, but they are.
> 

I think that's a misconception. Packfiles that are marked with .keep files
should never be deleted. There are, afaik, no rules against packing the
same objects into other packfiles as well. This is nifty for dumb ref
walkers, as they can use a small pack for incremental fetching while using
a mega-pack for initial cloning.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask
  2008-11-02 16:31             ` [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask drafnel
@ 2008-11-03 16:12               ` Shawn O. Pearce
  2008-11-03 18:24                 ` Brandon Casey
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
  0 siblings, 2 replies; 61+ messages in thread
From: Shawn O. Pearce @ 2008-11-03 16:12 UTC (permalink / raw)
  To: drafnel; +Cc: git, gitster, nico

drafnel@gmail.com wrote:
> This is in preparation for adding a flag indicating whether a .keep file is
> present.

Good idea.
 
> diff --git a/cache.h b/cache.h
> index b0edbf9..0cb9350 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -679,12 +679,15 @@ extern struct packed_git {
>  	int index_version;
>  	time_t mtime;
>  	int pack_fd;
> -	int pack_local;
> +	unsigned int flags;

Hmm, isn't this a smaller change to make?

-	int pack_local;
+	unsigned pack_local:1;

Then later you can do:

-	unsigned pack_local:1;
+	unsigned pack_local:1,
+		pack_keep:1;

and the compiler handles all the bitmask stuff for you?

In general in git.git we like to use the struct bitmask stuff when
possible as the code is easier to follow.  We only use explicit
mask constants and mask operations when the data is being stored
on disk or written over the network and we need to ensure it is
consistent across compilers.  But for in-core only stuff, struct
bitmasks are easier.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 3/3] pack-objects: honor '.keep' files
  2008-11-02 16:31                 ` [PATCH 3/3] pack-objects: honor '.keep' files drafnel
@ 2008-11-03 16:17                   ` Shawn O. Pearce
  0 siblings, 0 replies; 61+ messages in thread
From: Shawn O. Pearce @ 2008-11-03 16:17 UTC (permalink / raw)
  To: drafnel; +Cc: git, gitster, nico

drafnel@gmail.com wrote:
> From: Brandon Casey <drafnel@gmail.com>
> 
> By default, pack-objects creates a pack file with every object specified by
> the user. There are two options which can be used to exclude objects which
> are accessible by the repository.
> 
>    1) --incremental
>      This excludes any object which already exists in an accessible pack.
> 
>    2) --local
>      This excludes any object which exists in a non-local pack.
> 
> With this patch, both arguments also cause objects which exist in packs
> marked with a .keep file to be excluded. Only the --local option requires
> an explicit check for the .keep file. If the user doesn't want the objects
> in a pack marked with .keep to be exclude, then the .keep file should be
> removed.
> 
> Additionally, this fixes the repack bug which allowed porcelain repack to
> create packs which contained objects already contained in existing packs
> marked with a .keep file.

This looks sane to me, but I'm holding off on the ack because
I don't like the way the flags are managed in struct packed_git.
(See my reply to 1/3 in the series.)

 
> Signed-off-by: Brandon Casey <drafnel@gmail.com>
> ---
> 
> 
> This seems to be the correct fix.
> 
> I thought about two alternatives:
> 
>   1) New option to pack-objects which causes it to honor .keep files
> 
>      I didn't think this was necessary since that is why we have .keep
>      files in the first place.
> 
>      So, now there are three variants:
> 
>      Neither --incremental or --local is used)
>         Ignore repo packs, pack is created using all objects specified by
>         user.
>      --incremental is used)
>         Look at repo packs, exclude already packed objects (including those
>         marked with .keep file)
>      --local is used)
>         Look at repo packs, exclude objects in non-local packs (including local
>         objects marked with .keep file)
> 
>   2) Always honor .keep files, even when --incremental or --local is
>      not specified. 
> 
>      I think this may be counter-intuitive, since if you specify all of
>      the objects explicitly, without using any other options, you expect
>      all of those objects to be placed into the created pack file. If
>      .keep is always honored, then a new option would be required to
>      indicate ignoring .keep files, or we would adopt the platform that
>      "the user must just remove the .keep file".
> 
> 
> So, honoring .keep files only with --incremental (which is a side effect) or
>  --local seems to be the most appropriate.
> 
> Comments welcome.
> 
> -brandon
> 
> 
>  builtin-pack-objects.c |    2 +-
>  t/t7700-repack.sh      |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 6a8b9bf..5199e54 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  				break;
>  			if (incremental)
>  				return 0;
> -			if (local && !ispacklocal(p))
> +			if (local && (!ispacklocal(p) || haspackkeep(p)))
>  				return 0;
>  		}
>  	}
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 1489e68..b6b02da 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -4,7 +4,7 @@ test_description='git repack works correctly'
>  
>  . ./test-lib.sh
>  
> -test_expect_failure 'objects in packs marked .keep are not repacked' '
> +test_expect_success 'objects in packs marked .keep are not repacked' '
>  	echo content1 > file1 &&
>  	echo content2 > file2 &&
>  	git add . &&
> -- 
> 1.6.0.2.588.g3102
> 
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Shawn.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 10:35             ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file Andreas Ericsson
@ 2008-11-03 18:20               ` Brandon Casey
  2008-11-03 20:25                 ` Andreas Ericsson
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 18:20 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: drafnel, git, gitster, nico

Andreas Ericsson wrote:
> drafnel@gmail.com wrote:
>> From: Brandon Casey <drafnel@gmail.com>
>>
>> Objects residing in pack files that have an associated .keep file are not
>> supposed to be repacked into new pack files, but they are.
>>
> 
> I think that's a misconception. Packfiles that are marked with .keep files
> should never be deleted. There are, afaik, no rules against packing the
> same objects into other packfiles as well. This is nifty for dumb ref
> walkers, as they can use a small pack for incremental fetching while using
> a mega-pack for initial cloning.

Having no rules against an object residing in more than one pack is different
from intending for git to produce pack files with redundant objects.

I think one intention for the .keep mechanism was to allow for a size optimized
pack to be produced and distributed. Currently, if I am handed such a pack file,
I can not merely place it into my pack directory (along with the .idx and .keep
files) and then run git-gc to remove any redundancy. Instead, I would get
a _new_ pack file which would contain all of the objects in the repository and
effectively double the size of my objects store. That doesn't seem like
something a user would expect or should expect.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask
  2008-11-03 16:12               ` Shawn O. Pearce
@ 2008-11-03 18:24                 ` Brandon Casey
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
  1 sibling, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 18:24 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: drafnel, git, gitster, nico

Shawn O. Pearce wrote:
> drafnel@gmail.com wrote:

>> diff --git a/cache.h b/cache.h
>> index b0edbf9..0cb9350 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -679,12 +679,15 @@ extern struct packed_git {
>>  	int index_version;
>>  	time_t mtime;
>>  	int pack_fd;
>> -	int pack_local;
>> +	unsigned int flags;
> 
> Hmm, isn't this a smaller change to make?

heh, well if you want to do it the "easy" way. :)

For some reason I've never used this bit field mechanism, but
I agree it is more readable and simpler, and you can't argue
with that.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 18:20               ` Brandon Casey
@ 2008-11-03 20:25                 ` Andreas Ericsson
  2008-11-03 22:02                   ` Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-03 20:25 UTC (permalink / raw)
  To: Brandon Casey; +Cc: drafnel, git, gitster, nico

Brandon Casey wrote:
> Andreas Ericsson wrote:
>> drafnel@gmail.com wrote:
>>> From: Brandon Casey <drafnel@gmail.com>
>>>
>>> Objects residing in pack files that have an associated .keep file are not
>>> supposed to be repacked into new pack files, but they are.
>>>
>> I think that's a misconception. Packfiles that are marked with .keep files
>> should never be deleted. There are, afaik, no rules against packing the
>> same objects into other packfiles as well. This is nifty for dumb ref
>> walkers, as they can use a small pack for incremental fetching while using
>> a mega-pack for initial cloning.
> 
> Having no rules against an object residing in more than one pack is different
> from intending for git to produce pack files with redundant objects.
> 
> I think one intention for the .keep mechanism was to allow for a size optimized
> pack to be produced and distributed. Currently, if I am handed such a pack file,
> I can not merely place it into my pack directory (along with the .idx and .keep
> files) and then run git-gc to remove any redundancy. Instead, I would get
> a _new_ pack file which would contain all of the objects in the repository and
> effectively double the size of my objects store. That doesn't seem like
> something a user would expect or should expect.
> 

So long as "git repack -a" still creates a mega-pack, I'm fine with whatever.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 16:12               ` Shawn O. Pearce
  2008-11-03 18:24                 ` Brandon Casey
@ 2008-11-03 20:37                 ` Brandon Casey
  2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
                                     ` (4 more replies)
  1 sibling, 5 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 20:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Git Mailing List, Nicolas Pitre

From: Brandon Casey <drafnel@gmail.com>

Objects residing in pack files that have an associated .keep file are not
supposed to be repacked into new pack files, but they are.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


This version replaces the use of 'head -n -1' with a grep, and should work on
all platforms.

-brandon


 t/t7700-repack.sh |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)
 create mode 100755 t/t7700-repack.sh

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
new file mode 100755
index 0000000..27af5ab
--- /dev/null
+++ b/t/t7700-repack.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git repack works correctly'
+
+. ./test-lib.sh
+
+test_expect_failure 'objects in packs marked .keep are not repacked' '
+	echo content1 > file1 &&
+	echo content2 > file2 &&
+	git add . &&
+	git commit -m initial_commit &&
+	# Create two packs
+	# The first pack will contain all of the objects except one
+        git rev-list --objects --all | grep -v file2 |
+		git pack-objects pack > /dev/null &&
+	# The second pack will contain the excluded object
+        packsha1=$(git rev-list --objects --all | grep file2 |
+		git pack-objects pack) &&
+	touch -r pack-$packsha1.pack pack-$packsha1.keep &&
+	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
+		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
+	mv pack-* .git/objects/pack/ &&
+	git repack -A -d -l &&
+	git prune-packed &&
+	for p in .git/objects/pack/*.idx; do
+		idx=$(basename $p)
+		test "pack-$packsha1.idx" = "$idx" && continue
+		if git verify-pack -v $p | egrep "^$objsha1"; then
+			found_duplicate_object=1
+			echo "DUPLICATE OBJECT FOUND"
+			break
+		fi
+	done &&
+	test -z "$found_duplicate_object"
+'
+
+test_done
+
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
@ 2008-11-03 20:41                   ` Brandon Casey
  2008-11-03 20:43                     ` [PATCH v2 3/3] pack-objects: honor '.keep' files Brandon Casey
  2008-11-03 22:14                   ` [PATCH v3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
                                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 20:41 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre

From: Brandon Casey <drafnel@gmail.com>

pack_keep will be set when a pack file has an associated .keep file.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


This patch and the following one redo the previous 3-patch series using a
bitfield as prudently suggested by Shawn.

It seemed silly to keep the conversion of pack_local into a bitfield, and the
introduction of pack_keep separate, so all 7 lines are in this one patch.

-brandon


 cache.h     |    3 ++-
 sha1_file.c |    5 +++++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/cache.h b/cache.h
index b0edbf9..b80cb08 100644
--- a/cache.h
+++ b/cache.h
@@ -679,7 +679,8 @@ extern struct packed_git {
 	int index_version;
 	time_t mtime;
 	int pack_fd;
-	int pack_local;
+	unsigned pack_local:1,
+		 pack_keep:1;
 	unsigned char sha1[20];
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
diff --git a/sha1_file.c b/sha1_file.c
index ab2b520..f2b25bd 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -841,6 +841,11 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 		return NULL;
 	}
 	memcpy(p->pack_name, path, path_len);
+
+	strcpy(p->pack_name + path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->pack_keep = 1;
+
 	strcpy(p->pack_name + path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
@ 2008-11-03 20:43                     ` Brandon Casey
  2008-11-03 20:49                       ` Shawn O. Pearce
  2008-11-05 22:37                       ` Brandon Casey
  0 siblings, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 20:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre

From: Brandon Casey <drafnel@gmail.com>

By default, pack-objects creates a pack file with every object specified by
the user. There are two options which can be used to exclude objects which
are accessible by the repository.

   1) --incremental
     This excludes any object which already exists in an accessible pack.

   2) --local
     This excludes any object which exists in a non-local pack.

With this patch, both arguments also cause objects which exist in packs
marked with a .keep file to be excluded. Only the --local option requires
an explicit check for the .keep file. If the user doesn't want the objects
in a pack marked with .keep to be exclude, then the .keep file should be
removed.

Additionally, this fixes the repack bug which allowed porcelain repack to
create packs which contained objects already contained in existing packs
marked with a .keep file.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 builtin-pack-objects.c |    2 +-
 t/t7700-repack.sh      |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 15b80db..8be9113 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 				break;
 			if (incremental)
 				return 0;
-			if (local && !p->pack_local)
+			if (local && (!p->pack_local || p->pack_keep))
 				return 0;
 		}
 	}
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 27af5ab..5b1cd05 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -4,7 +4,7 @@ test_description='git repack works correctly'
 
 . ./test-lib.sh
 
-test_expect_failure 'objects in packs marked .keep are not repacked' '
+test_expect_success 'objects in packs marked .keep are not repacked' '
 	echo content1 > file1 &&
 	echo content2 > file2 &&
 	git add . &&
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-03 20:43                     ` [PATCH v2 3/3] pack-objects: honor '.keep' files Brandon Casey
@ 2008-11-03 20:49                       ` Shawn O. Pearce
  2008-11-05 22:37                       ` Brandon Casey
  1 sibling, 0 replies; 61+ messages in thread
From: Shawn O. Pearce @ 2008-11-03 20:49 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Junio C Hamano, Git Mailing List, Nicolas Pitre

Brandon Casey <casey@nrlssc.navy.mil> wrote:
> From: Brandon Casey <drafnel@gmail.com>
> 
> By default, pack-objects creates a pack file with every object specified by
> the user. There are two options which can be used to exclude objects which
> are accessible by the repository.
> 
>    1) --incremental
>      This excludes any object which already exists in an accessible pack.
> 
>    2) --local
>      This excludes any object which exists in a non-local pack.
> 
> With this patch, both arguments also cause objects which exist in packs
> marked with a .keep file to be excluded. Only the --local option requires
> an explicit check for the .keep file. If the user doesn't want the objects
> in a pack marked with .keep to be exclude, then the .keep file should be
> removed.
> 
> Additionally, this fixes the repack bug which allowed porcelain repack to
> create packs which contained objects already contained in existing packs
> marked with a .keep file.
> 
> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>

This one and the one before it (2/3):

Acked-by: Shawn O. Pearce <spearce@spearce.org>

> ---
>  builtin-pack-objects.c |    2 +-
>  t/t7700-repack.sh      |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 15b80db..8be9113 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  				break;
>  			if (incremental)
>  				return 0;
> -			if (local && !p->pack_local)
> +			if (local && (!p->pack_local || p->pack_keep))
>  				return 0;
>  		}
>  	}
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 27af5ab..5b1cd05 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -4,7 +4,7 @@ test_description='git repack works correctly'
>  
>  . ./test-lib.sh
>  
> -test_expect_failure 'objects in packs marked .keep are not repacked' '
> +test_expect_success 'objects in packs marked .keep are not repacked' '
>  	echo content1 > file1 &&
>  	echo content2 > file2 &&
>  	git add . &&
> -- 
> 1.6.0.3.552.g12334
> 

-- 
Shawn.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 20:25                 ` Andreas Ericsson
@ 2008-11-03 22:02                   ` Brandon Casey
  2008-11-04 19:25                     ` Andreas Ericsson
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 22:02 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: drafnel, git, gitster, nico

Andreas Ericsson wrote:

> So long as "git repack -a" still creates a mega-pack, I'm fine with
> whatever.

I don't think it will after pack-objects is taught about .keep files, and
I don't think it will _now_ if all of your packs have .keep files.

'repack -a' will call pack-objects with either '--unpack=<packfile>' for
each pack file without a .keep file, or with '--unpacked --incremental' if
there are no pack files without .keep files.

In the first case, the modifications to pack-objects that I propose
will prevent objects that exist in local packs with .keep files
from being packed into the new pack.

In the second case, the --incremental option would have done the same thing.

So this inconsistency already existed, but will now be removed in favor of
honoring .keep files.

Mega-pack creation will become an "advanced operation" along the lines of:

    git rev-list --objects --all | git pack-objects

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
  2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
@ 2008-11-03 22:14                   ` Brandon Casey
  2008-11-04 19:17                   ` [PATCH v2 1/3] " Andreas Ericsson
                                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-03 22:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre

From: Brandon Casey <drafnel@gmail.com>

Objects residing in pack files that have an associated .keep file are not
supposed to be repacked into new pack files, but they are.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


Whoops, white space contaminated in two spots, this one fixes it.

-brandon


 t/t7700-repack.sh |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)
 create mode 100755 t/t7700-repack.sh

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
new file mode 100755
index 0000000..7aaff0b
--- /dev/null
+++ b/t/t7700-repack.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git repack works correctly'
+
+. ./test-lib.sh
+
+test_expect_failure 'objects in packs marked .keep are not repacked' '
+	echo content1 > file1 &&
+	echo content2 > file2 &&
+	git add . &&
+	git commit -m initial_commit &&
+	# Create two packs
+	# The first pack will contain all of the objects except one
+	git rev-list --objects --all | grep -v file2 |
+		git pack-objects pack > /dev/null &&
+	# The second pack will contain the excluded object
+	packsha1=$(git rev-list --objects --all | grep file2 |
+		git pack-objects pack) &&
+	touch -r pack-$packsha1.pack pack-$packsha1.keep &&
+	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
+		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
+	mv pack-* .git/objects/pack/ &&
+	git repack -A -d -l &&
+	git prune-packed &&
+	for p in .git/objects/pack/*.idx; do
+		idx=$(basename $p)
+		test "pack-$packsha1.idx" = "$idx" && continue
+		if git verify-pack -v $p | egrep "^$objsha1"; then
+			found_duplicate_object=1
+			echo "DUPLICATE OBJECT FOUND"
+			break
+		fi
+	done &&
+	test -z "$found_duplicate_object"
+'
+
+test_done
+
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
  2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
  2008-11-03 22:14                   ` [PATCH v3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
@ 2008-11-04 19:17                   ` Andreas Ericsson
  2008-11-04 19:49                     ` Brandon Casey
  2008-11-04 23:55                   ` Junio C Hamano
  2008-11-12  8:09                   ` Jeff King
  4 siblings, 1 reply; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-04 19:17 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Brandon Casey wrote:
> From: Brandon Casey <drafnel@gmail.com>
> 
> Objects residing in pack files that have an associated .keep file are not
> supposed to be repacked into new pack files, but they are.
> 
> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
> ---
> 
> 
> This version replaces the use of 'head -n -1' with a grep, and should work on
> all platforms.
> 

sed 1q is faster, as it stops parsing after the first line (the same as 'head
-n 1' does, but in a more portable fashion).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 22:02                   ` Brandon Casey
@ 2008-11-04 19:25                     ` Andreas Ericsson
  0 siblings, 0 replies; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-04 19:25 UTC (permalink / raw)
  To: Brandon Casey; +Cc: drafnel, git, gitster, nico

Brandon Casey wrote:
> Andreas Ericsson wrote:
> 
>> So long as "git repack -a" still creates a mega-pack, I'm fine with
>> whatever.
> 
> I don't think it will after pack-objects is taught about .keep files, and

In that case you're almost certainly breaking something.

> I don't think it will _now_ if all of your packs have .keep files.
> 

It should, by copying the objects from the .keep-marked packfiles. Otherwise
either repack or the repack docs are in error.

> 'repack -a' will call pack-objects with either '--unpack=<packfile>' for
> each pack file without a .keep file, or with '--unpacked --incremental' if
> there are no pack files without .keep files.
> 
> In the first case, the modifications to pack-objects that I propose
> will prevent objects that exist in local packs with .keep files
> from being packed into the new pack.
> 
> In the second case, the --incremental option would have done the same thing.
> 
> So this inconsistency already existed, but will now be removed in favor of
> honoring .keep files.
> 

That means the nifty hack of incrementally and sometimes fully repack the
odb to speed up cloning over dumb protocols no longer works properly, then.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-04 19:17                   ` [PATCH v2 1/3] " Andreas Ericsson
@ 2008-11-04 19:49                     ` Brandon Casey
  2008-11-04 19:55                       ` Junio C Hamano
  2008-11-04 20:21                       ` Andreas Ericsson
  0 siblings, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-04 19:49 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Junio C Hamano, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Andreas Ericsson wrote:
> Brandon Casey wrote:
>> From: Brandon Casey <drafnel@gmail.com>
>>
>> Objects residing in pack files that have an associated .keep file are not
>> supposed to be repacked into new pack files, but they are.
>>
>> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
>> ---
>>
>>
>> This version replaces the use of 'head -n -1' with a grep, and should
>> work on
>> all platforms.
>>
> 
> sed 1q is faster, as it stops parsing after the first line (the same as
> 'head
> -n 1' does, but in a more portable fashion).

Except that I wanted all but the _last_ line though.

I didn't think about using sed. Perhaps I could have used something like

   sed -n -e '$q' -e 'p'

The grep works though.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-04 19:49                     ` Brandon Casey
@ 2008-11-04 19:55                       ` Junio C Hamano
  2008-11-04 20:01                         ` Brandon Casey
  2008-11-04 20:21                       ` Andreas Ericsson
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2008-11-04 19:55 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Andreas Ericsson, Shawn O. Pearce, Git Mailing List,
	Nicolas Pitre

Brandon Casey <casey@nrlssc.navy.mil> writes:

> Andreas Ericsson wrote:
>
>> sed 1q is faster, as it stops parsing after the first line (the same as
>> 'head
>> -n 1' does, but in a more portable fashion).

If your sed is GNU, and you are doing this for a small file, the startup
cost of it may dwarf such gain ;-)

> Except that I wanted all but the _last_ line though.
>
> I didn't think about using sed. Perhaps I could have used something like
>
>    sed -n -e '$q' -e 'p'

You surely meant "sed -e '$d'", right?

> The grep works though.

Indeed.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-04 19:55                       ` Junio C Hamano
@ 2008-11-04 20:01                         ` Brandon Casey
  0 siblings, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-04 20:01 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Andreas Ericsson, Shawn O. Pearce, Git Mailing List,
	Nicolas Pitre

Junio C Hamano wrote:
> Brandon Casey <casey@nrlssc.navy.mil> writes:

>>    sed -n -e '$q' -e 'p'
> 
> You surely meant "sed -e '$d'", right?

Your sed-fu is quite impressive.

-b

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-04 19:49                     ` Brandon Casey
  2008-11-04 19:55                       ` Junio C Hamano
@ 2008-11-04 20:21                       ` Andreas Ericsson
  1 sibling, 0 replies; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-04 20:21 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Brandon Casey wrote:
> Andreas Ericsson wrote:
>> Brandon Casey wrote:
>>> From: Brandon Casey <drafnel@gmail.com>
>>>
>>> Objects residing in pack files that have an associated .keep file are not
>>> supposed to be repacked into new pack files, but they are.
>>>
>>> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
>>> ---
>>>
>>>
>>> This version replaces the use of 'head -n -1' with a grep, and should
>>> work on
>>> all platforms.
>>>
>> sed 1q is faster, as it stops parsing after the first line (the same as
>> 'head
>> -n 1' does, but in a more portable fashion).
> 
> Except that I wanted all but the _last_ line though.
> 

Ach pooie. That's what I get for trying to review stuff while watching
old 70's samurai movies. I misread your 'head' command.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
                                     ` (2 preceding siblings ...)
  2008-11-04 19:17                   ` [PATCH v2 1/3] " Andreas Ericsson
@ 2008-11-04 23:55                   ` Junio C Hamano
  2008-11-12  8:09                   ` Jeff King
  4 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2008-11-04 23:55 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Thanks; queued.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-03 20:43                     ` [PATCH v2 3/3] pack-objects: honor '.keep' files Brandon Casey
  2008-11-03 20:49                       ` Shawn O. Pearce
@ 2008-11-05 22:37                       ` Brandon Casey
  2008-11-06 23:22                         ` Brandon Casey
  1 sibling, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-05 22:37 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre


Junio,

Please leave this in pu for now, I have some concerns that I haven't
had time to write down yet.

-brandon


Brandon Casey wrote:
> From: Brandon Casey <drafnel@gmail.com>
> 
> By default, pack-objects creates a pack file with every object specified by
> the user. There are two options which can be used to exclude objects which
> are accessible by the repository.
> 
>    1) --incremental
>      This excludes any object which already exists in an accessible pack.
> 
>    2) --local
>      This excludes any object which exists in a non-local pack.
> 
> With this patch, both arguments also cause objects which exist in packs
> marked with a .keep file to be excluded. Only the --local option requires
> an explicit check for the .keep file. If the user doesn't want the objects
> in a pack marked with .keep to be exclude, then the .keep file should be
> removed.
> 
> Additionally, this fixes the repack bug which allowed porcelain repack to
> create packs which contained objects already contained in existing packs
> marked with a .keep file.
> 
> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
> ---
>  builtin-pack-objects.c |    2 +-
>  t/t7700-repack.sh      |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
> index 15b80db..8be9113 100644
> --- a/builtin-pack-objects.c
> +++ b/builtin-pack-objects.c
> @@ -701,7 +701,7 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
>  				break;
>  			if (incremental)
>  				return 0;
> -			if (local && !p->pack_local)
> +			if (local && (!p->pack_local || p->pack_keep))
>  				return 0;
>  		}
>  	}
> diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
> index 27af5ab..5b1cd05 100755
> --- a/t/t7700-repack.sh
> +++ b/t/t7700-repack.sh
> @@ -4,7 +4,7 @@ test_description='git repack works correctly'
>  
>  . ./test-lib.sh
>  
> -test_expect_failure 'objects in packs marked .keep are not repacked' '
> +test_expect_success 'objects in packs marked .keep are not repacked' '
>  	echo content1 > file1 &&
>  	echo content2 > file2 &&
>  	git add . &&

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-05 22:37                       ` Brandon Casey
@ 2008-11-06 23:22                         ` Brandon Casey
  2008-11-07  0:30                           ` Junio C Hamano
  2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
  0 siblings, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-06 23:22 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

Brandon Casey wrote:
> Junio,
> 
> Please leave this in pu for now, I have some concerns that I haven't
> had time to write down yet.

I've been thinking about pack-objects and repack.

Here's how I think the semantics of repack should be defined:

repack:

  <no-options>
    -incremental repack which does not repack any object currently packed
     in any accessible pack.

    Works (it currently works this way).

  <-a>
    -create a new pack containing all objects required by the repository
     including those accessible through alternates, but excluding objects
     in _local_ packs with .keep

    Flawed, even with my recent patches.

    If there are no local packs (or they all have .keep files), then the
    pack-objects call will use --incremental which will exclude objects
    packed in alt object store, even though -l was not used.
 
    My patches do not differentiate local .keep files from remote .keep files,
    which a user may have no control over.

  <-A>
     -Like -a, but local unreferenced objects which were previously packed
      are made to be loose.

    Ditto.

  <-a -l>
     -Restrict operation to only local objects. Only has any effect with -a|-A.
     -Like -a, but additionally exclude objects in packs accessible through
      alternates.

     Works with my recent patches.

  <-A -l>
     -Like '-a -l', but loosen unreferenced local packed objects.

     Ditto.



That set of repack operations needs to map to a combination of pack-objects
options:

  <no-options>
     -Create a pack with _all_ specified objects

  <--unpacked>
     -Exclude from packing any object already in an accessible pack.

     (Ahh, this came from rev-list interface, and rejects objects at an
      earlier stage than --incremental)

  <--unpacked=sha1>
     -Like '--unpacked', exclude already packed objects, but treat the objects
      in the pack with specified sha1 as unpacked.

  <--incremental>
     -Exclude from packing any object already in an accessible pack,
      regardless of whether it is in a pack specified by --unpacked=

      (How is this different from --unpacked, even though the exclusion
       operation is performed at a different stage? See my epiphany above
       about the source of the --unpacked option)

  <--unpacked --incremental>
     -seems redundant, is there any functional difference?

  <--local>
     -Exclude objects from being packed that are not in the local object store.


The issue is how to provide my described 'repack -a' functionality.
There does not seem to be a mapping between the above options and the
required functionality.

I see two solutions, both require introducing a new option to pack-objects.
  1) allow specifying a set of packs such that if an object resides
     in any of the set, the object will not be included in the produced
     pack.

     Benefits:
     -allows keeping pack-objects ignorant of .keep mechanism
     -repack can easily be modified to produce the set of packs to ignore

     Drawback:
     -very round-about way just to have functionality to skip packs with
      .keep file

  2) New option telling pack-objects to skip objects in local .keep'd packs

     Benefits:
     -easy to implement in pack-objects
     -easy to modify repack

     Drawbacks:
     -introduces new concept to pack-objects


Questions aside:
  1) Are both --incremental and --unpacked still necessary pack-objects options?
  2) Can --incremental become an alias for --unpacked, and go away?


patch(es) will follow.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-06 23:22                         ` Brandon Casey
@ 2008-11-07  0:30                           ` Junio C Hamano
  2008-11-07  1:17                             ` Brandon Casey
                                               ` (2 more replies)
  2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
  1 sibling, 3 replies; 61+ messages in thread
From: Junio C Hamano @ 2008-11-07  0:30 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

Brandon Casey <casey@nrlssc.navy.mil> writes:

>   <-a>
>     -create a new pack containing all objects required by the repository
>      including those accessible through alternates, but excluding objects
>      in _local_ packs with .keep

I have a feeling that it is debatable if this "fattening to dissociate
from alternates" is what people want.

>   <-a -l>
>      -Restrict operation to only local objects. Only has any effect with -a|-A.
>      -Like -a, but additionally exclude objects in packs accessible through
>       alternates.

Presumably you meant "exclude objects accessible through alternates,
either in packs or in loose form"?  If so then I think it is a good thing
to have.

I am not sure if listing the behaviour by combination of flags is a good
way to start thinking about this.  Wouldn't it be more productive to list
what kinds of repacking are needed, and then label them with combination
of flags?  Otherwise you would miss a potentially useful operation that
cannot be expressed with the current set of flags you have.

I think the useful kinds are only these five:

 - scoop loose objects that exist in local repository into a new pack,
   without touching existing packs at all; exclude anything available in
   any existing pack or in alternate repository (either loose or packed);

 - pack everything that is needed by the local ref, except the ones that
   are borrowed from alternate repositories (either loose or packed), into
   a single new pack.  There are two variants of this: eject what is
   currently packed but unnecessary into loose format when existing local
   packs are replaced with the new pack, or lose them (i.e. -A).

 - fatten local repository by packing everything that is needed by the
   local ref into a single new pack, including things that are currently
   borrowed from alternates.  There are two variants of this: eject what
   is currently packed but unnecessary into loose format when existing
   local packs are replaced with the new pack, or lose them (i.e. -A).

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-07  0:30                           ` Junio C Hamano
@ 2008-11-07  1:17                             ` Brandon Casey
  2008-11-07  8:12                               ` Andreas Ericsson
  2008-11-10  5:59                             ` recognize loose local objects during repack drafnel
       [not found]                             ` <1226296798-31522-1-git-send-email-foo@foo.com>
  2 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-07  1:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

Junio C Hamano wrote:
> Brandon Casey <casey@nrlssc.navy.mil> writes:
> 
>>   <-a>
>>     -create a new pack containing all objects required by the repository
>>      including those accessible through alternates, but excluding objects
>>      in _local_ packs with .keep
> 
> I have a feeling that it is debatable if this "fattening to dissociate
> from alternates" is what people want.

I'm not sure I understand you here.

Andreas has suggested previously that 'repack -a' should pack everything,
including objects in packs with .keep. Is that what you mean?

With my current understanding it seems that that would muddy the semantics
of repack. If -a does not honor packs with .keep, then would it be intuitive
to expect that adding -l (i.e. exclude alternate packed objects) _would_
honor .keep?

>>   <-a -l>
>>      -Restrict operation to only local objects. Only has any effect with -a|-A.
>>      -Like -a, but additionally exclude objects in packs accessible through
>>       alternates.
> 
> Presumably you meant "exclude objects accessible through alternates,
> either in packs or in loose form"?  If so then I think it is a good thing
> to have.

Would that be an enhancement to the current behavior? I don't think I saw
any mechanism to exclude packing remote loose objects.

The documentation for pack-objects --local says:

  --local
         This flag is similar to --incremental; instead of ignoring  all
         packed objects, it only ignores objects that are packed and not
         in the local object store (i.e. borrowed from an alternate).

It only mentions packed alternate objects.

> 
> I am not sure if listing the behaviour by combination of flags is a good
> way to start thinking about this.  Wouldn't it be more productive to list
> what kinds of repacking are needed, and then label them with combination
> of flags?  Otherwise you would miss a potentially useful operation that
> cannot be expressed with the current set of flags you have.

I agree. I made a list of the options because I was trying to understand what
effect each option had, then I turned it into an email.

> I think the useful kinds are only these five:
> 
>  - scoop loose objects that exist in local repository into a new pack,
>    without touching existing packs at all; exclude anything available in
>    any existing pack or in alternate repository (either loose or packed);
>
>  - pack everything that is needed by the local ref, except the ones that
>    are borrowed from alternate repositories (either loose or packed), into
>    a single new pack.  There are two variants of this: eject what is
>    currently packed but unnecessary into loose format when existing local
>    packs are replaced with the new pack, or lose them (i.e. -A).
>
>  - fatten local repository by packing everything that is needed by the
>    local ref into a single new pack, including things that are currently
>    borrowed from alternates.  There are two variants of this: eject what
>    is currently packed but unnecessary into loose format when existing
>    local packs are replaced with the new pack, or lose them (i.e. -A).

You didn't say when local .keep should be honored, i.e. objects in local
packs with .keep should be excluded from repacking. always, never, only
with -l, new repack option?

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/4] pack-objects: new option --honor-pack-keep
  2008-11-06 23:22                         ` Brandon Casey
  2008-11-07  0:30                           ` Junio C Hamano
@ 2008-11-07  1:52                           ` Brandon Casey
  2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
  2008-11-07  8:13                             ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Andreas Ericsson
  1 sibling, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-07  1:52 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

This adds a new option to pack-objects which will cause it to ignore an
object which appears in a local pack which has a .keep file, even if it
was specified for packing.

This option will be used by the porcelain repack.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


This series replaces the previous series starting at
6ee726bc "pack-objects: honor '.keep' files"

It should be applied on top of
f34cf12d "packed_git: convert pack_local flag into a bitfield and add pack_keep"

I created the series on top of f34cf12d rebased on top of master.

Suggestions for a more appropriate name for --honor-pack-keep are very welcome.

-brandon


 Documentation/git-pack-objects.txt |    5 +++++
 builtin-pack-objects.c             |    7 +++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 8c354bd..f9fac2c 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -109,6 +109,11 @@ base-name::
 	The default is unlimited, unless the config variable
 	`pack.packSizeLimit` is set.
 
+--honor-pack-keep::
+	This flag causes an object already in a local pack that
+	has a .keep file to be ignored, even if it appears in the
+	standard input.
+
 --incremental::
 	This flag causes an object already in a pack ignored
 	even if it appears in the standard input.
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 15b80db..ddec341 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -71,6 +71,7 @@ static int reuse_delta = 1, reuse_object = 1;
 static int keep_unreachable, unpack_unreachable, include_tag;
 static int local;
 static int incremental;
+static int ignore_packed_keep;
 static int allow_ofs_delta;
 static const char *base_name;
 static int progress = 1;
@@ -703,6 +704,8 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 				return 0;
 			if (local && !p->pack_local)
 				return 0;
+			if (ignore_packed_keep && p->pack_local && p->pack_keep)
+				return 0;
 		}
 	}
 
@@ -2048,6 +2051,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			incremental = 1;
 			continue;
 		}
+		if (!strcmp("--honor-pack-keep", arg)) {
+			ignore_packed_keep = 1;
+			continue;
+		}
 		if (!prefixcmp(arg, "--compression=")) {
 			char *end;
 			int level = strtoul(arg+14, &end, 0);
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/4] repack: don't repack local objects in packs with .keep file
  2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
@ 2008-11-07  1:54                             ` Brandon Casey
  2008-11-07  1:55                               ` [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
  2008-11-07  8:14                               ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Andreas Ericsson
  2008-11-07  8:13                             ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Andreas Ericsson
  1 sibling, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-07  1:54 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

If the user created a .keep file for a local pack, then it can be inferred
that the user does not want those objects repacked.

This fixes the repack bug tested by t7700.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 git-repack.sh     |    2 +-
 t/t7700-repack.sh |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index d39eb6c..8bb2201 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -83,7 +83,7 @@ case ",$all_into_one," in
 esac
 
 args="$args $local $quiet $no_reuse$extra"
-names=$(git pack-objects --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
+names=$(git pack-objects --honor-pack-keep --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
 	exit 1
 if [ -z "$names" ]; then
 	if test -z "$quiet"; then
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 7aaff0b..356afe3 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -4,7 +4,7 @@ test_description='git repack works correctly'
 
 . ./test-lib.sh
 
-test_expect_failure 'objects in packs marked .keep are not repacked' '
+test_expect_success 'objects in packs marked .keep are not repacked' '
 	echo content1 > file1 &&
 	echo content2 > file2 &&
 	git add . &&
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A]
  2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
@ 2008-11-07  1:55                               ` Brandon Casey
  2008-11-07  1:56                                 ` [PATCH 4/4] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
  2008-11-07  8:14                               ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Andreas Ericsson
  1 sibling, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-07  1:55 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

When repack is called with either the -a or -A option, the user has
requested to repack all objects including those referenced by the
alternates mechanism. Currently, if there are no local packs without
.keep files, then repack will call pack-objects with the
'--unpacked --incremental' options which causes it to exclude alternate
packed objects. So, remove this fallback.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 git-repack.sh |   11 ++++-------
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index 8bb2201..4d313d1 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -71,13 +71,10 @@ case ",$all_into_one," in
 				existing="$existing $e"
 			fi
 		done
-	fi
-	if test -z "$args"
-	then
-		args='--unpacked --incremental'
-	elif test -n "$unpack_unreachable"
-	then
-		args="$args $unpack_unreachable"
+		if test -n "$args" -a -n "$unpack_unreachable"
+		then
+			args="$args $unpack_unreachable"
+		fi
 	fi
 	;;
 esac
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 4/4] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  2008-11-07  1:55                               ` [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
@ 2008-11-07  1:56                                 ` Brandon Casey
  0 siblings, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-07  1:56 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Shawn O. Pearce, Nicolas Pitre,
	Andreas Ericsson

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


I think the existing code was broken. Looks like it would only skip
counting a pack if the pack disappeared between the prepare_packed_git()
and the access() call. It never used the path it create with the .keep
extension.

-brandon


 builtin-gc.c |   12 +-----------
 1 files changed, 1 insertions(+), 11 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index 7af65bb..781df60 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -131,19 +131,9 @@ static int too_many_packs(void)
 
 	prepare_packed_git();
 	for (cnt = 0, p = packed_git; p; p = p->next) {
-		char path[PATH_MAX];
-		size_t len;
-		int keep;
-
 		if (!p->pack_local)
 			continue;
-		len = strlen(p->pack_name);
-		if (PATH_MAX <= len + 1)
-			continue; /* oops, give up */
-		memcpy(path, p->pack_name, len-5);
-		memcpy(path + len - 5, ".keep", 6);
-		keep = access(p->pack_name, F_OK) && (errno == ENOENT);
-		if (keep)
+		if (p->pack_keep)
 			continue;
 		/*
 		 * Perhaps check the size of the pack and count only
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-07  1:17                             ` Brandon Casey
@ 2008-11-07  8:12                               ` Andreas Ericsson
  2008-11-07 19:25                                 ` Shawn O. Pearce
  0 siblings, 1 reply; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-07  8:12 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Git Mailing List, Shawn O. Pearce, Nicolas Pitre

Brandon Casey wrote:
> Junio C Hamano wrote:
>> Brandon Casey <casey@nrlssc.navy.mil> writes:
>>
>>>   <-a>
>>>     -create a new pack containing all objects required by the repository
>>>      including those accessible through alternates, but excluding objects
>>>      in _local_ packs with .keep
>> I have a feeling that it is debatable if this "fattening to dissociate
>> from alternates" is what people want.
> 
> I'm not sure I understand you here.
> 
> Andreas has suggested previously that 'repack -a' should pack everything,
> including objects in packs with .keep. Is that what you mean?
> 
> With my current understanding it seems that that would muddy the semantics
> of repack. If -a does not honor packs with .keep, then would it be intuitive
> to expect that adding -l (i.e. exclude alternate packed objects) _would_
> honor .keep?
> 

Only -d should honor .keep, imo. .keep-files is nothing about "don't copy
objects from this file" and all about "never delete this file".

The only muddying comes from you, who decided that .keep-files should
have impact on anything else than deleting the protected pack. Before that,
.keep files had a clear semantic, and repack's documentation was correct.

How do you explain .keep-files now? "protects pack-files that will forever
be used"? Then why the hell is it called ".keep" instead of "eternal"?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 1/4] pack-objects: new option --honor-pack-keep
  2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
  2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
@ 2008-11-07  8:13                             ` Andreas Ericsson
  1 sibling, 0 replies; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-07  8:13 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Git Mailing List, Shawn O. Pearce, Nicolas Pitre

Brandon Casey wrote:
> This adds a new option to pack-objects which will cause it to ignore an
> object which appears in a local pack which has a .keep file, even if it
> was specified for packing.
> 
> This option will be used by the porcelain repack.
> 
> Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
> ---
> 
> 
> This series replaces the previous series starting at
> 6ee726bc "pack-objects: honor '.keep' files"
> 
> It should be applied on top of
> f34cf12d "packed_git: convert pack_local flag into a bitfield and add pack_keep"
> 
> I created the series on top of f34cf12d rebased on top of master.
> 
> Suggestions for a more appropriate name for --honor-pack-keep are very welcome.
> 
> -brandon
> 
> 
>  Documentation/git-pack-objects.txt |    5 +++++
>  builtin-pack-objects.c             |    7 +++++++
>  2 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
> index 8c354bd..f9fac2c 100644
> --- a/Documentation/git-pack-objects.txt
> +++ b/Documentation/git-pack-objects.txt
> @@ -109,6 +109,11 @@ base-name::
>  	The default is unlimited, unless the config variable
>  	`pack.packSizeLimit` is set.
>  
> +--honor-pack-keep::
> +	This flag causes an object already in a local pack that
> +	has a .keep file to be ignored, even if it appears in the
> +	standard input.
> +

Keep-files are *always* honored. Make this option "--ignore-kept" or
something instead, otherwise people will see the synopsis and think
they need to always pass it to not remove .keep-protected packs,
which is stupid.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH 2/4] repack: don't repack local objects in packs with .keep file
  2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
  2008-11-07  1:55                               ` [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
@ 2008-11-07  8:14                               ` Andreas Ericsson
  1 sibling, 0 replies; 61+ messages in thread
From: Andreas Ericsson @ 2008-11-07  8:14 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Git Mailing List, Shawn O. Pearce, Nicolas Pitre

Brandon Casey wrote:
> If the user created a .keep file for a local pack, then it can be inferred
> that the user does not want those objects repacked.
> 

I disagree. It can be inferred that the user doesn't want those packfiles
*removed*.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 3/3] pack-objects: honor '.keep' files
  2008-11-07  8:12                               ` Andreas Ericsson
@ 2008-11-07 19:25                                 ` Shawn O. Pearce
  0 siblings, 0 replies; 61+ messages in thread
From: Shawn O. Pearce @ 2008-11-07 19:25 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Brandon Casey, Junio C Hamano, Git Mailing List, Nicolas Pitre

Andreas Ericsson <ae@op5.se> wrote:
>
> Only -d should honor .keep, imo. .keep-files is nothing about "don't copy
> objects from this file" and all about "never delete this file".
>
> The only muddying comes from you, who decided that .keep-files should
> have impact on anything else than deleting the protected pack. Before that,
> .keep files had a clear semantic, and repack's documentation was correct.
>
> How do you explain .keep-files now? "protects pack-files that will forever
> be used"? Then why the hell is it called ".keep" instead of "eternal"?

The _intent_ behind .keep files has always been to avoid copying
the objects into a new pack during git-repack, because the objects
are held in a pack that cannot be deleted.

Sadly the implementation has never quite worked right, which is
what led to this patch series I think.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* recognize loose local objects during repack
  2008-11-07  0:30                           ` Junio C Hamano
  2008-11-07  1:17                             ` Brandon Casey
@ 2008-11-10  5:59                             ` drafnel
  2008-11-10 21:03                               ` Junio C Hamano
       [not found]                             ` <1226296798-31522-1-git-send-email-foo@foo.com>
  2 siblings, 1 reply; 61+ messages in thread
From: drafnel @ 2008-11-10  5:59 UTC (permalink / raw)
  To: gitster; +Cc: git, spearce, nico, ae


This was developed on top of the previous repack/pack-objects series.


Junio wrote:
> Presumably you meant "exclude objects accessible through alternates,
> either in packs or in loose form"?  If so then I think it is a good thing
> to have.


This patch set looks like what is necessary to exclude loose objects accessible
through alternates from packing.


> I think the useful kinds are only these five:
> 
>  - scoop loose objects that exist in local repository into a new pack,
>    without touching existing packs at all; exclude anything available in
>    any existing pack or in alternate repository (either loose or packed);

  repack -l

>  - pack everything that is needed by the local ref, except the ones that
>    are borrowed from alternate repositories (either loose or packed), into
>    a single new pack.  There are two variants of this: eject what is
>    currently packed but unnecessary into loose format when existing local
>    packs are replaced with the new pack, or lose them (i.e. -A).

  repack -a -l
  repack -A -l

>  - fatten local repository by packing everything that is needed by the
>    local ref into a single new pack, including things that are currently
>    borrowed from alternates.  There are two variants of this: eject what
>    is currently packed but unnecessary into loose format when existing
>    local packs are replaced with the new pack, or lose them (i.e. -A).

  repack -a
  repack -A

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/3] t7700: demonstrate mishandling of loose objects in an alternate ODB
       [not found]                             ` <1226296798-31522-1-git-send-email-foo@foo.com>
@ 2008-11-10  5:59                               ` drafnel
       [not found]                               ` <1226296798-31522-2-git-send-email-foo@foo.com>
  1 sibling, 0 replies; 61+ messages in thread
From: drafnel @ 2008-11-10  5:59 UTC (permalink / raw)
  To: gitster; +Cc: git, spearce, nico, ae, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Loose objects residing in an alternate object database should not be packed
when the -l option to repack is used.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 t/t7700-repack.sh |   19 +++++++++++++++++++
 1 files changed, 19 insertions(+), 0 deletions(-)

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 356afe3..43c9cf9 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -34,5 +34,24 @@ test_expect_success 'objects in packs marked .keep are not repacked' '
 	test -z "$found_duplicate_object"
 '
 
+test_expect_failure 'loose objects in alternate ODB are not repacked' '
+	mkdir alt_objects &&
+	echo `pwd`/alt_objects > .git/objects/info/alternates &&
+	echo content3 > file3 &&
+	objsha1=$(GIT_OBJECT_DIRECTORY=alt_objects git hash-object -w file3) &&
+	git add file3 &&
+	git commit -m commit_file3 &&
+	git repack -a -d -l &&
+	git prune-packed &&
+	for p in .git/objects/pack/*.idx; do
+		if git verify-pack -v $p | egrep "^$objsha1"; then
+			found_duplicate_object=1
+			echo "DUPLICATE OBJECT FOUND"
+			break
+		fi
+	done &&
+	test -z "$found_duplicate_object"
+'
+
 test_done
 
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/3] sha1_file.c: split has_loose_object() into local and non-local counterparts
       [not found]                               ` <1226296798-31522-2-git-send-email-foo@foo.com>
@ 2008-11-10  5:59                                 ` drafnel
       [not found]                                 ` <1226296798-31522-3-git-send-email-foo@foo.com>
  1 sibling, 0 replies; 61+ messages in thread
From: drafnel @ 2008-11-10  5:59 UTC (permalink / raw)
  To: gitster; +Cc: git, spearce, nico, ae, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>


Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 cache.h     |    1 +
 sha1_file.c |   19 +++++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/cache.h b/cache.h
index 37ab457..1ec90f1 100644
--- a/cache.h
+++ b/cache.h
@@ -578,6 +578,7 @@ extern int move_temp_to_file(const char *tmpfile, const char *filename);
 
 extern int has_sha1_pack(const unsigned char *sha1, const char **ignore);
 extern int has_sha1_file(const unsigned char *sha1);
+extern int has_loose_object_nonlocal(const unsigned char *sha1);
 
 extern int has_pack_file(const unsigned char *sha1);
 extern int has_pack_index(const unsigned char *sha1);
diff --git a/sha1_file.c b/sha1_file.c
index f2b25bd..4912205 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -423,23 +423,30 @@ void prepare_alt_odb(void)
 	read_info_alternates(get_object_directory(), 0);
 }
 
-static int has_loose_object(const unsigned char *sha1)
+static int has_loose_object_local(const unsigned char *sha1)
 {
 	char *name = sha1_file_name(sha1);
-	struct alternate_object_database *alt;
+	return !access(name, F_OK);
+}
 
-	if (!access(name, F_OK))
-		return 1;
+int has_loose_object_nonlocal(const unsigned char *sha1)
+{
+	struct alternate_object_database *alt;
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
-		name = alt->name;
-		fill_sha1_path(name, sha1);
+		fill_sha1_path(alt->name, sha1);
 		if (!access(alt->base, F_OK))
 			return 1;
 	}
 	return 0;
 }
 
+static int has_loose_object(const unsigned char *sha1)
+{
+	return has_loose_object_local(sha1) ||
+	       has_loose_object_nonlocal(sha1);
+}
+
 static unsigned int pack_used_ctr;
 static unsigned int pack_mmap_calls;
 static unsigned int peak_pack_open_windows;
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 3/3] pack-objects: extend --local to mean ignore non-local loose objects too
       [not found]                                 ` <1226296798-31522-3-git-send-email-foo@foo.com>
@ 2008-11-10  5:59                                   ` drafnel
  0 siblings, 0 replies; 61+ messages in thread
From: drafnel @ 2008-11-10  5:59 UTC (permalink / raw)
  To: gitster; +Cc: git, spearce, nico, ae, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

With this patch, --local means pack only local objects that are not already
packed.

Additionally, this fixes t7700 testing whether loose objects in an alternate
object database are repacked.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
---
 Documentation/git-pack-objects.txt |    2 +-
 builtin-pack-objects.c             |    3 +++
 t/t7700-repack.sh                  |    2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index f9fac2c..7d4c1a7 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -121,7 +121,7 @@ base-name::
 --local::
 	This flag is similar to `--incremental`; instead of
 	ignoring all packed objects, it only ignores objects
-	that are packed and not in the local object store
+	that are packed and/or not in the local object store
 	(i.e. borrowed from an alternate).
 
 --non-empty::
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index ddec341..69f351a 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -691,6 +691,9 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 		return 0;
 	}
 
+	if (!exclude && local && has_loose_object_nonlocal(sha1))
+		return 0;
+
 	for (p = packed_git; p; p = p->next) {
 		off_t offset = find_pack_entry_one(sha1, p);
 		if (offset) {
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 43c9cf9..960bff4 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -34,7 +34,7 @@ test_expect_success 'objects in packs marked .keep are not repacked' '
 	test -z "$found_duplicate_object"
 '
 
-test_expect_failure 'loose objects in alternate ODB are not repacked' '
+test_expect_success 'loose objects in alternate ODB are not repacked' '
 	mkdir alt_objects &&
 	echo `pwd`/alt_objects > .git/objects/info/alternates &&
 	echo content3 > file3 &&
-- 
1.6.0.2.588.g3102

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: recognize loose local objects during repack
  2008-11-10  5:59                             ` recognize loose local objects during repack drafnel
@ 2008-11-10 21:03                               ` Junio C Hamano
  0 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2008-11-10 21:03 UTC (permalink / raw)
  To: drafnel; +Cc: git, spearce, nico, ae

drafnel@gmail.com writes:

> This was developed on top of the previous repack/pack-objects series.

Thanks.  Looked alright from a cursory reading.

By the way, I've been meaning to suggest that we should straighten out the
semantics of "unpacked" vs "incremental".

What the latter means is quite clear.  We are creating a new packfile to
bundle loose ones into one, and after the new packfile is installed we can
remove the loose objects.

But what --unpacked means often confuses people, primarily because it is a
performance heuristics that makes certain assumptions on how the objects
are packed.

Namely, "unpacked" is about discovery process of objects to be packed.
Without the option, we enumerate all objects that are reachable from the
commits in the given range, excluding the trees and blobs that should
exist in commits that are excluded (e.g, if you say "--objects A..B", we
exclude trees and blobs referenced by commit A).

With the option, we also omit commits that are packed.  What's funny is
that their trees and blobs are omitted, even if they are loose ;-)

This is typically not an issue, because you do not say "pack only this
commit object, without its trees or blobs" when repacking, and because you
must have all the trees and blobs necessary for a commit available when
you pack a commit; for these reasons, the trees and blobs are typically
packed together with the commit.  It is not an issue that the rev-list
with --unpacked option does not list loose trees and blobs that belong to
a packed commit for this reason.

You could however arrange so that a commit itself is packed but some of
the tree and blob objects it refers to are loose, in which case these
loose objects may not ever get repacked incrementally.

In an empty directory, try this:

        git init &&
        echo 0 >file && git add file && git commit -m initial &&

        P=$(git rev-list HEAD | git pack-objects pack) &&
        mv pack-$P.* .git/objects/pack/ &&
        git prune && git count-objects -v

        git repack && git count-objects -v

It packs only the commit object (and prunes it), leaving a tree and a blob
loose.  The repack won't find anything to pack.

This is not so bad in the sense that it will never corrupt your
repository, but it is confusing.  Admittedly, not peeking into a commit
that is packed is a reasonable good heuristics for performance reasons.

Interestingly enough, the object listing machinery do traverse into
parents of packed commits when --unpacked is given.  So if you pack a
commit and arrange to keep its parents unpacked, they are subject to the
incremental repacking.  In other words, the performance heuristics may not
be buying us very much --- we are traversing the history down to the root
commits regardless.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
                                     ` (3 preceding siblings ...)
  2008-11-04 23:55                   ` Junio C Hamano
@ 2008-11-12  8:09                   ` Jeff King
  2008-11-12 17:10                     ` Junio C Hamano
  2008-11-12 17:30                     ` Brandon Casey
  4 siblings, 2 replies; 61+ messages in thread
From: Jeff King @ 2008-11-12  8:09 UTC (permalink / raw)
  To: Brandon Casey
  Cc: Junio C Hamano, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

On Mon, Nov 03, 2008 at 02:37:05PM -0600, Brandon Casey wrote:

> This version replaces the use of 'head -n -1' with a grep, and should work on
> all platforms.

Hmm. I'm not sure what happened, but the version in 'next' has "head -n
-1" in it.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12  8:09                   ` Jeff King
@ 2008-11-12 17:10                     ` Junio C Hamano
  2008-11-12 19:17                       ` Jeff King
  2008-11-12 17:30                     ` Brandon Casey
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2008-11-12 17:10 UTC (permalink / raw)
  To: Jeff King; +Cc: Brandon Casey, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Jeff King <peff@peff.net> writes:

> On Mon, Nov 03, 2008 at 02:37:05PM -0600, Brandon Casey wrote:
>
>> This version replaces the use of 'head -n -1' with a grep, and should work on
>> all platforms.
>
> Hmm. I'm not sure what happened, but the version in 'next' has "head -n
> -1" in it.

You mean this one?

commit 31d92611e45d1286b805e362dbc451936af24121
Author:     Brandon Casey <drafnel@gmail.com>
AuthorDate: Sat Nov 1 22:35:19 2008 -0500
Commit:     Junio C Hamano <gitster@pobox.com>
CommitDate: Sun Nov 2 15:58:09 2008 -0800

    t7700: demonstrate mishandling of objects in packs with a .keep file

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12  8:09                   ` Jeff King
  2008-11-12 17:10                     ` Junio C Hamano
@ 2008-11-12 17:30                     ` Brandon Casey
  2008-11-12 17:59                       ` repack and .keep series Brandon Casey
  2008-11-12 18:10                       ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Junio C Hamano
  1 sibling, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Jeff King wrote:
> On Mon, Nov 03, 2008 at 02:37:05PM -0600, Brandon Casey wrote:
> 
>> This version replaces the use of 'head -n -1' with a grep, and should work on
>> all platforms.
> 
> Hmm. I'm not sure what happened, but the version in 'next' has "head -n
> -1" in it.

Well, there were so many revisions, I probably should have re-rolled the
whole series. I wasn't sure this was going to go in as is, based on the
[Dropped] message in the "What's Cooking" email and Junio's last email about
reconciling --unpacked and --incremental. I have been working through
the --unpacked code path, but I'm not to the point where I can suggest
a change there.

But, I think it's worse than just the wrong t7700.

These two:

   packed_git: convert pack_local flag into generic bit mask
   packed_git: add new PACK_KEEP flag and haspackkeep() access macro

should have been replaced by:

   packed_git: convert pack_local flag into a bitfield and add pack_keep

which uses a struct bitfield rather than a bitmask.

And then this

   pack-objects: honor '.keep' files

was replaced by this

   pack-objects: new option --honor-pack-keep
   repack: don't repack local objects in packs with .keep file

if that's the way we want to go. I'm not partial to the phrase honor-pack-keep,
but I don't think ignore-pack-keep is appropriate, and it's the best I've come
up with.

So,

   31d92611e45d1286b805e362dbc451936af24121
   7c335327be664751fa4c04e81b2fe3bfedceaada
   77b5a5478a77cc04b674891b542db1ba1a1bf4f7
   13e7f5d2f1da42619bd545590d0044b30d00ce4b

should be reverted, and replaced by the series to follow.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* repack and .keep series
  2008-11-12 17:30                     ` Brandon Casey
@ 2008-11-12 17:59                       ` Brandon Casey
  2008-11-12 17:59                         ` [PATCH 1/6] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
  2008-11-12 18:10                       ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico

Here is an updated series built on top of master.

What follows is the 6 patches that were sent, but are
not in next.

-brandon

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH 1/6] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12 17:59                       ` repack and .keep series Brandon Casey
@ 2008-11-12 17:59                         ` Brandon Casey
  2008-11-12 17:59                           ` [PATCH 2/6] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

Objects residing in pack files that have an associated .keep file are not
supposed to be repacked into new pack files, but they are.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 t/t7700-repack.sh |   38 ++++++++++++++++++++++++++++++++++++++
 1 files changed, 38 insertions(+), 0 deletions(-)
 create mode 100755 t/t7700-repack.sh

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
new file mode 100755
index 0000000..7aaff0b
--- /dev/null
+++ b/t/t7700-repack.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+test_description='git repack works correctly'
+
+. ./test-lib.sh
+
+test_expect_failure 'objects in packs marked .keep are not repacked' '
+	echo content1 > file1 &&
+	echo content2 > file2 &&
+	git add . &&
+	git commit -m initial_commit &&
+	# Create two packs
+	# The first pack will contain all of the objects except one
+	git rev-list --objects --all | grep -v file2 |
+		git pack-objects pack > /dev/null &&
+	# The second pack will contain the excluded object
+	packsha1=$(git rev-list --objects --all | grep file2 |
+		git pack-objects pack) &&
+	touch -r pack-$packsha1.pack pack-$packsha1.keep &&
+	objsha1=$(git verify-pack -v pack-$packsha1.idx | head -n 1 |
+		sed -e "s/^\([0-9a-f]\{40\}\).*/\1/") &&
+	mv pack-* .git/objects/pack/ &&
+	git repack -A -d -l &&
+	git prune-packed &&
+	for p in .git/objects/pack/*.idx; do
+		idx=$(basename $p)
+		test "pack-$packsha1.idx" = "$idx" && continue
+		if git verify-pack -v $p | egrep "^$objsha1"; then
+			found_duplicate_object=1
+			echo "DUPLICATE OBJECT FOUND"
+			break
+		fi
+	done &&
+	test -z "$found_duplicate_object"
+'
+
+test_done
+
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 2/6] packed_git: convert pack_local flag into a bitfield and add pack_keep
  2008-11-12 17:59                         ` [PATCH 1/6] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
@ 2008-11-12 17:59                           ` Brandon Casey
  2008-11-12 17:59                             ` [PATCH 3/6] pack-objects: new option --honor-pack-keep Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey, Brandon Casey

From: Brandon Casey <drafnel@gmail.com>

pack_keep will be set when a pack file has an associated .keep file.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 cache.h     |    3 ++-
 sha1_file.c |    5 +++++
 2 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/cache.h b/cache.h
index eda7028..37ab457 100644
--- a/cache.h
+++ b/cache.h
@@ -686,7 +686,8 @@ extern struct packed_git {
 	int index_version;
 	time_t mtime;
 	int pack_fd;
-	int pack_local;
+	unsigned pack_local:1,
+		 pack_keep:1;
 	unsigned char sha1[20];
 	/* something like ".git/objects/pack/xxxxx.pack" */
 	char pack_name[FLEX_ARRAY]; /* more */
diff --git a/sha1_file.c b/sha1_file.c
index ab2b520..f2b25bd 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -841,6 +841,11 @@ struct packed_git *add_packed_git(const char *path, int path_len, int local)
 		return NULL;
 	}
 	memcpy(p->pack_name, path, path_len);
+
+	strcpy(p->pack_name + path_len, ".keep");
+	if (!access(p->pack_name, F_OK))
+		p->pack_keep = 1;
+
 	strcpy(p->pack_name + path_len, ".pack");
 	if (stat(p->pack_name, &st) || !S_ISREG(st.st_mode)) {
 		free(p);
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 3/6] pack-objects: new option --honor-pack-keep
  2008-11-12 17:59                           ` [PATCH 2/6] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
@ 2008-11-12 17:59                             ` Brandon Casey
  2008-11-12 17:59                               ` [PATCH 4/6] repack: don't repack local objects in packs with .keep file Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey

This adds a new option to pack-objects which will cause it to ignore an
object which appears in a local pack which has a .keep file, even if it
was specified for packing.

This option will be used by the porcelain repack.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 Documentation/git-pack-objects.txt |    5 +++++
 builtin-pack-objects.c             |    7 +++++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-pack-objects.txt b/Documentation/git-pack-objects.txt
index 8c354bd..f9fac2c 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -109,6 +109,11 @@ base-name::
 	The default is unlimited, unless the config variable
 	`pack.packSizeLimit` is set.
 
+--honor-pack-keep::
+	This flag causes an object already in a local pack that
+	has a .keep file to be ignored, even if it appears in the
+	standard input.
+
 --incremental::
 	This flag causes an object already in a pack ignored
 	even if it appears in the standard input.
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 15b80db..ddec341 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -71,6 +71,7 @@ static int reuse_delta = 1, reuse_object = 1;
 static int keep_unreachable, unpack_unreachable, include_tag;
 static int local;
 static int incremental;
+static int ignore_packed_keep;
 static int allow_ofs_delta;
 static const char *base_name;
 static int progress = 1;
@@ -703,6 +704,8 @@ static int add_object_entry(const unsigned char *sha1, enum object_type type,
 				return 0;
 			if (local && !p->pack_local)
 				return 0;
+			if (ignore_packed_keep && p->pack_local && p->pack_keep)
+				return 0;
 		}
 	}
 
@@ -2048,6 +2051,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
 			incremental = 1;
 			continue;
 		}
+		if (!strcmp("--honor-pack-keep", arg)) {
+			ignore_packed_keep = 1;
+			continue;
+		}
 		if (!prefixcmp(arg, "--compression=")) {
 			char *end;
 			int level = strtoul(arg+14, &end, 0);
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 4/6] repack: don't repack local objects in packs with .keep file
  2008-11-12 17:59                             ` [PATCH 3/6] pack-objects: new option --honor-pack-keep Brandon Casey
@ 2008-11-12 17:59                               ` Brandon Casey
  2008-11-12 17:59                                 ` [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
  0 siblings, 1 reply; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey

If the user created a .keep file for a local pack, then it can be inferred
that the user does not want those objects repacked.

This fixes the repack bug tested by t7700.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 git-repack.sh     |    2 +-
 t/t7700-repack.sh |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index d39eb6c..8bb2201 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -83,7 +83,7 @@ case ",$all_into_one," in
 esac
 
 args="$args $local $quiet $no_reuse$extra"
-names=$(git pack-objects --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
+names=$(git pack-objects --honor-pack-keep --non-empty --all --reflog $args </dev/null "$PACKTMP") ||
 	exit 1
 if [ -z "$names" ]; then
 	if test -z "$quiet"; then
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 7aaff0b..356afe3 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -4,7 +4,7 @@ test_description='git repack works correctly'
 
 . ./test-lib.sh
 
-test_expect_failure 'objects in packs marked .keep are not repacked' '
+test_expect_success 'objects in packs marked .keep are not repacked' '
 	echo content1 > file1 &&
 	echo content2 > file2 &&
 	git add . &&
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A]
  2008-11-12 17:59                               ` [PATCH 4/6] repack: don't repack local objects in packs with .keep file Brandon Casey
@ 2008-11-12 17:59                                 ` Brandon Casey
  2008-11-12 17:59                                   ` [PATCH 6/6] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
  2008-11-13  0:50                                   ` [PATCH] t7700: test that 'repack -a' packs alternate packed objects Brandon Casey
  0 siblings, 2 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey

When repack is called with either the -a or -A option, the user has
requested to repack all objects including those referenced by the
alternates mechanism. Currently, if there are no local packs without
.keep files, then repack will call pack-objects with the
'--unpacked --incremental' options which causes it to exclude alternate
packed objects. So, remove this fallback.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 git-repack.sh |   11 ++++-------
 1 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/git-repack.sh b/git-repack.sh
index 8bb2201..4d313d1 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -71,13 +71,10 @@ case ",$all_into_one," in
 				existing="$existing $e"
 			fi
 		done
-	fi
-	if test -z "$args"
-	then
-		args='--unpacked --incremental'
-	elif test -n "$unpack_unreachable"
-	then
-		args="$args $unpack_unreachable"
+		if test -n "$args" -a -n "$unpack_unreachable"
+		then
+			args="$args $unpack_unreachable"
+		fi
 	fi
 	;;
 esac
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH 6/6] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence
  2008-11-12 17:59                                 ` [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
@ 2008-11-12 17:59                                   ` Brandon Casey
  2008-11-13  0:50                                   ` [PATCH] t7700: test that 'repack -a' packs alternate packed objects Brandon Casey
  1 sibling, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-12 17:59 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---
 builtin-gc.c |   12 +-----------
 1 files changed, 1 insertions(+), 11 deletions(-)

diff --git a/builtin-gc.c b/builtin-gc.c
index 7af65bb..781df60 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -131,19 +131,9 @@ static int too_many_packs(void)
 
 	prepare_packed_git();
 	for (cnt = 0, p = packed_git; p; p = p->next) {
-		char path[PATH_MAX];
-		size_t len;
-		int keep;
-
 		if (!p->pack_local)
 			continue;
-		len = strlen(p->pack_name);
-		if (PATH_MAX <= len + 1)
-			continue; /* oops, give up */
-		memcpy(path, p->pack_name, len-5);
-		memcpy(path + len - 5, ".keep", 6);
-		keep = access(p->pack_name, F_OK) && (errno == ENOENT);
-		if (keep)
+		if (p->pack_keep)
 			continue;
 		/*
 		 * Perhaps check the size of the pack and count only
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12 17:30                     ` Brandon Casey
  2008-11-12 17:59                       ` repack and .keep series Brandon Casey
@ 2008-11-12 18:10                       ` Junio C Hamano
  2008-11-12 18:19                         ` Junio C Hamano
  1 sibling, 1 reply; 61+ messages in thread
From: Junio C Hamano @ 2008-11-12 18:10 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Jeff King, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Brandon Casey <casey@nrlssc.navy.mil> writes:

> Jeff King wrote:
>> On Mon, Nov 03, 2008 at 02:37:05PM -0600, Brandon Casey wrote:
>> 
>>> This version replaces the use of 'head -n -1' with a grep, and should work on
>>> all platforms.
>> 
>> Hmm. I'm not sure what happened, but the version in 'next' has "head -n
>> -1" in it.
>
> Well, there were so many revisions, I probably should have re-rolled the
> whole series....
> ...
> if that's the way we want to go. I'm not partial to the phrase honor-pack-keep,
> but I don't think ignore-pack-keep is appropriate, and it's the best I've come
> up with.

Ok, care to send in a series in incremental updates format, please?

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12 18:10                       ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Junio C Hamano
@ 2008-11-12 18:19                         ` Junio C Hamano
  0 siblings, 0 replies; 61+ messages in thread
From: Junio C Hamano @ 2008-11-12 18:19 UTC (permalink / raw)
  To: Brandon Casey; +Cc: Jeff King, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

Junio C Hamano <gitster@pobox.com> writes:

> Brandon Casey <casey@nrlssc.navy.mil> writes:
>
>> Jeff King wrote:
>>> On Mon, Nov 03, 2008 at 02:37:05PM -0600, Brandon Casey wrote:
>>> 
>>>> This version replaces the use of 'head -n -1' with a grep, and should work on
>>>> all platforms.
>>> 
>>> Hmm. I'm not sure what happened, but the version in 'next' has "head -n
>>> -1" in it.
>>
>> Well, there were so many revisions, I probably should have re-rolled the
>> whole series....
>> ...
>> if that's the way we want to go. I'm not partial to the phrase honor-pack-keep,
>> but I don't think ignore-pack-keep is appropriate, and it's the best I've come
>> up with.
>
> Ok, care to send in a series in incremental updates format, please?

Nah, nevermind.  I see you already sent a replacement series overriding what
is in 'next'.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file
  2008-11-12 17:10                     ` Junio C Hamano
@ 2008-11-12 19:17                       ` Jeff King
  0 siblings, 0 replies; 61+ messages in thread
From: Jeff King @ 2008-11-12 19:17 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Brandon Casey, Shawn O. Pearce, Git Mailing List, Nicolas Pitre

On Wed, Nov 12, 2008 at 09:10:45AM -0800, Junio C Hamano wrote:

> > Hmm. I'm not sure what happened, but the version in 'next' has "head -n
> > -1" in it.
> 
> You mean this one?
> 
> commit 31d92611e45d1286b805e362dbc451936af24121

Yes, exactly.

-Peff

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH] t7700: test that 'repack -a' packs alternate packed objects
  2008-11-12 17:59                                 ` [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
  2008-11-12 17:59                                   ` [PATCH 6/6] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
@ 2008-11-13  0:50                                   ` Brandon Casey
  1 sibling, 0 replies; 61+ messages in thread
From: Brandon Casey @ 2008-11-13  0:50 UTC (permalink / raw)
  To: gitster; +Cc: peff, git, spearce, nico, Brandon Casey

Previously, when 'repack -a' was called and there were no packs in the local
repository without a .keep file, the repack would fall back to calling
pack-objects with '--unpacked --incremental'. This resulted in the created
pack file, if any, to be missing the packed objects in the alternate object
store. Test that this specific case has been fixed.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
---


Here is a test to demonstrate what
  [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A]
fixes.

This should apply cleanly to next since it includes
  a836cfa3 t7700: demonstrate mishandling of loose objects in an alternate ODB
which has a few context lines showing through in the diff below.

-brandon


 t/t7700-repack.sh |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 960bff4..3f602ea 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -53,5 +53,21 @@ test_expect_success 'loose objects in alternate ODB are not repacked' '
 	test -z "$found_duplicate_object"
 '
 
+test_expect_success 'packed obs in alt ODB are repacked even when local repo is packless' '
+	mkdir alt_objects/pack
+	mv .git/objects/pack/* alt_objects/pack &&
+	git repack -a &&
+	myidx=$(ls -1 .git/objects/pack/*.idx) &&
+	test -f "$myidx" &&
+	for p in alt_objects/pack/*.idx; do
+		git verify-pack -v $p | sed -n -e "/^[0-9a-f]\{40\}/p"
+	done | while read sha1 rest; do
+		if ! ( git verify-pack -v $myidx | grep "^$sha1" ); then
+			echo "Missing object in local pack: $sha1"
+			return 1
+		fi
+	done
+'
+
 test_done
 
-- 
1.6.0.3.552.g12334

^ permalink raw reply related	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2008-11-13  0:52 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-10-31 19:32 getting list of objects for packing Brandon Casey
2008-10-31 20:40 ` Nicolas Pitre
2008-10-31 20:48   ` Brandon Casey
2008-10-31 21:30     ` Junio C Hamano
2008-10-31 21:40       ` Brandon Casey
2008-10-31 22:23         ` Jakub Narebski
2008-11-01  0:00         ` Brandon Casey
2008-11-02  3:35           ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file drafnel
2008-11-02 16:31             ` [PATCH 1/3] packed_git: convert pack_local flag into generic bit mask drafnel
2008-11-03 16:12               ` Shawn O. Pearce
2008-11-03 18:24                 ` Brandon Casey
2008-11-03 20:37                 ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-03 20:41                   ` [PATCH v2 2/3] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
2008-11-03 20:43                     ` [PATCH v2 3/3] pack-objects: honor '.keep' files Brandon Casey
2008-11-03 20:49                       ` Shawn O. Pearce
2008-11-05 22:37                       ` Brandon Casey
2008-11-06 23:22                         ` Brandon Casey
2008-11-07  0:30                           ` Junio C Hamano
2008-11-07  1:17                             ` Brandon Casey
2008-11-07  8:12                               ` Andreas Ericsson
2008-11-07 19:25                                 ` Shawn O. Pearce
2008-11-10  5:59                             ` recognize loose local objects during repack drafnel
2008-11-10 21:03                               ` Junio C Hamano
     [not found]                             ` <1226296798-31522-1-git-send-email-foo@foo.com>
2008-11-10  5:59                               ` [PATCH 1/3] t7700: demonstrate mishandling of loose objects in an alternate ODB drafnel
     [not found]                               ` <1226296798-31522-2-git-send-email-foo@foo.com>
2008-11-10  5:59                                 ` [PATCH 2/3] sha1_file.c: split has_loose_object() into local and non-local counterparts drafnel
     [not found]                                 ` <1226296798-31522-3-git-send-email-foo@foo.com>
2008-11-10  5:59                                   ` [PATCH 3/3] pack-objects: extend --local to mean ignore non-local loose objects too drafnel
2008-11-07  1:52                           ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Brandon Casey
2008-11-07  1:54                             ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Brandon Casey
2008-11-07  1:55                               ` [PATCH 3/4] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
2008-11-07  1:56                                 ` [PATCH 4/4] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
2008-11-07  8:14                               ` [PATCH 2/4] repack: don't repack local objects in packs with .keep file Andreas Ericsson
2008-11-07  8:13                             ` [PATCH 1/4] pack-objects: new option --honor-pack-keep Andreas Ericsson
2008-11-03 22:14                   ` [PATCH v3] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-04 19:17                   ` [PATCH v2 1/3] " Andreas Ericsson
2008-11-04 19:49                     ` Brandon Casey
2008-11-04 19:55                       ` Junio C Hamano
2008-11-04 20:01                         ` Brandon Casey
2008-11-04 20:21                       ` Andreas Ericsson
2008-11-04 23:55                   ` Junio C Hamano
2008-11-12  8:09                   ` Jeff King
2008-11-12 17:10                     ` Junio C Hamano
2008-11-12 19:17                       ` Jeff King
2008-11-12 17:30                     ` Brandon Casey
2008-11-12 17:59                       ` repack and .keep series Brandon Casey
2008-11-12 17:59                         ` [PATCH 1/6] t7700: demonstrate mishandling of objects in packs with a .keep file Brandon Casey
2008-11-12 17:59                           ` [PATCH 2/6] packed_git: convert pack_local flag into a bitfield and add pack_keep Brandon Casey
2008-11-12 17:59                             ` [PATCH 3/6] pack-objects: new option --honor-pack-keep Brandon Casey
2008-11-12 17:59                               ` [PATCH 4/6] repack: don't repack local objects in packs with .keep file Brandon Casey
2008-11-12 17:59                                 ` [PATCH 5/6] repack: do not fall back to incremental repacking with [-a|-A] Brandon Casey
2008-11-12 17:59                                   ` [PATCH 6/6] builtin-gc.c: use new pack_keep bitfield to detect .keep file existence Brandon Casey
2008-11-13  0:50                                   ` [PATCH] t7700: test that 'repack -a' packs alternate packed objects Brandon Casey
2008-11-12 18:10                       ` [PATCH v2 1/3] t7700: demonstrate mishandling of objects in packs with a .keep file Junio C Hamano
2008-11-12 18:19                         ` Junio C Hamano
     [not found]             ` <1225643477-32319-1-git-send-email-foo@foo.com>
2008-11-02 16:31               ` [PATCH 2/3] packed_git: add new PACK_KEEP flag and haspackkeep() access macro drafnel
     [not found]               ` <1225643477-32319-2-git-send-email-foo@foo.com>
2008-11-02 16:31                 ` [PATCH 3/3] pack-objects: honor '.keep' files drafnel
2008-11-03 16:17                   ` Shawn O. Pearce
2008-11-03 10:35             ` [PATCH] t7700: demonstrate mishandling of objects in packs with a .keep file Andreas Ericsson
2008-11-03 18:20               ` Brandon Casey
2008-11-03 20:25                 ` Andreas Ericsson
2008-11-03 22:02                   ` Brandon Casey
2008-11-04 19:25                     ` Andreas Ericsson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).