git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] pack-refs: remove newly empty directories
@ 2010-07-05 22:27 Greg Price
  2010-07-06  3:02 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Greg Price @ 2010-07-05 22:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Greg Price

In a large repository which uses directories to organize many refs,
"git pack-refs --all --prune" does not improve performance so much
as it should, unless we remove all the now-empty directories as well.

Signed-off-by: Greg Price <price@ksplice.com>
---
 pack-refs.c          |    8 ++++++++
 t/t3210-pack-refs.sh |    6 ++++++
 2 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/pack-refs.c b/pack-refs.c
index 7f43f8a..33d7358 100644
--- a/pack-refs.c
+++ b/pack-refs.c
@@ -63,11 +63,19 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
 /* make sure nobody touched the ref, and unlink */
 static void prune_ref(struct ref_to_prune *r)
 {
+	char *p, *q;
 	struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1);
 
 	if (lock) {
 		unlink_or_warn(git_path("%s", r->name));
 		unlock_ref(lock);
+
+		/* remove the directory if empty */
+		for (q = p = r->name; *p; p++)
+			if (*p == '/')
+				q = p;
+		*q = '\0';
+		rmdir(r->name);
 	}
 }
 
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 413019a..c60ede1 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -60,6 +60,12 @@ test_expect_success 'see if git pack-refs --prune remove ref files' '
      ! test -f .git/refs/heads/f
 '
 
+test_expect_success 'see if git pack-refs --prune removes empty dirs' '
+     git branch r/s &&
+     git pack-refs --all --prune &&
+     ! test -e .git/refs/heads/r
+'
+
 test_expect_success \
     'git branch g should work when git branch g/h has been deleted' \
     'git branch g/h &&
-- 
1.6.6.32.g6380e

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
@ 2010-07-06  3:02 ` Junio C Hamano
  2010-07-06  3:25   ` Greg Price
  2010-07-06  6:10 ` [PATCH] " Johannes Sixt
  2010-07-06 18:49 ` Andreas Schwab
  2 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2010-07-06  3:02 UTC (permalink / raw)
  To: Greg Price; +Cc: git

Greg Price <price@ksplice.com> writes:

> diff --git a/pack-refs.c b/pack-refs.c
> index 7f43f8a..33d7358 100644
> --- a/pack-refs.c
> +++ b/pack-refs.c
> @@ -63,11 +63,19 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
>  /* make sure nobody touched the ref, and unlink */
>  static void prune_ref(struct ref_to_prune *r)
>  {
> +	char *p, *q;
>  	struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1);
>  
>  	if (lock) {
>  		unlink_or_warn(git_path("%s", r->name));
>  		unlock_ref(lock);
> +
> +		/* remove the directory if empty */
> +		for (q = p = r->name; *p; p++)
> +			if (*p == '/')
> +				q = p;
> +		*q = '\0';
> +		rmdir(r->name);
>  	}
>  }

Will this keep refs/heads/p/q that is empty after packing p/q/r/s branch
that happens to be the only branch whose name begins with p/?

I do not want a careless loop that will remove refs/heads after packing
"master" that happens to be the only local branch, but still...

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-06  3:02 ` Junio C Hamano
@ 2010-07-06  3:25   ` Greg Price
  2010-07-06 23:29     ` [PATCH v2] " Greg Price
  0 siblings, 1 reply; 8+ messages in thread
From: Greg Price @ 2010-07-06  3:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, Jul 5, 2010 at 11:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Will this keep refs/heads/p/q that is empty after packing p/q/r/s branch
> that happens to be the only branch whose name begins with p/?
>
> I do not want a careless loop that will remove refs/heads after packing
> "master" that happens to be the only local branch, but still...

It will.  I could fix that with something like this (untested):

/* Remove empty parents, but spare refs/ and immediate subdirs.
   Note, munges *name. */
static void try_remove_empty_parents(char *name)
{
  char *p, *q;
  int i;
  p = name;
  for (i = 0; i < 2; i++) { /* refs/{heads,tags,...}/ */
    while (*p && *p != '/')
      p++;
    if (*p)
      p++;
  }
  for (q = p; *q; q++)
    ;
  while (1) {
    for ( ; q > p && *q != '/'; q--)
      ;
    if (q == p)
      break;
    *q = '\0';
    if (rmdir(git_path("%s", name)))
      break;
  }
}

and then

      if (lock) {
              unlink_or_warn(git_path("%s", r->name));
              unlock_ref(lock);
+              try_remove_empty_parents(r->name);
      }

Sound reasonable?

Greg

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
  2010-07-06  3:02 ` Junio C Hamano
@ 2010-07-06  6:10 ` Johannes Sixt
  2010-07-06  6:15   ` Junio C Hamano
  2010-07-06 18:49 ` Andreas Schwab
  2 siblings, 1 reply; 8+ messages in thread
From: Johannes Sixt @ 2010-07-06  6:10 UTC (permalink / raw)
  To: Greg Price; +Cc: Junio C Hamano, git

Am 7/6/2010 0:27, schrieb Greg Price:
> In a large repository which uses directories to organize many refs,
> "git pack-refs --all --prune" does not improve performance so much
> as it should, unless we remove all the now-empty directories as well.

Before your patch, when you create a ref refs/heads/foo/bar, then pack
refs, then it is impossible to create a ref refs/heads/foo because
refs/heads/foo still exists as directory. And this is a good thing.

With your patch, is there any mechanism that inhibits that refs/heads/foo
is created when directory refs/heads/foo does not exist, but a packed ref
refs/heads/foo/bar is present?

-- Hannes

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-06  6:10 ` [PATCH] " Johannes Sixt
@ 2010-07-06  6:15   ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2010-07-06  6:15 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Greg Price, Junio C Hamano, git

Johannes Sixt <j.sixt@viscovery.net> writes:

> With your patch, is there any mechanism that inhibits that refs/heads/foo
> is created when directory refs/heads/foo does not exist, but a packed ref
> refs/heads/foo/bar is present?

$ git grep -e is_refname_available refs.c

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
  2010-07-06  3:02 ` Junio C Hamano
  2010-07-06  6:10 ` [PATCH] " Johannes Sixt
@ 2010-07-06 18:49 ` Andreas Schwab
  2010-07-06 19:13   ` Greg Price
  2 siblings, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2010-07-06 18:49 UTC (permalink / raw)
  To: Greg Price; +Cc: Junio C Hamano, git

Greg Price <price@ksplice.com> writes:

> In a large repository which uses directories to organize many refs,
> "git pack-refs --all --prune" does not improve performance so much
> as it should, unless we remove all the now-empty directories as well.

What happens if a parallel running git wants to update a ref in one of
the now-empty directories?  Can it get a spurious error after it has
called safe_create_leading_directories?

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] pack-refs: remove newly empty directories
  2010-07-06 18:49 ` Andreas Schwab
@ 2010-07-06 19:13   ` Greg Price
  0 siblings, 0 replies; 8+ messages in thread
From: Greg Price @ 2010-07-06 19:13 UTC (permalink / raw)
  To: Andreas Schwab; +Cc: Junio C Hamano, git

On Tue, Jul 6, 2010 at 2:49 PM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> What happens if a parallel running git wants to update a ref in one of
> the now-empty directories?  Can it get a spurious error after it has
> called safe_create_leading_directories?

Good question!  I asked the same question. =)  It can get the same
kind of error that happens if two parallel running git processes try
to update the same ref.

Anything that tries to update a ref will call lock_ref_sha1_basic(),
which calls safe_create_leading_directories() to make the directory
and then hold_lock_file_for_update() to open(O_CREAT|O_EXCL) a lock
file inside the directory.  So if the prune happens after the open(),
it will simply not remove the directory and all is well.  If it
happens between the safe_create_leading_directories() and the open(),
then the open() will fail.  This is also what happens if the lock has
been taken by another git process updating the same ref.  Currently
lock_ref_sha1_basic() chooses to have the process die in this case and
print an error message.

Similarly the prune could remove a directory just before
safe_create_leading_directories() creates one inside it.  Then
safe_create_leading_directories() will fail and loc_ref_sha1_basic()
will print an error message and return failure.  In both cases, since
we are exposed to a similar failure if a parallel process is updating
the same ref (a common operation) rather than packing refs (an
uncommon operation), I don't think this is a problem.

Greg

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] pack-refs: remove newly empty directories
  2010-07-06  3:25   ` Greg Price
@ 2010-07-06 23:29     ` Greg Price
  0 siblings, 0 replies; 8+ messages in thread
From: Greg Price @ 2010-07-06 23:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Greg Price

In a large repository which uses directories to organize many refs,
"git pack-refs --all --prune" does not improve performance so much
as it should, unless we remove all the now-empty directories as well.

Signed-off-by: Greg Price <price@ksplice.com>
---
This version removes empty grandparent directories, etc, but always
leaves in place refs/heads/ and its siblings.

We also tolerate duplicate slashes in refnames, because
check_ref_format() in refs.c does the same.

 pack-refs.c          |   30 ++++++++++++++++++++++++++++++
 t/t3210-pack-refs.sh |    6 ++++++
 2 files changed, 36 insertions(+), 0 deletions(-)

diff --git a/pack-refs.c b/pack-refs.c
index 7f43f8a..a935856 100644
--- a/pack-refs.c
+++ b/pack-refs.c
@@ -60,6 +60,35 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
 	return 0;
 }
 
+/* Remove empty parents, but spare refs/ and immediate subdirs.
+   Note, munges *name. */
+static void try_remove_empty_parents(char *name)
+{
+	char *p, *q;
+	int i;
+	p = name;
+	for (i = 0; i < 2; i++) { /* refs/{heads,tags,...}/ */
+		while (*p && *p != '/')
+			p++;
+		/* tolerate duplicate slashes; see check_ref_format() */
+		while (*p == '/')
+			p++;
+	}
+	for (q = p; *q; q++)
+		;
+	while (1) {
+		while (q > p && *q != '/')
+			q--;
+		while (q > p && *(q-1) == '/')
+			q--;
+		if (q == p)
+			break;
+		*q = '\0';
+		if (rmdir(git_path("%s", name)))
+			break;
+	}
+}
+
 /* make sure nobody touched the ref, and unlink */
 static void prune_ref(struct ref_to_prune *r)
 {
@@ -68,6 +97,7 @@ static void prune_ref(struct ref_to_prune *r)
 	if (lock) {
 		unlink_or_warn(git_path("%s", r->name));
 		unlock_ref(lock);
+		try_remove_empty_parents(r->name);
 	}
 }
 
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 413019a..ffd4e9f 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -60,6 +60,12 @@ test_expect_success 'see if git pack-refs --prune remove ref files' '
      ! test -f .git/refs/heads/f
 '
 
+test_expect_success 'see if git pack-refs --prune removes empty dirs' '
+     git branch r/s/t &&
+     git pack-refs --all --prune &&
+     ! test -e .git/refs/heads/r
+'
+
 test_expect_success \
     'git branch g should work when git branch g/h has been deleted' \
     'git branch g/h &&
-- 
1.6.6.32.g6380e

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2010-07-06 23:29 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
2010-07-06  3:02 ` Junio C Hamano
2010-07-06  3:25   ` Greg Price
2010-07-06 23:29     ` [PATCH v2] " Greg Price
2010-07-06  6:10 ` [PATCH] " Johannes Sixt
2010-07-06  6:15   ` Junio C Hamano
2010-07-06 18:49 ` Andreas Schwab
2010-07-06 19:13   ` Greg Price

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).