* [PATCH] pack-refs: remove newly empty directories
@ 2010-07-05 22:27 Greg Price
2010-07-06 3:02 ` Junio C Hamano
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Greg Price @ 2010-07-05 22:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Greg Price
In a large repository which uses directories to organize many refs,
"git pack-refs --all --prune" does not improve performance so much
as it should, unless we remove all the now-empty directories as well.
Signed-off-by: Greg Price <price@ksplice.com>
---
pack-refs.c | 8 ++++++++
t/t3210-pack-refs.sh | 6 ++++++
2 files changed, 14 insertions(+), 0 deletions(-)
diff --git a/pack-refs.c b/pack-refs.c
index 7f43f8a..33d7358 100644
--- a/pack-refs.c
+++ b/pack-refs.c
@@ -63,11 +63,19 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
/* make sure nobody touched the ref, and unlink */
static void prune_ref(struct ref_to_prune *r)
{
+ char *p, *q;
struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1);
if (lock) {
unlink_or_warn(git_path("%s", r->name));
unlock_ref(lock);
+
+ /* remove the directory if empty */
+ for (q = p = r->name; *p; p++)
+ if (*p == '/')
+ q = p;
+ *q = '\0';
+ rmdir(r->name);
}
}
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 413019a..c60ede1 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -60,6 +60,12 @@ test_expect_success 'see if git pack-refs --prune remove ref files' '
! test -f .git/refs/heads/f
'
+test_expect_success 'see if git pack-refs --prune removes empty dirs' '
+ git branch r/s &&
+ git pack-refs --all --prune &&
+ ! test -e .git/refs/heads/r
+'
+
test_expect_success \
'git branch g should work when git branch g/h has been deleted' \
'git branch g/h &&
--
1.6.6.32.g6380e
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
@ 2010-07-06 3:02 ` Junio C Hamano
2010-07-06 3:25 ` Greg Price
2010-07-06 6:10 ` [PATCH] " Johannes Sixt
2010-07-06 18:49 ` Andreas Schwab
2 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2010-07-06 3:02 UTC (permalink / raw)
To: Greg Price; +Cc: git
Greg Price <price@ksplice.com> writes:
> diff --git a/pack-refs.c b/pack-refs.c
> index 7f43f8a..33d7358 100644
> --- a/pack-refs.c
> +++ b/pack-refs.c
> @@ -63,11 +63,19 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
> /* make sure nobody touched the ref, and unlink */
> static void prune_ref(struct ref_to_prune *r)
> {
> + char *p, *q;
> struct ref_lock *lock = lock_ref_sha1(r->name + 5, r->sha1);
>
> if (lock) {
> unlink_or_warn(git_path("%s", r->name));
> unlock_ref(lock);
> +
> + /* remove the directory if empty */
> + for (q = p = r->name; *p; p++)
> + if (*p == '/')
> + q = p;
> + *q = '\0';
> + rmdir(r->name);
> }
> }
Will this keep refs/heads/p/q that is empty after packing p/q/r/s branch
that happens to be the only branch whose name begins with p/?
I do not want a careless loop that will remove refs/heads after packing
"master" that happens to be the only local branch, but still...
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-06 3:02 ` Junio C Hamano
@ 2010-07-06 3:25 ` Greg Price
2010-07-06 23:29 ` [PATCH v2] " Greg Price
0 siblings, 1 reply; 8+ messages in thread
From: Greg Price @ 2010-07-06 3:25 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
On Mon, Jul 5, 2010 at 11:02 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Will this keep refs/heads/p/q that is empty after packing p/q/r/s branch
> that happens to be the only branch whose name begins with p/?
>
> I do not want a careless loop that will remove refs/heads after packing
> "master" that happens to be the only local branch, but still...
It will. I could fix that with something like this (untested):
/* Remove empty parents, but spare refs/ and immediate subdirs.
Note, munges *name. */
static void try_remove_empty_parents(char *name)
{
char *p, *q;
int i;
p = name;
for (i = 0; i < 2; i++) { /* refs/{heads,tags,...}/ */
while (*p && *p != '/')
p++;
if (*p)
p++;
}
for (q = p; *q; q++)
;
while (1) {
for ( ; q > p && *q != '/'; q--)
;
if (q == p)
break;
*q = '\0';
if (rmdir(git_path("%s", name)))
break;
}
}
and then
if (lock) {
unlink_or_warn(git_path("%s", r->name));
unlock_ref(lock);
+ try_remove_empty_parents(r->name);
}
Sound reasonable?
Greg
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v2] pack-refs: remove newly empty directories
2010-07-06 3:25 ` Greg Price
@ 2010-07-06 23:29 ` Greg Price
0 siblings, 0 replies; 8+ messages in thread
From: Greg Price @ 2010-07-06 23:29 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Greg Price
In a large repository which uses directories to organize many refs,
"git pack-refs --all --prune" does not improve performance so much
as it should, unless we remove all the now-empty directories as well.
Signed-off-by: Greg Price <price@ksplice.com>
---
This version removes empty grandparent directories, etc, but always
leaves in place refs/heads/ and its siblings.
We also tolerate duplicate slashes in refnames, because
check_ref_format() in refs.c does the same.
pack-refs.c | 30 ++++++++++++++++++++++++++++++
t/t3210-pack-refs.sh | 6 ++++++
2 files changed, 36 insertions(+), 0 deletions(-)
diff --git a/pack-refs.c b/pack-refs.c
index 7f43f8a..a935856 100644
--- a/pack-refs.c
+++ b/pack-refs.c
@@ -60,6 +60,35 @@ static int handle_one_ref(const char *path, const unsigned char *sha1,
return 0;
}
+/* Remove empty parents, but spare refs/ and immediate subdirs.
+ Note, munges *name. */
+static void try_remove_empty_parents(char *name)
+{
+ char *p, *q;
+ int i;
+ p = name;
+ for (i = 0; i < 2; i++) { /* refs/{heads,tags,...}/ */
+ while (*p && *p != '/')
+ p++;
+ /* tolerate duplicate slashes; see check_ref_format() */
+ while (*p == '/')
+ p++;
+ }
+ for (q = p; *q; q++)
+ ;
+ while (1) {
+ while (q > p && *q != '/')
+ q--;
+ while (q > p && *(q-1) == '/')
+ q--;
+ if (q == p)
+ break;
+ *q = '\0';
+ if (rmdir(git_path("%s", name)))
+ break;
+ }
+}
+
/* make sure nobody touched the ref, and unlink */
static void prune_ref(struct ref_to_prune *r)
{
@@ -68,6 +97,7 @@ static void prune_ref(struct ref_to_prune *r)
if (lock) {
unlink_or_warn(git_path("%s", r->name));
unlock_ref(lock);
+ try_remove_empty_parents(r->name);
}
}
diff --git a/t/t3210-pack-refs.sh b/t/t3210-pack-refs.sh
index 413019a..ffd4e9f 100755
--- a/t/t3210-pack-refs.sh
+++ b/t/t3210-pack-refs.sh
@@ -60,6 +60,12 @@ test_expect_success 'see if git pack-refs --prune remove ref files' '
! test -f .git/refs/heads/f
'
+test_expect_success 'see if git pack-refs --prune removes empty dirs' '
+ git branch r/s/t &&
+ git pack-refs --all --prune &&
+ ! test -e .git/refs/heads/r
+'
+
test_expect_success \
'git branch g should work when git branch g/h has been deleted' \
'git branch g/h &&
--
1.6.6.32.g6380e
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
2010-07-06 3:02 ` Junio C Hamano
@ 2010-07-06 6:10 ` Johannes Sixt
2010-07-06 6:15 ` Junio C Hamano
2010-07-06 18:49 ` Andreas Schwab
2 siblings, 1 reply; 8+ messages in thread
From: Johannes Sixt @ 2010-07-06 6:10 UTC (permalink / raw)
To: Greg Price; +Cc: Junio C Hamano, git
Am 7/6/2010 0:27, schrieb Greg Price:
> In a large repository which uses directories to organize many refs,
> "git pack-refs --all --prune" does not improve performance so much
> as it should, unless we remove all the now-empty directories as well.
Before your patch, when you create a ref refs/heads/foo/bar, then pack
refs, then it is impossible to create a ref refs/heads/foo because
refs/heads/foo still exists as directory. And this is a good thing.
With your patch, is there any mechanism that inhibits that refs/heads/foo
is created when directory refs/heads/foo does not exist, but a packed ref
refs/heads/foo/bar is present?
-- Hannes
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-06 6:10 ` [PATCH] " Johannes Sixt
@ 2010-07-06 6:15 ` Junio C Hamano
0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2010-07-06 6:15 UTC (permalink / raw)
To: Johannes Sixt; +Cc: Greg Price, Junio C Hamano, git
Johannes Sixt <j.sixt@viscovery.net> writes:
> With your patch, is there any mechanism that inhibits that refs/heads/foo
> is created when directory refs/heads/foo does not exist, but a packed ref
> refs/heads/foo/bar is present?
$ git grep -e is_refname_available refs.c
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
2010-07-06 3:02 ` Junio C Hamano
2010-07-06 6:10 ` [PATCH] " Johannes Sixt
@ 2010-07-06 18:49 ` Andreas Schwab
2010-07-06 19:13 ` Greg Price
2 siblings, 1 reply; 8+ messages in thread
From: Andreas Schwab @ 2010-07-06 18:49 UTC (permalink / raw)
To: Greg Price; +Cc: Junio C Hamano, git
Greg Price <price@ksplice.com> writes:
> In a large repository which uses directories to organize many refs,
> "git pack-refs --all --prune" does not improve performance so much
> as it should, unless we remove all the now-empty directories as well.
What happens if a parallel running git wants to update a ref in one of
the now-empty directories? Can it get a spurious error after it has
called safe_create_leading_directories?
Andreas.
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] pack-refs: remove newly empty directories
2010-07-06 18:49 ` Andreas Schwab
@ 2010-07-06 19:13 ` Greg Price
0 siblings, 0 replies; 8+ messages in thread
From: Greg Price @ 2010-07-06 19:13 UTC (permalink / raw)
To: Andreas Schwab; +Cc: Junio C Hamano, git
On Tue, Jul 6, 2010 at 2:49 PM, Andreas Schwab <schwab@linux-m68k.org> wrote:
> What happens if a parallel running git wants to update a ref in one of
> the now-empty directories? Can it get a spurious error after it has
> called safe_create_leading_directories?
Good question! I asked the same question. =) It can get the same
kind of error that happens if two parallel running git processes try
to update the same ref.
Anything that tries to update a ref will call lock_ref_sha1_basic(),
which calls safe_create_leading_directories() to make the directory
and then hold_lock_file_for_update() to open(O_CREAT|O_EXCL) a lock
file inside the directory. So if the prune happens after the open(),
it will simply not remove the directory and all is well. If it
happens between the safe_create_leading_directories() and the open(),
then the open() will fail. This is also what happens if the lock has
been taken by another git process updating the same ref. Currently
lock_ref_sha1_basic() chooses to have the process die in this case and
print an error message.
Similarly the prune could remove a directory just before
safe_create_leading_directories() creates one inside it. Then
safe_create_leading_directories() will fail and loc_ref_sha1_basic()
will print an error message and return failure. In both cases, since
we are exposed to a similar failure if a parallel process is updating
the same ref (a common operation) rather than packing refs (an
uncommon operation), I don't think this is a problem.
Greg
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2010-07-06 23:29 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-07-05 22:27 [PATCH] pack-refs: remove newly empty directories Greg Price
2010-07-06 3:02 ` Junio C Hamano
2010-07-06 3:25 ` Greg Price
2010-07-06 23:29 ` [PATCH v2] " Greg Price
2010-07-06 6:10 ` [PATCH] " Johannes Sixt
2010-07-06 6:15 ` Junio C Hamano
2010-07-06 18:49 ` Andreas Schwab
2010-07-06 19:13 ` Greg Price
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).