git.vger.kernel.org archive mirror
* notify alternative to auto gc?
@ 2010-06-28 16:10 Karl Stenerud
  2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Karl Stenerud @ 2010-06-28 16:10 UTC (permalink / raw)
  To: git

Hi,

As I did a git pull on my project today, git went into some kind of auto gc mode:

Auto packing the repository for optimum performance. You may also
run "git gc" manually. See "git help gc" for more information.
Counting objects: 4531, done.

This is, of course, quite an annoying feature since it could hit at any (inconvenient) time.

The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?
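
For reference, a sketch of the two settings named above, to be run from inside the repository in question (this assumes a per-repository setting is wanted rather than a --global one):

```shell
# Disable the loose-object trigger for automatic gc.
git config gc.auto 0
# Disable the "too many packs" trigger as well.
git config gc.autopacklimit 0
```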


* Re: notify alternative to auto gc?
  2010-06-28 16:10 notify alternative to auto gc? Karl Stenerud
@ 2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
  2010-06-28 16:52   ` Karl Stenerud
  2010-06-30  2:27   ` Sam Vilain
  2010-06-28 16:27 ` Matthieu Moy
  2010-06-28 16:29 ` Chris Packham
  2 siblings, 2 replies; 21+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2010-06-28 16:26 UTC (permalink / raw)
  To: Karl Stenerud; +Cc: git

On Mon, Jun 28, 2010 at 16:10, Karl Stenerud <kstenerud@gmail.com> wrote:
> The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?

Anything that tells you whether you need to gc would incur much of the
speed penalty that running gc itself does.


* Re: notify alternative to auto gc?
  2010-06-28 16:10 notify alternative to auto gc? Karl Stenerud
  2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
@ 2010-06-28 16:27 ` Matthieu Moy
  2010-06-28 16:29 ` Chris Packham
  2 siblings, 0 replies; 21+ messages in thread
From: Matthieu Moy @ 2010-06-28 16:27 UTC (permalink / raw)
  To: Karl Stenerud; +Cc: git

Karl Stenerud <kstenerud@gmail.com> writes:

> The git help tells me I can disable it by setting gc.auto to 0,
> while the mailing list archive tells me I also have to set
> gc.autopacklimit to 0.  This is fine, but if I do that, I won't know
> when the repo is in need of cleanup.  Is there any option I can set
> to instruct it to simply TELL me when it's in need of gc?

An alternative is to actually run "git gc" periodically, to make sure
that git gc --auto is never needed.

I do that from a cron job for example.
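
A rough sketch of such a job, assuming a list of repository paths is passed in. The function name gc_repos, the GIT override variable and the crontab schedule in the comment are illustrative only, not anything git ships:

```shell
#!/bin/sh
# Hypothetical helper, e.g. sourced by a script that cron runs weekly:
#   0 3 * * 0  /path/to/that-script
# GIT can be overridden, which is handy for a dry run with GIT=echo.

gc_repos () {
	for repo in "$@"
	do
		# Skip anything that does not contain a .git entry.
		[ -e "$repo/.git" ] || continue
		(cd "$repo" && ${GIT:-git} gc --quiet) ||
			echo "git gc failed in $repo" >&2
	done
}
```

A call such as `gc_repos "$HOME/src/project-a" "$HOME/src/project-b"` (hypothetical paths) would then repack each repository in turn.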

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: notify alternative to auto gc?
  2010-06-28 16:10 notify alternative to auto gc? Karl Stenerud
  2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
  2010-06-28 16:27 ` Matthieu Moy
@ 2010-06-28 16:29 ` Chris Packham
  2010-06-28 16:56   ` Karl Stenerud
                     ` (2 more replies)
  2 siblings, 3 replies; 21+ messages in thread
From: Chris Packham @ 2010-06-28 16:29 UTC (permalink / raw)
  To: Karl Stenerud; +Cc: git

On Mon, Jun 28, 2010 at 9:10 AM, Karl Stenerud <kstenerud@gmail.com> wrote:
> Hi,
>
> As I did a git pull on my project today, git went into some kind of auto gc mode:
>
> Auto packing the repository for optimum performance. You may also
> run "git gc" manually. See "git help gc" for more information.
> Counting objects: 4531, done.
>
> This is, of course, quite an annoying feature since it could hit at any (inconvenient) time.
>
> The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?
>

I don't think there is an existing configuration for this but I think
you can achieve what you want with the "pre-auto-gc" hook. From the
githooks(5) man page

  pre-auto-gc
     This hook is invoked by git gc --auto. It takes no parameter, and
     exiting with non-zero status from this script causes the git gc --auto
     to abort.

So a hook like

  #! /bin/sh
  echo "repository needs git gc"
  exit 1

Should cause the auto gc to be skipped.
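
To try this out, the script has to live at hooks/pre-auto-gc inside the git directory and be executable. A minimal sketch, assuming a hypothetical helper named install_pre_auto_gc_hook that takes the git directory as its argument; sending the message to stderr is a choice made for this sketch, not something the hook interface requires:

```shell
# Hypothetical helper: write a notify-only pre-auto-gc hook into <git-dir>/hooks.
install_pre_auto_gc_hook () {
	hook="$1/hooks/pre-auto-gc"
	mkdir -p "$1/hooks" &&
	cat >"$hook" <<'EOF' &&
#!/bin/sh
# Tell the user, then refuse so that "git gc --auto" is skipped.
echo "repository needs git gc" >&2
exit 1
EOF
	chmod +x "$hook"
}
```

From the top of a working tree this would be called as `install_pre_auto_gc_hook .git`.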


* Re: notify alternative to auto gc?
  2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
@ 2010-06-28 16:52   ` Karl Stenerud
  2010-06-29  6:46     ` Matthieu Moy
  2010-06-30  2:27   ` Sam Vilain
  1 sibling, 1 reply; 21+ messages in thread
From: Karl Stenerud @ 2010-06-28 16:52 UTC (permalink / raw)
  To: git


On 2010-06-28, at 12:26 PM, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Jun 28, 2010 at 16:10, Karl Stenerud <kstenerud@gmail.com> wrote:
>> The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?
> 
> Anything that tells you whether you need to gc would incur much of the
> speed penalty that running gc itself does.

Actually, no it wouldn't.  If checking did incur much of the speed penalty, then git, in general, would already be horribly slow all the time because it automatically does a gc-check with most commonly used commands, according to the documentation.

In the case which I observed, the time from typing "git pull" until the auto-gc notification was a few seconds.  The time from the notification until gc completion, however, was 3-4 minutes.


* Re: notify alternative to auto gc?
  2010-06-28 16:29 ` Chris Packham
@ 2010-06-28 16:56   ` Karl Stenerud
  2010-06-28 17:07   ` [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Chris Packham
  2010-06-28 18:58   ` notify alternative to auto gc? Eric Raible
  2 siblings, 0 replies; 21+ messages in thread
From: Karl Stenerud @ 2010-06-28 16:56 UTC (permalink / raw)
  To: git


On 2010-06-28, at 12:29 PM, Chris Packham wrote:

> On Mon, Jun 28, 2010 at 9:10 AM, Karl Stenerud <kstenerud@gmail.com> wrote:
>> Hi,
>> 
>> As I did a git pull on my project today, git went into some kind of auto gc mode:
>> 
>> Auto packing the repository for optimum performance. You may also
>> run "git gc" manually. See "git help gc" for more information.
>> Counting objects: 4531, done.
>> 
>> This is, of course, quite an annoying feature since it could hit at any (inconvenient) time.
>> 
>> The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?
>> 
> 
> I don't think there is an existing configuration for this but I think
> you can achieve what you want with the "pre-auto-gc" hook. From the
> githooks(5) man page
> 
>  pre-auto-gc
>     This hook is invoked by git gc --auto. It takes no parameter, and
>     exiting with non-zero status from this script causes the git gc --auto
>     to abort.
> 
> So a hook like
> 
>  #! /bin/sh
>  echo "repository needs git gc"
>  exit 1
> 
> Should cause the auto gc to be skipped.


Cool thanks!  I'll give that a shot.


* [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-28 16:29 ` Chris Packham
  2010-06-28 16:56   ` Karl Stenerud
@ 2010-06-28 17:07   ` Chris Packham
  2010-06-28 18:43     ` Marc Branchaud
                       ` (2 more replies)
  2010-06-28 18:58   ` notify alternative to auto gc? Eric Raible
  2 siblings, 3 replies; 21+ messages in thread
From: Chris Packham @ 2010-06-28 17:07 UTC (permalink / raw)
  To: spearce; +Cc: git, Chris Packham

This advertises the existence of the 'pre-auto-gc' hook and adds a cross
reference to where the hook is documented.

Signed-off-by: Chris Packham <judge.packham@gmail.com>
---
I had to go fishing in the code to find out about the pre-auto-gc hook. From
reading the git-gc man page it wasn't obvious to me that there would be a hook
for 'git gc --auto'. The relevant config variables are mentioned so it seems
logical to mention the hooks also. The only precedent I found for this was in
the git-commit man page which has a section listing the hooks that are
available.

 Documentation/git-gc.txt |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index a9e0882..a514c52 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -137,6 +137,11 @@ If you are expecting some objects to be collected and they aren't, check
 all of those locations and decide whether it makes sense in your case to
 remove those references.
 
+HOOKS
+-----
+This command can run `pre-auto-gc` hook.  See linkgit:githooks[5] for more
+information.
+
 SEE ALSO
 --------
 linkgit:git-prune[1]
-- 
1.7.1


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-28 17:07   ` [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Chris Packham
@ 2010-06-28 18:43     ` Marc Branchaud
  2010-06-29 15:43       ` Junio C Hamano
  2010-06-29 16:26     ` Junio C Hamano
  2010-09-21 18:14     ` [PATCH/RFC] " David Brown
  2 siblings, 1 reply; 21+ messages in thread
From: Marc Branchaud @ 2010-06-28 18:43 UTC (permalink / raw)
  To: Chris Packham; +Cc: spearce, git

How about also adding a templates/hooks--pre-auto-gc.sample file?

		M.


On 10-06-28 01:07 PM, Chris Packham wrote:
> This advertises the existence of the 'pre-auto-gc' hook and adds a cross
> reference to where the hook is documented.
> 
> Signed-off-by: Chris Packham <judge.packham@gmail.com>
> ---
> I had to go fishing in the code to find out about the pre-auto-gc hook. From
> reading the git-gc man page it wasn't obvious to me that there would be a hook
> for 'git gc --auto'. The relevant config variables are mentioned so it seems
> logical to mention the hooks also. The only precedent I found for this was in
> the git-commit man page which has a section listing the hooks that are
> available.
> 
>  Documentation/git-gc.txt |    5 +++++
>  1 files changed, 5 insertions(+), 0 deletions(-)
> 
> diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
> index a9e0882..a514c52 100644
> --- a/Documentation/git-gc.txt
> +++ b/Documentation/git-gc.txt
> @@ -137,6 +137,11 @@ If you are expecting some objects to be collected and they aren't, check
>  all of those locations and decide whether it makes sense in your case to
>  remove those references.
>  
> +HOOKS
> +-----
> +This command can run `pre-auto-gc` hook.  See linkgit:githooks[5] for more
> +information.
> +
>  SEE ALSO
>  --------
>  linkgit:git-prune[1]


* Re: notify alternative to auto gc?
  2010-06-28 16:29 ` Chris Packham
  2010-06-28 16:56   ` Karl Stenerud
  2010-06-28 17:07   ` [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Chris Packham
@ 2010-06-28 18:58   ` Eric Raible
  2010-06-28 19:02     ` Jacob Helwig
  2 siblings, 1 reply; 21+ messages in thread
From: Eric Raible @ 2010-06-28 18:58 UTC (permalink / raw)
  To: git

Chris Packham <judge.packham <at> gmail.com> writes:

> So a hook like
> 
>   #! /bin/sh
>   echo "repository needs git gc"
>   exit 1
> 
> Should cause the auto gc to be skipped.

Except wouldn't you also need a mechanism
to allow an explicit gc to actually proceed?


* Re: notify alternative to auto gc?
  2010-06-28 18:58   ` notify alternative to auto gc? Eric Raible
@ 2010-06-28 19:02     ` Jacob Helwig
  0 siblings, 0 replies; 21+ messages in thread
From: Jacob Helwig @ 2010-06-28 19:02 UTC (permalink / raw)
  To: Eric Raible; +Cc: git

On Mon, Jun 28, 2010 at 11:58, Eric Raible <raible@gmail.com> wrote:
> Chris Packham <judge.packham <at> gmail.com> writes:
>
>> So a hook like
>>
>>   #! /bin/sh
>>   echo "repository needs git gc"
>>   exit 1
>>
>> Should cause the auto gc to be skipped.
>
> Except wouldn't you also need a mechanism
> to allow an explicit gc to actually proceed?
>

Only if you're explicitly calling "git gc --auto" instead of "git gc".
 That hook should only run for "git gc --auto".


* Re: notify alternative to auto gc?
  2010-06-28 16:52   ` Karl Stenerud
@ 2010-06-29  6:46     ` Matthieu Moy
  0 siblings, 0 replies; 21+ messages in thread
From: Matthieu Moy @ 2010-06-29  6:46 UTC (permalink / raw)
  To: Karl Stenerud; +Cc: git

Karl Stenerud <kstenerud@gmail.com> writes:

> In the case which I observed, the time from typing "git pull" until
> the auto-gc notification was a few seconds.  The time from the
> notification until gc completion, however, was 3-4 minutes.

The missing bit here is how often this happens.

Once you have run an actual "git gc", your repository is stored in a single
pack file, and only new files are stored unpacked or in separate
packs. Subsequent "git gc --auto" will just pack unpacked objects
together by groups of approximately "gc.auto" (i.e. 6700), which is
quick, until you reach "gc.autopacklimit" (i.e. 50). You'll trigger an
expensive repack only after that. So, roughly, you'll get an expensive
repack after creating 300,000 object files (unless you ran "git gc" in
the meantime). Unless you do very nasty things with your repo, that
really means "not often", and especially "not often enough that you
should care" in 99.9% of cases.
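
The arithmetic behind that estimate, as a quick sketch using the default values quoted above (about 6700 loose objects per automatic pack, 50 packs before a full repack):

```shell
gc_auto=6700       # default gc.auto: loose objects before an automatic pack
pack_limit=50      # default gc.autopacklimit: packs before a full repack
echo $((gc_auto * pack_limit))   # prints 335000
```

That is in the same ballpark as the 300,000 figure.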

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-28 18:43     ` Marc Branchaud
@ 2010-06-29 15:43       ` Junio C Hamano
  2010-06-29 16:00         ` Chris Packham
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2010-06-29 15:43 UTC (permalink / raw)
  To: Marc Branchaud; +Cc: Chris Packham, spearce, git

Marc Branchaud <marcnarc@xiplink.com> writes:

> How about also adding a templates/hooks--pre-auto-gc.sample file?

Please don't, as samples are propagated to all newly created repositories.


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-29 15:43       ` Junio C Hamano
@ 2010-06-29 16:00         ` Chris Packham
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Packham @ 2010-06-29 16:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Marc Branchaud, spearce, git

On Tue, Jun 29, 2010 at 8:43 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Marc Branchaud <marcnarc@xiplink.com> writes:
>
>> How about also adding a templates/hooks--pre-auto-gc.sample file?
>
> Please don't, as samples are propagated to all newly created repositories.
>

OK. I won't add a sample hook.


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-28 17:07   ` [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Chris Packham
  2010-06-28 18:43     ` Marc Branchaud
@ 2010-06-29 16:26     ` Junio C Hamano
  2010-06-29 18:16       ` Chris Packham
  2010-09-21 18:14     ` [PATCH/RFC] " David Brown
  2 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2010-06-29 16:26 UTC (permalink / raw)
  To: Chris Packham; +Cc: spearce, git

Chris Packham <judge.packham@gmail.com> writes:

> +HOOKS
> +-----
> +This command can run `pre-auto-gc` hook.  See linkgit:githooks[5] for more
> +information.

Hmm.  "git gc --auto" does, but "git gc" doesn't, and saying "can run"
only adds to the sense of incompleteness of the description here without
giving useful information to the reader (iow, the user will have to check
the referred-to page anyway).  We would need to either remove the first
sentence (leaving only "See ... for information") or clarify the first
sentence a bit better, I think.


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-29 16:26     ` Junio C Hamano
@ 2010-06-29 18:16       ` Chris Packham
  2010-06-30 20:41         ` [PATCHv2] " Chris Packham
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Packham @ 2010-06-29 18:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: spearce, git

On 29/06/10 09:26, Junio C Hamano wrote:
> Chris Packham <judge.packham@gmail.com> writes:
>   
>> +HOOKS
>> +-----
>> +This command can run `pre-auto-gc` hook.  See linkgit:githooks[5] for more
>> +information.
>>     
> Hmm.  "git gc --auto" does, but "git gc" doesn't, and saying "can run"
> only adds to the sense of incompleteness of the description here without
> giving useful information to the reader (iow, the user will have to check
> the referred-to page anyway).  We would need to either remove the first
> sentence (leaving only "See ... for information") or clarify the first
> sentence a bit better, I think.
>   
How about

The `git gc --auto` command will run the `pre-auto-gc` hook, if
enabled.  See linkgit:githooks[5] for more information.


* Re: notify alternative to auto gc?
  2010-06-28 16:26 ` Ævar Arnfjörð Bjarmason
  2010-06-28 16:52   ` Karl Stenerud
@ 2010-06-30  2:27   ` Sam Vilain
  1 sibling, 0 replies; 21+ messages in thread
From: Sam Vilain @ 2010-06-30  2:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason; +Cc: Karl Stenerud, git

On Mon, 2010-06-28 at 16:26 +0000, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Jun 28, 2010 at 16:10, Karl Stenerud <kstenerud@gmail.com> wrote:
> > The git help tells me I can disable it by setting gc.auto to 0, while the mailing list archive tells me I also have to set gc.autopacklimit to 0.  This is fine, but if I do that, I won't know when the repo is in need of cleanup.  Is there any option I can set to instruct it to simply TELL me when it's in need of gc?
> 
> Anything that tells you whether you need to gc would incur much of the
> speed penalty that running gc itself does.

See builtin/gc.c:too_many_loose_objects

Checking that gc is required involves opening one directory (objects/17
IIRC), reading all of the entries in it and counting them.  It really
doesn't hurt.
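
A shell sketch of the same idea, for anyone who wants a notify-only check without running gc. This is not git's actual code: the function names are made up for illustration, the estimate is simply the entry count of objects/17 scaled by 256 (the number of fan-out directories), and 6700 is used as the default threshold to match gc.auto:

```shell
# Estimate the loose object count from a single fan-out directory.
# $1 is the git directory (for example .git).
estimate_loose_objects () {
	dir="$1/objects/17"
	if [ -d "$dir" ]
	then
		count=$(ls "$dir" | wc -l)
	else
		count=0
	fi
	echo $(($count * 256))
}

# Succeed when the estimate exceeds the threshold ($2, default 6700).
needs_gc () {
	[ "$(estimate_loose_objects "$1")" -gt "${2:-6700}" ]
}
```

Something like `needs_gc .git && echo "repository needs git gc"` could then be run by hand or from a shell prompt hook.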

Sam


* [PATCHv2] Documentation/git-gc.txt: add reference to githooks
  2010-06-29 18:16       ` Chris Packham
@ 2010-06-30 20:41         ` Chris Packham
  0 siblings, 0 replies; 21+ messages in thread
From: Chris Packham @ 2010-06-30 20:41 UTC (permalink / raw)
  To: gitster; +Cc: git, Chris Packham

This advertises the existence of the 'pre-auto-gc' hook and adds a cross
reference to where the hook is documented.

Signed-off-by: Chris Packham <judge.packham@gmail.com>
---

I removed the ',if enabled' part from what I suggested on the list. It never
did read quite right to me and I think it just adds confusion to what should be
a simple sentence.

 Documentation/git-gc.txt |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index a9e0882..315f07e 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -137,6 +137,13 @@ If you are expecting some objects to be collected and they aren't, check
 all of those locations and decide whether it makes sense in your case to
 remove those references.
 
+HOOKS
+-----
+
+The 'git gc --auto' command will run the 'pre-auto-gc' hook.  See
+linkgit:githooks[5] for more information.
+
+
 SEE ALSO
 --------
 linkgit:git-prune[1]
-- 
1.7.1


* Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-06-28 17:07   ` [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Chris Packham
  2010-06-28 18:43     ` Marc Branchaud
  2010-06-29 16:26     ` Junio C Hamano
@ 2010-09-21 18:14     ` David Brown
  2010-09-21 20:52       ` Allow git remote add but not git clone ( was Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks) Chris Packham
  2010-09-21 23:18       ` Re* [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Junio C Hamano
  2 siblings, 2 replies; 21+ messages in thread
From: David Brown @ 2010-09-21 18:14 UTC (permalink / raw)
  To: git

Suppose I want to publish some changes to a tree.  I have a server
available where I can run a git daemon, but for one reason or another
I want to force people to use another git repo as a reference.
The reason could be one of bandwidth, or someone who isn't comfortable
making all of the other source available.  Ideally, someone who
already has the other git repo cloned, and just adds mine as a remote
wouldn't notice the difference.

Is there a way to do this?  I've tried various ways of using
alternates to keep the blobs out of the repository I want to export,
but the daemon just follows the alternates.  If I remove the
alternates, I then seem to have a broken repository.  Most things I
try, at least carry objects for all of the files in the HEAD tree,
which most of the time is a large portion of the data.

If there isn't a way of doing this currently, is this something that
others would find useful?

Thanks,
David

-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.


* Re: Allow git remote add but not git clone ( was Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks)
  2010-09-21 18:14     ` [PATCH/RFC] " David Brown
@ 2010-09-21 20:52       ` Chris Packham
  2010-09-21 21:01         ` David Brown
  2010-09-21 23:18       ` Re* [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Chris Packham @ 2010-09-21 20:52 UTC (permalink / raw)
  To: David Brown; +Cc: git

On 21/09/10 11:14, David Brown wrote:
> Subject: Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks

Wow, odd thread to pick up. Hopefully someone who knows more than me will
notice the change of subject and provide a better answer.

> Suppose I want to publish some changes to a tree.  I have a server
> available where I can run a git daemon, but for one reason or another
> I want to force people to use the another git repo as a reference.
> The reason could be one of bandwidth, or someone who isn't comfortable
> making all of the other source available.  Ideally, someone who
> already has the other git repo cloned, and just adds mine as a remote
> wouldn't notice the difference.

Sounds like a reasonable motivation.

> Is there a way to do this?

As far as I know, no. The mechanisms behind git clone and git remote
add/git fetch are fairly generic, so I doubt there is a way for the git
daemon to know which was run by the user at the other end. Maybe there
are other possible solutions outside of git to put a cap on the amount of
data sent. It doesn't look like there are any hooks on the upload-pack side
of git daemon.

> If there isn't a way of doing this currently, is this something that
> others would find useful?

I personally wouldn't but I can see why some people might want this.


* Re: Allow git remote add but not git clone ( was Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks)
  2010-09-21 20:52       ` Allow git remote add but not git clone ( was Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks) Chris Packham
@ 2010-09-21 21:01         ` David Brown
  0 siblings, 0 replies; 21+ messages in thread
From: David Brown @ 2010-09-21 21:01 UTC (permalink / raw)
  To: Chris Packham; +Cc: git

On Tue, Sep 21, 2010 at 01:52:57PM -0700, Chris Packham wrote:
> On 21/09/10 11:14, David Brown wrote:
> > Subject: Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
> 
> Wow odd thread to pickup. Hopefully someone that knows more than me will
> notice the change of subject an provide a better answer.

Yeah, I sent another message with an entirely new subject and new
thread so that hopefully people will see it.  This one can just die.

David


* Re* [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks
  2010-09-21 18:14     ` [PATCH/RFC] " David Brown
  2010-09-21 20:52       ` Allow git remote add but not git clone ( was Re: [PATCH/RFC] Documentation/git-gc.txt: add reference to githooks) Chris Packham
@ 2010-09-21 23:18       ` Junio C Hamano
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2010-09-21 23:18 UTC (permalink / raw)
  To: David Brown; +Cc: git

David Brown <davidb@codeaurora.org> writes:

> Suppose I want to publish some changes to a tree.  I have a server
> available where I can run a git daemon, but for one reason or another
> I want to force people to use the another git repo as a reference.
> The reason could be one of bandwidth, or someone who isn't comfortable
> making all of the other source available.  Ideally, someone who
> already has the other git repo cloned, and just adds mine as a remote
> wouldn't notice the difference.
>
> Is there a way to do this?  I've tried various ways of using
> alternates to keep the blobs out of the repository I want to export,
> but the daemon just follows the alternates.  If I remove the
> alternates, I then seem to have a broken repository.  Most things I
> try, at least carry objects for all of the files in the HEAD tree,
> which most of the time is a large portion of the data.

I've seen people ask for something like this a few times in the past.
Once might have been during GitTogether'08 by you.

I don't think there is a way to do that, though.

Perhaps something like this totally untested patch?  The code may be
utterly wrong but you should be able to get the idea from the test
scripts.  Add configuration to the repository for you to declare which
objects are prerequisites for pulling from you, with a message to suggest
where to grab them from.

I am a bit reluctant to suggest that upload-pack start reading from the
repository configuration, though.

 t/t5532-upload-limit.sh |  107 +++++++++++++++++++++++++++++++++++++++++
 upload-pack.c           |  120 +++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 227 insertions(+), 0 deletions(-)

diff --git a/t/t5532-upload-limit.sh b/t/t5532-upload-limit.sh
new file mode 100755
index 0000000..070594e
--- /dev/null
+++ b/t/t5532-upload-limit.sh
@@ -0,0 +1,107 @@
+#!/bin/sh
+#
+# Copyright (c) 2010 Google Inc.
+#
+
+test_description='git upload-pack honoring clonelimit configuration'
+
+. ./test-lib.sh
+
+grow () {
+	for a
+	do
+		echo "$a" >file &&
+		git add file &&
+		git commit -m "$a" || exit
+	done
+}
+
+setup_clone () {
+	git clone "file://$(pwd)/src" "$1" &&
+	(
+		cd "$1" &&
+		git rev-parse --verify "$2" >actual
+	) &&
+	test_cmp src/expect "$1/actual"
+}
+
+test_expect_success 'setup' '
+	mkdir src &&
+	(
+		cd src &&
+		git init &&
+		grow a b c &&
+		git rev-parse --verify master >expect
+	) &&
+	setup_clone dst-0 master &&
+	setup_clone dst-1 master &&
+	(
+		cd src &&
+		grow d e &&
+		git tag -a -m mark mark &&
+		git rev-parse --verify mark >expect
+	) &&
+	setup_clone dst-2 mark &&
+	setup_clone dst-3 mark &&
+	(
+		cd src &&
+		grow f g h &&
+		git tag -a -m tip tip &&
+		git rev-parse --verify tip >expect
+	)
+'
+
+test_expect_success 'baseline - no limitation in fetch' '
+	(
+		cd dst-0 &&
+		git fetch &&
+		git rev-parse --verify tip >actual
+	) &&
+	test_cmp src/expect dst-0/actual
+'
+
+test_expect_success 'clone refused when limitation is set' '
+	(
+		cd src &&
+		git config clonelimit.mark.message "go away"
+	) &&
+	test_must_fail git clone "file://$(pwd)/src" dst-4 &&
+	! test -d dst-4
+'
+
+test_expect_success 'fetch refused when limitation is unmet' '
+	(
+		cd dst-1 &&
+		test_must_fail git fetch &&
+		test_must_fail git rev-parse --verify tip
+	)
+'
+
+test_expect_success 'fetch works when limitation is met' '
+	(
+		cd src &&
+		git config clonelimit.mark.message "go away"
+	) &&
+	(
+		cd dst-2 &&
+		git fetch &&
+		git rev-parse --verify tip >actual
+	) &&
+	test_cmp src/expect dst-2/actual
+'
+
+test_expect_success 'missing tag is not a limitation violation' '
+	(
+		cd src &&
+		git config clonelimit.mark.message "go away"
+	) &&
+	(
+		cd dst-3 &&
+		git tag -d mark &&
+		git fetch &&
+		git rev-parse --verify tip >actual
+	) &&
+	test_cmp src/expect dst-3/actual
+'
+
+test_done
diff --git a/upload-pack.c b/upload-pack.c
index 92f9530..9d4a367 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -655,6 +655,124 @@ static int mark_our_ref(const char *refname, const unsigned char *sha1, int flag
 	return 0;
 }
 
+/*
+ * Some repositories may not want to allow a full clone, and
+ * want users to first fetch from other repositories with
+ * better connection.  By having a configuration variable like
+ * this:
+ *
+ * [clonelimit "v0.99"]
+ *   message = go to git://git.kernel.org/pub/scm/git/...
+ *
+ * a fetch request by a fetcher that does not have the named commit
+ * is denied with the given message.
+ */
+struct clone_limit {
+	struct clone_limit *next;
+	struct commit *commit;
+	char *ref; /* points at the tail of msg to store symbolic ref */
+	char msg[FLEX_ARRAY]; /* message and more */
+};
+
+static int check_clone_limit(const char *var, const char *value, void *cb_)
+{
+	int msglen, reflen;
+	const char *lastdot;
+	struct clone_limit **clp = cb_;
+	struct clone_limit *limit;
+	unsigned char sha1[20];
+
+	if (prefixcmp(var, "clonelimit."))
+		return 0;
+
+	if (debug_fd) {
+		write_str_in_full(debug_fd, "config: ");
+		write_str_in_full(debug_fd, var);
+		write_str_in_full(debug_fd, " = <");
+		write_str_in_full(debug_fd, value);
+		write_str_in_full(debug_fd, ">\n");
+	}
+
+	var += 11; /* skip "clonelimit." */
+	lastdot = strrchr(var, '.');
+	reflen = lastdot - var;
+	if (reflen < 0 || memcmp(var + reflen, ".message", 9))
+		return 0; /* not our variable */
+
+	/*
+	 * The remainder will ignore a malformed entry; we might
+	 * want to abort the whole operation.  I dunno.
+	 */
+	if (!value)
+		return 0;
+	msglen = strlen(value) + 1;
+	limit = xmalloc(sizeof(*limit) + msglen + reflen + 1);
+	memcpy(limit->msg, value, msglen);
+	limit->ref = limit->msg + msglen;
+	memcpy(limit->ref, var, reflen);
+	limit->ref[reflen] = '\0';
+	if (get_sha1(limit->ref, sha1)) {
+		free(limit);
+		return 0;
+	}
+	limit->commit = lookup_commit_reference_gently(sha1, 0);
+	if (!limit->commit) {
+		free(limit);
+		return 0;
+	}
+	limit->next = *clp;
+	*clp = limit;
+	return 0;
+}
+
+static int limit_served_history(void)
+{
+	int has_missing, i, n;
+	struct clone_limit *clone_limit = NULL;
+	struct clone_limit *cl;
+	struct commit **twos = NULL;
+	struct commit_list *l;
+
+	git_config(check_clone_limit, &clone_limit);
+	if (!clone_limit)
+		return 0;
+
+	twos = xcalloc(have_obj.nr, sizeof(*twos));
+	for (i = n = 0; i < have_obj.nr; i++) {
+		struct object *o = have_obj.objects[i].item;
+		struct commit *have = lookup_commit_reference_gently(o->sha1, 0);
+		if (!have)
+			continue;
+		twos[n++] = have;
+	}
+
+	has_missing = 0;
+	for (cl = clone_limit; cl; cl = cl->next) {
+		int seen = 0;
+		l = get_merge_bases_many(cl->commit, n, twos, 1);
+		while (l) {
+			struct commit_list *next = l->next;
+			if (!hashcmp(l->item->object.sha1, cl->commit->object.sha1))
+				seen = 1;
+			free(l);
+			l = next;
+		}
+		if (!seen) {
+			has_missing = 1;
+			error("you do not have object '%s'; %s",
+			      cl->ref, cl->msg);
+		}
+	}
+
+	while (clone_limit) {
+		struct clone_limit *next = clone_limit->next;
+		free(clone_limit);
+		clone_limit = next;
+	}
+	free(twos);
+	return has_missing;
+}
+
 static void upload_pack(void)
 {
 	if (advertise_refs || !stateless_rpc) {
@@ -672,6 +790,8 @@ static void upload_pack(void)
 	receive_needs();
 	if (want_obj.nr) {
 		get_common_commits();
+		if (limit_served_history())
+			die("git upload-pack: missing prerequisites");
 		create_pack_file();
 	}
 }

