Git development

Git development
 help / color / mirror / Atom feed

* Re: Trying to use xfuncname without success.
From: Jack Adrian Zappa @ 2017-02-08 20:42 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: René Scharfe, git-mailing-list
In-Reply-To: <CAJZjrdXjnDMi8gMY6f_UDbMZrZJ=AoPM+g01hqPCO2pB9csoOw@mail.gmail.com>

Thanks Samuel,

So, the question is, what is causing this problem on my system?

Anyone have an idea to help diagnose this problem?

On Wed, Feb 8, 2017 at 3:24 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
> On Windows 7, it works for me in both CMD and Git Bash:
>
> $ git --version
> git version 2.11.0.windows.3
>
> $ git diff HEAD^ --word-diff
> diff --git a/test.natvis b/test.natvis
> index 93396ad..1233b8c 100644
> --- a/test.natvis
> +++ b/test.natvis
> @@ -18,6 +18,7 @@ test
>
>
>       <!-- Non-blank line -->
>       {+<Item Name="added var">added_var</Item>+}
>
>
>       <Item Name="var2">var2</Item>
>
> On Wed, Feb 8, 2017 at 12:37 PM, René Scharfe <l.s.r@web.de> wrote:
>> Am 08.02.2017 um 18:11 schrieb Jack Adrian Zappa:
>>> Thanks Rene, but you seem to have missed the point.  NOTHING is
>>> working.  No matter what I put there, it doesn't seem to get matched.
>>
>> I'm not so sure about that.  With your example I get this diff without
>> setting diff.natvis.xfuncname:
>>
>> diff --git a/a.natvis b/a.natvis
>> index 7f9bdf5..bc3c090 100644
>> --- a/a.natvis
>> +++ b/a.natvis
>> @@ -19,7 +19,7 @@ xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
>>
>>
>>        <!-- Non-blank line -->
>> -      <Item Name="added var">added_var</Item>
>> +      <Item Name="added var">added_vars</Item>
>>
>>
>>        <Item Name="var2">var2</Item>
>>
>> Note the XML namespace in the hunk header.  It's put there by the
>> default rule because "xmlns" starts at the beginning of the line.  Your
>> diff has nothing there, which means the default rule is not used, i.e.
>> your user-defined rule is in effect.
>>
>> Come to think of it, this line break in the middle of the AutoVisualizer
>> tab might have been added by your email client unintentionally, so that
>> we use different test files, which then of course results in different
>> diffs.  Is that the case?
>>
>> Anyway, if I run the following two commands:
>>
>> $ git config diff.natvis.xfuncname "^[\t ]*<Type[\t ]+Name=\"([^\"]+)\".*$"
>> $ echo '*.natvis diff=natvis' >.gitattributes
>>
>> ... then I get this, both on Linux (git version 2.11.1) and on Windows
>> (git version 2.11.1.windows.1):
>>
>> diff --git a/a.natvis b/a.natvis
>> index 7f9bdf5..bc3c090 100644
>> --- a/a.natvis
>> +++ b/a.natvis
>> @@ -19,7 +19,7 @@ test
>>
>>
>>        <!-- Non-blank line -->
>> -      <Item Name="added var">added_var</Item>
>> +      <Item Name="added var">added_vars</Item>
>>
>>
>>        <Item Name="var2">var2</Item>
>>
>>> Just to be sure, I tested your regex and again it didn't work.
>>
>> At this point I'm out of ideas, sorry. :(  The only way I was able to
>> break it was due to mistyping the extension as "netvis" several times
>> for some reason.
>>
>> René

^ permalink raw reply

* Re: Trying to use xfuncname without success.
From: Samuel Lijin @ 2017-02-08 20:46 UTC (permalink / raw)
  To: Jack Adrian Zappa; +Cc: René Scharfe, git-mailing-list
In-Reply-To: <CAKepmagwMeky4jPZ-YFgPsZSsyOZZQ-kJSWV8QGg4cUNu-ZS8Q@mail.gmail.com>

I just put this togther: https://github.com/sxlijin/xfuncname-test

Try cloning and then for any of config1 thru 3,

$ cp configX .git/config
$ git diff HEAD^ -- test.natvis

On Wed, Feb 8, 2017 at 2:42 PM, Jack Adrian Zappa <adrianh.bsc@gmail.com> wrote:
> Thanks Samuel,
>
> So, the question is, what is causing this problem on my system?
>
> Anyone have an idea to help diagnose this problem?
>
> On Wed, Feb 8, 2017 at 3:24 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>> On Windows 7, it works for me in both CMD and Git Bash:
>>
>> $ git --version
>> git version 2.11.0.windows.3
>>
>> $ git diff HEAD^ --word-diff
>> diff --git a/test.natvis b/test.natvis
>> index 93396ad..1233b8c 100644
>> --- a/test.natvis
>> +++ b/test.natvis
>> @@ -18,6 +18,7 @@ test
>>
>>
>>       <!-- Non-blank line -->
>>       {+<Item Name="added var">added_var</Item>+}
>>
>>
>>       <Item Name="var2">var2</Item>
>>
>> On Wed, Feb 8, 2017 at 12:37 PM, René Scharfe <l.s.r@web.de> wrote:
>>> Am 08.02.2017 um 18:11 schrieb Jack Adrian Zappa:
>>>> Thanks Rene, but you seem to have missed the point.  NOTHING is
>>>> working.  No matter what I put there, it doesn't seem to get matched.
>>>
>>> I'm not so sure about that.  With your example I get this diff without
>>> setting diff.natvis.xfuncname:
>>>
>>> diff --git a/a.natvis b/a.natvis
>>> index 7f9bdf5..bc3c090 100644
>>> --- a/a.natvis
>>> +++ b/a.natvis
>>> @@ -19,7 +19,7 @@ xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
>>>
>>>
>>>        <!-- Non-blank line -->
>>> -      <Item Name="added var">added_var</Item>
>>> +      <Item Name="added var">added_vars</Item>
>>>
>>>
>>>        <Item Name="var2">var2</Item>
>>>
>>> Note the XML namespace in the hunk header.  It's put there by the
>>> default rule because "xmlns" starts at the beginning of the line.  Your
>>> diff has nothing there, which means the default rule is not used, i.e.
>>> your user-defined rule is in effect.
>>>
>>> Come to think of it, this line break in the middle of the AutoVisualizer
>>> tab might have been added by your email client unintentionally, so that
>>> we use different test files, which then of course results in different
>>> diffs.  Is that the case?
>>>
>>> Anyway, if I run the following two commands:
>>>
>>> $ git config diff.natvis.xfuncname "^[\t ]*<Type[\t ]+Name=\"([^\"]+)\".*$"
>>> $ echo '*.natvis diff=natvis' >.gitattributes
>>>
>>> ... then I get this, both on Linux (git version 2.11.1) and on Windows
>>> (git version 2.11.1.windows.1):
>>>
>>> diff --git a/a.natvis b/a.natvis
>>> index 7f9bdf5..bc3c090 100644
>>> --- a/a.natvis
>>> +++ b/a.natvis
>>> @@ -19,7 +19,7 @@ test
>>>
>>>
>>>        <!-- Non-blank line -->
>>> -      <Item Name="added var">added_var</Item>
>>> +      <Item Name="added var">added_vars</Item>
>>>
>>>
>>>        <Item Name="var2">var2</Item>
>>>
>>>> Just to be sure, I tested your regex and again it didn't work.
>>>
>>> At this point I'm out of ideas, sorry. :(  The only way I was able to
>>> break it was due to mistyping the extension as "netvis" several times
>>> for some reason.
>>>
>>> René

^ permalink raw reply

* Re: [PATCH] push options: fail properly in the stateless case
From: Stefan Beller @ 2017-02-08 20:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git@vger.kernel.org, Jonathan Nieder
In-Reply-To: <xmqqpoiswo0l.fsf@gitster.mtv.corp.google.com>

On Wed, Feb 8, 2017 at 10:11 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Stefan Beller <sbeller@google.com> writes:
>
>> When using non-builtin protocols relying on a transport helper
>> (such as http), push options are not propagated to the helper.
>>
>> Fix this by propagating the push options to the transport helper and
>
> The description up to this point is VERY readable and sensible.  But
> that makes the title sound a bit strange.

Yes, the title was there first and then I massaged the commit message
until it was readable. Originally I thought the issue is with stateless
protocols, but that is wrong. The underlying issue is just the transport
helper communication lacking.

> I read it as if it were
> saying "stateless case can never support push-options so fail if the
> caller attempts to use one", but that does not seem to be what is
> going on.

Indeed the subject is wrong.

>
>> adding a test that push options using http fail properly.
>
> Sounds sensible.  What end-user visible effect does this fix have?
> IOW, what feature do we use "push-option" for?

The Gerrit world started using push options for having a better git
experience, not needing to navigate to the web UI, e.g.:

    # push for review and set me as a reviewer:
    git push --push-option reviewer=sbeller@google.com  ssh://gerrit...

    # same with http, it looks like it worked, but the push option
    # never made it to the server
    git push --push-option reviewer=sbeller@google.com  http://gerrit...

    # This patch will make the second command fail, reporting
    # the http helper not supporting push options.

>
> Ahh, OK, so you need to describe that there are two issues in order
> to be understood by the readers:
>
>  (1) the helper protocol does not propagate push-option
>  (2) the http helper is not prepared to handle push-option
>
> You fix (1), and you take advantage of the fact (2) to ensure that
> (1) is fixed in the new test.
>
> With such an understanding, the title makes (sort of) sense and you
> wouldn't have to be asked "what end-user visible effect/benefit does
> this have?"

Your analysis is correct.

>
>> +'option push-option <c-string>::
>> +     Transmit this push option.
>> +
>
> There is no "c-string" in the current documentation used or
> defined.  The closest thing I found is
>
>     ... that field will be quoted in the manner of a C string ...
>
> in git-status page, but I do not think you send the value for an
> push-option after running quote_c_style(), so I am puzzled.

When implementing push options, we discussed that and according to
Documentation/git-push:

    The given string must not contain a NUL or LF character.

>
> I'd rather see 'option push-option <string>' as the bullet item, and
> in its description say how arbitrary values (if you allow them, that
> is) can be used, e.g. "Transmit <string> encoded in such and such
> way a the value of the push-option".

okay.

^ permalink raw reply

* [PATCH v2 0/11] reducing resource usage of for_each_alternate_ref
From: Jeff King @ 2017-02-08 20:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170124003729.j4ygjcgypdq7hceg@sigill.intra.peff.net>

This is a minor re-roll of the patches from:

  http://public-inbox.org/git/20170124003729.j4ygjcgypdq7hceg@sigill.intra.peff.net/

(which got some review, but I don't think was picked up for even 'pu').

I won't repeat the numbers and background from that message, but the
gist of it is that this reduces memory usage significantly when your
alternate has a lot of refs in it.

This version makes two minor changes:

  - it drops the save_commit_buffer patch to clone; it's redundant with
    what fetch_pack() is doing internally, and I wasn't able to measure
    any improvement

  - it adds a missing "static" to an internal function

The only other possible change from the review would be sorting the
expected output in the test of the final script. I'm on the fence
whether it is a feature that we expect a particular ordering. It's not
set in stone ,but it _is_ deterministic, and if we change the order, it
might be worth somebody actually noticing.

  [01/11]: for_each_alternate_ref: handle failure from real_pathdup()
  [02/11]: for_each_alternate_ref: stop trimming trailing slashes
  [03/11]: for_each_alternate_ref: use strbuf for path allocation
  [04/11]: for_each_alternate_ref: pass name/oid instead of ref struct
  [05/11]: for_each_alternate_ref: replace transport code with for-each-ref
  [06/11]: fetch-pack: cache results of for_each_alternate_ref
  [07/11]: add oidset API
  [08/11]: receive-pack: use oidset to de-duplicate .have lines
  [09/11]: receive-pack: fix misleading namespace/.have comment
  [10/11]: receive-pack: treat namespace .have lines like alternates
  [11/11]: receive-pack: avoid duplicates between our refs and alternates

 Makefile               |  1 +
 builtin/receive-pack.c | 41 +++++++++++++++-------------
 fetch-pack.c           | 48 ++++++++++++++++++++++++++++-----
 object.h               |  2 +-
 oidset.c               | 49 ++++++++++++++++++++++++++++++++++
 oidset.h               | 45 +++++++++++++++++++++++++++++++
 t/t5400-send-pack.sh   | 38 ++++++++++++++++++++++++++
 transport.c            | 72 +++++++++++++++++++++++++++++++++++---------------
 transport.h            |  2 +-
 9 files changed, 249 insertions(+), 49 deletions(-)
 create mode 100644 oidset.c
 create mode 100644 oidset.h


^ permalink raw reply

* [PATCH v2 01/11] for_each_alternate_ref: handle failure from real_pathdup()
From: Jeff King @ 2017-02-08 20:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

In older versions of git, if real_path() failed to resolve
the alternate object store path, we would die() with an
error. However, since 4ac9006f8 (real_path: have callers use
real_pathdup and strbuf_realpath, 2016-12-12) we use the
real_pathdup() function, which may return NULL. Since we
don't check the return value, we can segfault.

This is hard to trigger in practice, since we check that the
path is accessible before creating the alternate_object_database
struct. But it could be removed racily, or we could see a
transient filesystem error.

We could restore the original behavior by switching back to
xstrdup(real_path()).  However, dying is probably not the
best option here. This whole function is best-effort
already; there might not even be a repository around the
shared objects at all. And if the alternate store has gone
away, there are no objects to show.

So let's just quietly return, as we would if we failed to
open "refs/", or if upload-pack failed to start, etc.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/transport.c b/transport.c
index d72e08948..9ce0ee96b 100644
--- a/transport.c
+++ b/transport.c
@@ -1222,6 +1222,8 @@ static int refs_from_alternate_cb(struct alternate_object_database *e,
 	struct alternate_refs_data *cb = data;

 	other = real_pathdup(e->path);
+	if (!other)
+		return 0;
 	len = strlen(other);

 	while (other[len-1] == '/')
-- 
2.12.0.rc0.371.ga6cf8653b

^ permalink raw reply related

* [PATCH v2 02/11] for_each_alternate_ref: stop trimming trailing slashes
From: Jeff King @ 2017-02-08 20:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

The real_pathdup() function will have removed extra slashes
for us already (on top of the normalize_path() done when we
created the alternate_object_database struct in the first
place).

Incidentally, this also fixes the case where the path is
just "/", which would read off the start of the array.
That doesn't seem possible to trigger in practice, though,
as link_alt_odb_entry() blindly eats trailing slashes,
including a bare "/".

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/transport.c b/transport.c
index 9ce0ee96b..6fba9e95b 100644
--- a/transport.c
+++ b/transport.c
@@ -1226,8 +1226,6 @@ static int refs_from_alternate_cb(struct alternate_object_database *e,
 		return 0;
 	len = strlen(other);

-	while (other[len-1] == '/')
-		other[--len] = '\0';
 	if (len < 8 || memcmp(other + len - 8, "/objects", 8))
 		goto out;
 	/* Is this a git repository with refs? */
-- 
2.12.0.rc0.371.ga6cf8653b

^ permalink raw reply related

* Re: [PATCH] dir: avoid allocation in fill_directory()
From: Brandon Williams @ 2017-02-08 19:54 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: René Scharfe, Git List
In-Reply-To: <CACsJy8CE-cyTZHZZhvhdsNau7iSqBci1BdUqDYvxoE5odV2SBA@mail.gmail.com>

On 02/08, Duy Nguyen wrote:
> On Wed, Feb 8, 2017 at 5:04 AM, René Scharfe <l.s.r@web.de> wrote:
> > Pass the match member of the first pathspec item directly to
> > read_directory() instead of using common_prefix() to duplicate it first,
> > thus avoiding memory duplication, strlen(3) and free(3).
> 
> How about killing common_prefix()? There are two other callers in
> ls-files.c and commit.c and it looks safe to do (but I didn't look
> very hard).
> 
> > diff --git a/dir.c b/dir.c
> > index 65c3e681b8..4541f9e146 100644
> > --- a/dir.c
> > +++ b/dir.c
> > @@ -174,20 +174,19 @@ char *common_prefix(const struct pathspec *pathspec)
> >
> >  int fill_directory(struct dir_struct *dir, const struct pathspec *pathspec)
> >  {
> > -       char *prefix;
> > +       const char *prefix;
> >         size_t prefix_len;
> >
> >         /*
> >          * Calculate common prefix for the pathspec, and
> >          * use that to optimize the directory walk
> >          */
> > -       prefix = common_prefix(pathspec);
> > -       prefix_len = prefix ? strlen(prefix) : 0;
> > +       prefix_len = common_prefix_len(pathspec);
> > +       prefix = prefix_len ? pathspec->items[0].match : "";
> 
> There's a subtle difference. Before the patch, prefix[prefix_len] is
> NUL. After the patch, it's not always true. If some code (incorrectly)
> depends on that, this patch exposes it. I had a look inside
> read_directory() though and it looks like no such code exists. So, all
> good.

Yeah I had the exact same thought when looking at this, but I agree
everything looks fine.  And if something does indeed depend on prefix
having a \0 at prefix_len then this will allow us to more easily find
the bug and fix it.

> 
> >
> >         /* Read the directory and prune it */
> >         read_directory(dir, prefix, prefix_len, pathspec);
> >
> > -       free(prefix);
> >         return prefix_len;
> >  }
> -- 
> Duy

-- 
Brandon Williams

^ permalink raw reply

* [PATCH v2 04/11] for_each_alternate_ref: pass name/oid instead of ref struct
From: Jeff King @ 2017-02-08 20:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

Breaking down the fields in the interface makes it easier to
change the backend of for_each_alternate_ref to something
that doesn't use "struct ref" internally.

The only field that callers actually look at is the oid,
anyway. The refname is kept in the interface as a plausible
thing for future code to want.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c |  6 ++++--
 fetch-pack.c           | 12 ++++++++----
 transport.c            |  2 +-
 transport.h            |  2 +-
 4 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1dbb8a069..d21332d9e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -277,10 +277,12 @@ static int show_one_alternate_sha1(const unsigned char sha1[20], void *unused)
 	return 0;
 }
 
-static void collect_one_alternate_ref(const struct ref *ref, void *data)
+static void collect_one_alternate_ref(const char *refname,
+				      const struct object_id *oid,
+				      void *data)
 {
 	struct sha1_array *sa = data;
-	sha1_array_append(sa, ref->old_oid.hash);
+	sha1_array_append(sa, oid->hash);
 }
 
 static void write_head_info(void)
diff --git a/fetch-pack.c b/fetch-pack.c
index 601f0779a..54f84c573 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -253,9 +253,11 @@ static void send_request(struct fetch_pack_args *args,
 		write_or_die(fd, buf->buf, buf->len);
 }
 
-static void insert_one_alternate_ref(const struct ref *ref, void *unused)
+static void insert_one_alternate_ref(const char *refname,
+				     const struct object_id *oid,
+				     void *unused)
 {
-	rev_list_insert_ref(NULL, ref->old_oid.hash);
+	rev_list_insert_ref(NULL, oid->hash);
 }
 
 #define INITIAL_FLUSH 16
@@ -619,9 +621,11 @@ static void filter_refs(struct fetch_pack_args *args,
 	*refs = newlist;
 }
 
-static void mark_alternate_complete(const struct ref *ref, void *unused)
+static void mark_alternate_complete(const char *refname,
+				    const struct object_id *oid,
+				    void *unused)
 {
-	mark_complete(ref->old_oid.hash);
+	mark_complete(oid->hash);
 }
 
 static int everything_local(struct fetch_pack_args *args,
diff --git a/transport.c b/transport.c
index 6b7847131..c87147046 100644
--- a/transport.c
+++ b/transport.c
@@ -1238,7 +1238,7 @@ static int refs_from_alternate_cb(struct alternate_object_database *e,
 	for (extra = transport_get_remote_refs(transport);
 	     extra;
 	     extra = extra->next)
-		cb->fn(extra, cb->data);
+		cb->fn(extra->name, &extra->old_oid, cb->data);
 	transport_disconnect(transport);
 out:
 	strbuf_release(&path);
diff --git a/transport.h b/transport.h
index e597b31b3..bc5571574 100644
--- a/transport.h
+++ b/transport.h
@@ -255,6 +255,6 @@ int transport_refs_pushed(struct ref *ref);
 void transport_print_push_status(const char *dest, struct ref *refs,
 		  int verbose, int porcelain, unsigned int *reject_reasons);
 
-typedef void alternate_ref_fn(const struct ref *, void *);
+typedef void alternate_ref_fn(const char *refname, const struct object_id *oid, void *);
 extern void for_each_alternate_ref(alternate_ref_fn, void *);
 #endif
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 03/11] for_each_alternate_ref: use strbuf for path allocation
From: Jeff King @ 2017-02-08 20:52 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

We have a string with ".../objects" pointing to the
alternate object store, and overwrite bits of it to look at
other paths in the (potential) git repository holding it.
This works because the only path we care about is "refs",
which is shorter than "objects".

Using a strbuf to hold the path lets us get rid of some
magic numbers, and makes it more obvious that the memory
operations are safe.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/transport.c b/transport.c
index 6fba9e95b..6b7847131 100644
--- a/transport.c
+++ b/transport.c
@@ -1214,34 +1214,34 @@ struct alternate_refs_data {
 static int refs_from_alternate_cb(struct alternate_object_database *e,
 				  void *data)
 {
-	char *other;
-	size_t len;
+	struct strbuf path = STRBUF_INIT;
+	size_t base_len;
 	struct remote *remote;
 	struct transport *transport;
 	const struct ref *extra;
 	struct alternate_refs_data *cb = data;
 
-	other = real_pathdup(e->path);
-	if (!other)
-		return 0;
-	len = strlen(other);
-
-	if (len < 8 || memcmp(other + len - 8, "/objects", 8))
+	if (!strbuf_realpath(&path, e->path, 0))
+		goto out;
+	if (!strbuf_strip_suffix(&path, "/objects"))
 		goto out;
+	base_len = path.len;
+
 	/* Is this a git repository with refs? */
-	memcpy(other + len - 8, "/refs", 6);
-	if (!is_directory(other))
+	strbuf_addstr(&path, "/refs");
+	if (!is_directory(path.buf))
 		goto out;
-	other[len - 8] = '\0';
-	remote = remote_get(other);
-	transport = transport_get(remote, other);
+	strbuf_setlen(&path, base_len);
+
+	remote = remote_get(path.buf);
+	transport = transport_get(remote, path.buf);
 	for (extra = transport_get_remote_refs(transport);
 	     extra;
 	     extra = extra->next)
 		cb->fn(extra, cb->data);
 	transport_disconnect(transport);
 out:
-	free(other);
+	strbuf_release(&path);
 	return 0;
 }
 
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 05/11] for_each_alternate_ref: replace transport code with for-each-ref
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

The current method for getting the refs from an alternate is
to run upload-pack in the alternate and parse its output
using the normal transport code.  This works and is
reasonably short, but it has a very bad memory footprint
when there are a lot of refs in the alternate. There are two
problems:

  1. It reads in all of the refs before passing any back to
     us. Which means that our peak memory usage has to store
     every ref (including duplicates for peeled variants),
     even if our callback could determine that some are not
     interesting (e.g., because they point to the same sha1
     as another ref).

  2. It allocates a "struct ref" for each one. Among other
     things, this contains 3 separate 20-byte oids, along
     with the name and various pointers.  That can add up,
     especially if the callback is only interested in the
     sha1 (which it can store in a sha1_array as just 20
     bytes).

On a particularly pathological case, where the alternate had
over 80 million refs pointing to only around 60,000 unique
objects, the peak heap usage of "git clone --reference" grew
to over 25GB.

This patch instead calls git-for-each-ref in the alternate
repository, and passes each line to the callback as we read
it. That drops the peak heap of the same command to 50MB.

I considered and rejected a few alternatives.

We could read all of the refs in the alternate using our own
ref code, just as we do with submodules.  However, as memory
footprint is one of the concerns here, we want to avoid
loading those refs into our own memory as a whole.

It's possible that this will be a better technique in the
future when the ref code can more easily iterate without
loading all of packed-refs into memory.

Another option is to keep calling upload-pack, and just
parse its output ourselves in a streaming fashion. Besides
for-each-ref being simpler (we get to define the format
ourselves, and don't have to deal with speaking the git
protocol), it's more flexible for possible future changes.

For instance, it might be useful for the caller to be able
to limit the set of "interesting" alternate refs.  The
motivating example is one where many "forks" of a particular
repository share object storage, and the shared storage has
refs for each fork (which is why so many of the refs are
duplicates; each fork has the same tags).  A plausible
future optimization would be to ask for the alternate refs
for just _one_ fork (if you had some out-of-band way of
knowing which was the most interesting or important for the
current operation).

Similarly, no callbacks actually care about the symref value
of alternate refs, and as before, this patch ignores them
entirely.  However, if we wanted to add them, for-each-ref's
"%(symref)" is going to be more flexible than upload-pack,
because the latter only handles the HEAD symref due to
historical constraints.

There is one potential downside, though: unlike upload-pack,
our for-each-ref command doesn't report the peeled value of
refs. The existing code calls the alternate_ref_fn callback
twice for tags: once for the tag, and once for the peeled
value with the refname set to "ref^{}".

For the callers in fetch-pack, this doesn't matter at all.
We immediately peel each tag down to a commit either way (so
there's a slight improvement, as do not bother passing the
redundant data over the pipe). For the caller in
receive-pack, it means we will not advertise the peeled
values of tags in our alternate. However, we also don't
advertise peeled values for our _own_ tags, so this is
actually making things more consistent.

It's unclear whether receive-pack advertising peeled values
is a win or not. On one hand, giving more information to the
other side may let it omit some objects from the push. On
the other hand, for tags which both sides have, they simply
bloat the advertisement. The upload-pack advertisement of
git.git is about 30% larger than the receive-pack
advertisement due to its peeled information.

This patch omits the peeled information from
for_each_alternate_ref entirely, and leaves it up to the
caller whether they want to dig up the information.

Signed-off-by: Jeff King <peff@peff.net>
---
 transport.c | 48 ++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 38 insertions(+), 10 deletions(-)

diff --git a/transport.c b/transport.c
index c87147046..48864b3a9 100644
--- a/transport.c
+++ b/transport.c
@@ -1206,6 +1206,42 @@ char *transport_anonymize_url(const char *url)
 	return xstrdup(url);
 }

+static void read_alternate_refs(const char *path,
+				alternate_ref_fn *cb,
+				void *data)
+{
+	struct child_process cmd = CHILD_PROCESS_INIT;
+	struct strbuf line = STRBUF_INIT;
+	FILE *fh;
+
+	cmd.git_cmd = 1;
+	argv_array_pushf(&cmd.args, "--git-dir=%s", path);
+	argv_array_push(&cmd.args, "for-each-ref");
+	argv_array_push(&cmd.args, "--format=%(objectname) %(refname)");
+	cmd.env = local_repo_env;
+	cmd.out = -1;
+
+	if (start_command(&cmd))
+		return;
+
+	fh = xfdopen(cmd.out, "r");
+	while (strbuf_getline_lf(&line, fh) != EOF) {
+		struct object_id oid;
+
+		if (get_oid_hex(line.buf, &oid) ||
+		    line.buf[GIT_SHA1_HEXSZ] != ' ') {
+			warning("invalid line while parsing alternate refs: %s",
+				line.buf);
+			break;
+		}
+
+		cb(line.buf + GIT_SHA1_HEXSZ + 1, &oid, data);
+	}
+
+	fclose(fh);
+	finish_command(&cmd);
+}
+
 struct alternate_refs_data {
 	alternate_ref_fn *fn;
 	void *data;
@@ -1216,9 +1252,6 @@ static int refs_from_alternate_cb(struct alternate_object_database *e,
 {
 	struct strbuf path = STRBUF_INIT;
 	size_t base_len;
-	struct remote *remote;
-	struct transport *transport;
-	const struct ref *extra;
 	struct alternate_refs_data *cb = data;

 	if (!strbuf_realpath(&path, e->path, 0))
@@ -1233,13 +1266,8 @@ static int refs_from_alternate_cb(struct alternate_object_database *e,
 		goto out;
 	strbuf_setlen(&path, base_len);

-	remote = remote_get(path.buf);
-	transport = transport_get(remote, path.buf);
-	for (extra = transport_get_remote_refs(transport);
-	     extra;
-	     extra = extra->next)
-		cb->fn(extra->name, &extra->old_oid, cb->data);
-	transport_disconnect(transport);
+	read_alternate_refs(path.buf, cb->fn, cb->data);
+
 out:
 	strbuf_release(&path);
 	return 0;
-- 
2.12.0.rc0.371.ga6cf8653b

^ permalink raw reply related

* [PATCH v2 08/11] receive-pack: use oidset to de-duplicate .have lines
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

If you have an alternate object store with a very large
number of refs, the peak memory usage of the sha1_array can
grow high, even if most of them are duplicates that end up
not being printed at all.

The similar for_each_alternate_ref() code-paths in
fetch-pack solve this by using flags in "struct object" to
de-duplicate (and so are relying on obj_hash at the core).

But we don't have a "struct object" at all in this case. We
could call lookup_unknown_object() to get one, but if our
goal is reducing memory footprint, it's not great:

 - an unknown object is as large as the largest object type
   (a commit), which is bigger than an oidset entry

 - we can free the memory after our ref advertisement, but
   "struct object" entries persist forever (and the
   receive-pack may hang around for a long time, as the
   bottleneck is often client upload bandwidth).

So let's use an oidset. Note that unlike a sha1-array it
doesn't sort the output as a side effect. However, our
output is at least stable, because for_each_alternate_ref()
will give us the sha1s in ref-sorted order.

In one particularly pathological case with an alternate that
has 60,000 unique refs out of 80 million total, this reduced
the peak heap usage of "git receive-pack . </dev/null" from
13GB to 14MB.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index d21332d9e..a4926fcfb 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -21,6 +21,7 @@
 #include "sigchain.h"
 #include "fsck.h"
 #include "tmp-objdir.h"
+#include "oidset.h"
 
 static const char * const receive_pack_usage[] = {
 	N_("git receive-pack <git-dir>"),
@@ -271,27 +272,24 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	return 0;
 }
 
-static int show_one_alternate_sha1(const unsigned char sha1[20], void *unused)
+static void show_one_alternate_ref(const char *refname,
+				   const struct object_id *oid,
+				   void *data)
 {
-	show_ref(".have", sha1);
-	return 0;
-}
+	struct oidset *seen = data;
 
-static void collect_one_alternate_ref(const char *refname,
-				      const struct object_id *oid,
-				      void *data)
-{
-	struct sha1_array *sa = data;
-	sha1_array_append(sa, oid->hash);
+	if (oidset_insert(seen, oid))
+		return;
+
+	show_ref(".have", oid->hash);
 }
 
 static void write_head_info(void)
 {
-	struct sha1_array sa = SHA1_ARRAY_INIT;
+	static struct oidset seen = OIDSET_INIT;
 
-	for_each_alternate_ref(collect_one_alternate_ref, &sa);
-	sha1_array_for_each_unique(&sa, show_one_alternate_sha1, NULL);
-	sha1_array_clear(&sa);
+	for_each_alternate_ref(show_one_alternate_ref, &seen);
+	oidset_clear(&seen);
 	for_each_ref(show_ref_cb, NULL);
 	if (!sent_capabilities)
 		show_ref("capabilities^{}", null_sha1);
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 07/11] add oidset API
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

This is similar to many of our uses of sha1-array, but it
overcomes one limitation of a sha1-array: when you are
de-duplicating a large input with relatively few unique
entries, sha1-array uses 20 bytes per non-unique entry.
Whereas this set will use memory linear in the number of
unique entries (albeit a few more than 20 bytes due to
hashmap overhead).

Signed-off-by: Jeff King <peff@peff.net>
---
 Makefile |  1 +
 oidset.c | 49 +++++++++++++++++++++++++++++++++++++++++++++++++
 oidset.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 95 insertions(+)
 create mode 100644 oidset.c
 create mode 100644 oidset.h

diff --git a/Makefile b/Makefile
index 8e4081e06..a5433978e 100644
--- a/Makefile
+++ b/Makefile
@@ -781,6 +781,7 @@ LIB_OBJS += notes-cache.o
 LIB_OBJS += notes-merge.o
 LIB_OBJS += notes-utils.o
 LIB_OBJS += object.o
+LIB_OBJS += oidset.o
 LIB_OBJS += pack-bitmap.o
 LIB_OBJS += pack-bitmap-write.o
 LIB_OBJS += pack-check.o
diff --git a/oidset.c b/oidset.c
new file mode 100644
index 000000000..ac169f05d
--- /dev/null
+++ b/oidset.c
@@ -0,0 +1,49 @@
+#include "cache.h"
+#include "oidset.h"
+
+struct oidset_entry {
+	struct hashmap_entry hash;
+	struct object_id oid;
+};
+
+static int oidset_hashcmp(const void *va, const void *vb,
+			  const void *vkey)
+{
+	const struct oidset_entry *a = va, *b = vb;
+	const struct object_id *key = vkey;
+	return oidcmp(&a->oid, key ? key : &b->oid);
+}
+
+int oidset_contains(const struct oidset *set, const struct object_id *oid)
+{
+	struct hashmap_entry key;
+
+	if (!set->map.cmpfn)
+		return 0;
+
+	hashmap_entry_init(&key, sha1hash(oid->hash));
+	return !!hashmap_get(&set->map, &key, oid);
+}
+
+int oidset_insert(struct oidset *set, const struct object_id *oid)
+{
+	struct oidset_entry *entry;
+
+	if (!set->map.cmpfn)
+		hashmap_init(&set->map, oidset_hashcmp, 0);
+
+	if (oidset_contains(set, oid))
+		return 1;
+
+	entry = xmalloc(sizeof(*entry));
+	hashmap_entry_init(&entry->hash, sha1hash(oid->hash));
+	oidcpy(&entry->oid, oid);
+
+	hashmap_add(&set->map, entry);
+	return 0;
+}
+
+void oidset_clear(struct oidset *set)
+{
+	hashmap_free(&set->map, 1);
+}
diff --git a/oidset.h b/oidset.h
new file mode 100644
index 000000000..b7eaab5b8
--- /dev/null
+++ b/oidset.h
@@ -0,0 +1,45 @@
+#ifndef OIDSET_H
+#define OIDSET_H
+
+/**
+ * This API is similar to sha1-array, in that it maintains a set of object ids
+ * in a memory-efficient way. The major differences are:
+ *
+ *   1. It uses a hash, so we can do online duplicate removal, rather than
+ *      sort-and-uniq at the end. This can reduce memory footprint if you have
+ *      a large list of oids with many duplicates.
+ *
+ *   2. The per-unique-oid memory footprint is slightly higher due to hash
+ *      table overhead.
+ */
+
+/**
+ * A single oidset; should be zero-initialized (or use OIDSET_INIT).
+ */
+struct oidset {
+	struct hashmap map;
+};
+
+#define OIDSET_INIT { { NULL } }
+
+/**
+ * Returns true iff `set` contains `oid`.
+ */
+int oidset_contains(const struct oidset *set, const struct object_id *oid);
+
+/**
+ * Insert the oid into the set; a copy is made, so "oid" does not need
+ * to persist after this function is called.
+ *
+ * Returns 1 if the oid was already in the set, 0 otherwise. This can be used
+ * to perform an efficient check-and-add.
+ */
+int oidset_insert(struct oidset *set, const struct object_id *oid);
+
+/**
+ * Remove all entries from the oidset, freeing any resources associated with
+ * it.
+ */
+void oidset_clear(struct oidset *set);
+
+#endif /* OIDSET_H */
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 06/11] fetch-pack: cache results of for_each_alternate_ref
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

We may run for_each_alternate_ref() twice, once in
find_common() and once in everything_local(). This operation
can be expensive, because it involves running a sub-process
which must freshly load all of the alternate's refs from
disk.

Let's cache and reuse the results between the two calls. We
can make some optimizations based on the particular use
pattern in fetch-pack to keep our memory usage down.

The first is that we only care about the sha1s, not the refs
themselves. So it's OK to store only the sha1s, and to
suppress duplicates. The natural fit would therefore be a
sha1_array.

However, sha1_array's de-duplication happens only after it
has read and sorted all entries. It still stores each
duplicate. For an alternate with a large number of refs
pointing to the same commits, this is a needless expense.

Instead, we'd prefer to eliminate duplicates before putting
them in the cache, which implies using a hash. We can
further note that fetch-pack will call parse_object() on
each alternate sha1. We can therefore keep our cache as a
set of pointers to "struct object". That gives us a place to
put our "already seen" bit with an optimized hash lookup.
And as a bonus, the object stores the sha1 for us, so
pointer-to-object is all we need.

There are two extra optimizations I didn't do here:

  - we actually store an array of pointer-to-object.
    Technically we could just walk the obj_hash table
    looking for entries with the ALTERNATE flag set (because
    our use case doesn't care about the order here).

    But that hash table may be mostly composed of
    non-ALTERNATE entries, so we'd waste time walking over
    them. So it would be a slight win in memory use, but a
    loss in CPU.

  - the items we pull out of the cache are actual "struct
    object"s, but then we feed "obj->sha1" to our
    sub-functions, which promptly call parse_object().

    This second parse is cheap, because it starts with
    lookup_object() and will bail immediately when it sees
    we've already parsed the object. We could save the extra
    hash lookup, but it would involve refactoring the
    functions we call. It may or may not be worth the
    trouble.

Signed-off-by: Jeff King <peff@peff.net>
---
 fetch-pack.c | 52 ++++++++++++++++++++++++++++++++++++++++++----------
 object.h     |  2 +-
 2 files changed, 43 insertions(+), 11 deletions(-)

diff --git a/fetch-pack.c b/fetch-pack.c
index 54f84c573..e0f5d5ce8 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -35,6 +35,7 @@ static const char *alternate_shallow_file;
 #define COMMON_REF	(1U << 2)
 #define SEEN		(1U << 3)
 #define POPPED		(1U << 4)
+#define ALTERNATE	(1U << 5)
 
 static int marked;
 
@@ -67,6 +68,41 @@ static inline void print_verbose(const struct fetch_pack_args *args,
 	fputc('\n', stderr);
 }
 
+struct alternate_object_cache {
+	struct object **items;
+	size_t nr, alloc;
+};
+
+static void cache_one_alternate(const char *refname,
+				const struct object_id *oid,
+				void *vcache)
+{
+	struct alternate_object_cache *cache = vcache;
+	struct object *obj = parse_object(oid->hash);
+
+	if (!obj || (obj->flags & ALTERNATE))
+		return;
+
+	obj->flags |= ALTERNATE;
+	ALLOC_GROW(cache->items, cache->nr + 1, cache->alloc);
+	cache->items[cache->nr++] = obj;
+}
+
+static void for_each_cached_alternate(void (*cb)(struct object *))
+{
+	static int initialized;
+	static struct alternate_object_cache cache;
+	size_t i;
+
+	if (!initialized) {
+		for_each_alternate_ref(cache_one_alternate, &cache);
+		initialized = 1;
+	}
+
+	for (i = 0; i < cache.nr; i++)
+		cb(cache.items[i]);
+}
+
 static void rev_list_push(struct commit *commit, int mark)
 {
 	if (!(commit->object.flags & mark)) {
@@ -253,11 +289,9 @@ static void send_request(struct fetch_pack_args *args,
 		write_or_die(fd, buf->buf, buf->len);
 }
 
-static void insert_one_alternate_ref(const char *refname,
-				     const struct object_id *oid,
-				     void *unused)
+static void insert_one_alternate_object(struct object *obj)
 {
-	rev_list_insert_ref(NULL, oid->hash);
+	rev_list_insert_ref(NULL, obj->oid.hash);
 }
 
 #define INITIAL_FLUSH 16
@@ -300,7 +334,7 @@ static int find_common(struct fetch_pack_args *args,
 	marked = 1;
 
 	for_each_ref(rev_list_insert_ref_oid, NULL);
-	for_each_alternate_ref(insert_one_alternate_ref, NULL);
+	for_each_cached_alternate(insert_one_alternate_object);
 
 	fetching = 0;
 	for ( ; refs ; refs = refs->next) {
@@ -621,11 +655,9 @@ static void filter_refs(struct fetch_pack_args *args,
 	*refs = newlist;
 }
 
-static void mark_alternate_complete(const char *refname,
-				    const struct object_id *oid,
-				    void *unused)
+static void mark_alternate_complete(struct object *obj)
 {
-	mark_complete(oid->hash);
+	mark_complete(obj->oid.hash);
 }
 
 static int everything_local(struct fetch_pack_args *args,
@@ -661,7 +693,7 @@ static int everything_local(struct fetch_pack_args *args,
 
 	if (!args->deepen) {
 		for_each_ref(mark_complete_oid, NULL);
-		for_each_alternate_ref(mark_alternate_complete, NULL);
+		for_each_cached_alternate(mark_alternate_complete);
 		commit_list_sort_by_date(&complete);
 		if (cutoff)
 			mark_recent_complete_commits(args, cutoff);
diff --git a/object.h b/object.h
index 614a00675..f52957dcb 100644
--- a/object.h
+++ b/object.h
@@ -29,7 +29,7 @@ struct object_array {
 /*
  * object flag allocation:
  * revision.h:      0---------10                                26
- * fetch-pack.c:    0---4
+ * fetch-pack.c:    0---5
  * walker.c:        0-2
  * upload-pack.c:       4       11----------------19
  * builtin/blame.c:               12-13
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 09/11] receive-pack: fix misleading namespace/.have comment
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

The comment claims that we handle alternate ".have" lines
through this function, but that hasn't been the case since
85f251045 (write_head_info(): handle "extra refs" locally,
2012-01-06).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index a4926fcfb..1821ad5fa 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -261,10 +261,7 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	/*
 	 * Advertise refs outside our current namespace as ".have"
 	 * refs, so that the client can use them to minimize data
-	 * transfer but will otherwise ignore them. This happens to
-	 * cover ".have" that are thrown in by add_one_alternate_ref()
-	 * to mark histories that are complete in our alternates as
-	 * well.
+	 * transfer but will otherwise ignore them.
 	 */
 	if (!path)
 		path = ".have";
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 10/11] receive-pack: treat namespace .have lines like alternates
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

Namely, de-duplicate them. We use the same set as the
alternates, since we call them both ".have" (i.e., there is
no value in showing one versus the other).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 1821ad5fa..c23b0cce8 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -251,8 +251,9 @@ static void show_ref(const char *path, const unsigned char *sha1)
 }
 
 static int show_ref_cb(const char *path_full, const struct object_id *oid,
-		       int flag, void *unused)
+		       int flag, void *data)
 {
+	struct oidset *seen = data;
 	const char *path = strip_namespace(path_full);
 
 	if (ref_is_hidden(path, path_full))
@@ -263,8 +264,11 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 	 * refs, so that the client can use them to minimize data
 	 * transfer but will otherwise ignore them.
 	 */
-	if (!path)
+	if (!path) {
+		if (oidset_insert(seen, oid))
+			return 0;
 		path = ".have";
+	}
 	show_ref(path, oid->hash);
 	return 0;
 }
@@ -287,7 +291,7 @@ static void write_head_info(void)
 
 	for_each_alternate_ref(show_one_alternate_ref, &seen);
 	oidset_clear(&seen);
-	for_each_ref(show_ref_cb, NULL);
+	for_each_ref(show_ref_cb, &seen);
 	if (!sent_capabilities)
 		show_ref("capabilities^{}", null_sha1);
 
-- 
2.12.0.rc0.371.ga6cf8653b


^ permalink raw reply related

* [PATCH v2 11/11] receive-pack: avoid duplicates between our refs and alternates
From: Jeff King @ 2017-02-08 20:53 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20170208205219.twgm5rggovqbepte@sigill.intra.peff.net>

We de-duplicate ".have" refs among themselves, but never
check if they are duplicates of our local refs. It's not
unreasonable that they would be if we are a "--shared" or
"--reference" clone of a similar repository; we'd have all
the same tags.

We can handle this by inserting our local refs into the
oidset, but obviously not suppressing duplicates (since the
refnames are important).

Note that this also switches the order in which we advertise
refs, processing ours first and then any alternates. The
order shouldn't matter (and arguably showing our refs first
makes more sense).

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/receive-pack.c |  4 +++-
 t/t5400-send-pack.sh   | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index c23b0cce8..9ed8fbbfa 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -268,6 +268,8 @@ static int show_ref_cb(const char *path_full, const struct object_id *oid,
 		if (oidset_insert(seen, oid))
 			return 0;
 		path = ".have";
+	} else {
+		oidset_insert(seen, oid);
 	}
 	show_ref(path, oid->hash);
 	return 0;
@@ -289,9 +291,9 @@ static void write_head_info(void)
 {
 	static struct oidset seen = OIDSET_INIT;
 
+	for_each_ref(show_ref_cb, &seen);
 	for_each_alternate_ref(show_one_alternate_ref, &seen);
 	oidset_clear(&seen);
-	for_each_ref(show_ref_cb, &seen);
 	if (!sent_capabilities)
 		show_ref("capabilities^{}", null_sha1);
 
diff --git a/t/t5400-send-pack.sh b/t/t5400-send-pack.sh
index 305ca7a93..3331e0f53 100755
--- a/t/t5400-send-pack.sh
+++ b/t/t5400-send-pack.sh
@@ -255,4 +255,42 @@ test_expect_success 'deny pushing to delete current branch' '
 	)
 '
 
+extract_ref_advertisement () {
+	perl -lne '
+		# \\ is there to skip capabilities after \0
+		/push< ([^\\]+)/ or next;
+		exit 0 if $1 eq "0000";
+		print $1;
+	'
+}
+
+test_expect_success 'receive-pack de-dupes .have lines' '
+	git init shared &&
+	git -C shared commit --allow-empty -m both &&
+	git clone -s shared fork &&
+	(
+		cd shared &&
+		git checkout -b only-shared &&
+		git commit --allow-empty -m only-shared &&
+		git update-ref refs/heads/foo HEAD
+	) &&
+
+	# Notable things in this expectation:
+	#  - local refs are not de-duped
+	#  - .have does not duplicate locals
+	#  - .have does not duplicate itself
+	local=$(git -C fork rev-parse HEAD) &&
+	shared=$(git -C shared rev-parse only-shared) &&
+	cat >expect <<-EOF &&
+	$local refs/heads/master
+	$local refs/remotes/origin/HEAD
+	$local refs/remotes/origin/master
+	$shared .have
+	EOF
+
+	GIT_TRACE_PACKET=$(pwd)/trace git push fork HEAD:foo &&
+	extract_ref_advertisement <trace >refs &&
+	test_cmp expect refs
+'
+
 test_done
-- 
2.12.0.rc0.371.ga6cf8653b

^ permalink raw reply related

* Re: Trying to use xfuncname without success.
From: Jack Adrian Zappa @ 2017-02-08 21:03 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: René Scharfe, git-mailing-list
In-Reply-To: <CAJZjrdWouNaNKU2sX89Xh=QqSbdB7srwgufuquYL_1B7H324Yw@mail.gmail.com>

Thanks Samuel,

That example showed that there must be something wrong in my .git
directory, because with it, I'm getting the correct output.  Moving
the same lines to my .git/config didn't work.

On Wed, Feb 8, 2017 at 3:46 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
> I just put this togther: https://github.com/sxlijin/xfuncname-test
>
> Try cloning and then for any of config1 thru 3,
>
> $ cp configX .git/config
> $ git diff HEAD^ -- test.natvis
>
> On Wed, Feb 8, 2017 at 2:42 PM, Jack Adrian Zappa <adrianh.bsc@gmail.com> wrote:
>> Thanks Samuel,
>>
>> So, the question is, what is causing this problem on my system?
>>
>> Anyone have an idea to help diagnose this problem?
>>
>> On Wed, Feb 8, 2017 at 3:24 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>>> On Windows 7, it works for me in both CMD and Git Bash:
>>>
>>> $ git --version
>>> git version 2.11.0.windows.3
>>>
>>> $ git diff HEAD^ --word-diff
>>> diff --git a/test.natvis b/test.natvis
>>> index 93396ad..1233b8c 100644
>>> --- a/test.natvis
>>> +++ b/test.natvis
>>> @@ -18,6 +18,7 @@ test
>>>
>>>
>>>       <!-- Non-blank line -->
>>>       {+<Item Name="added var">added_var</Item>+}
>>>
>>>
>>>       <Item Name="var2">var2</Item>
>>>
>>> On Wed, Feb 8, 2017 at 12:37 PM, René Scharfe <l.s.r@web.de> wrote:
>>>> Am 08.02.2017 um 18:11 schrieb Jack Adrian Zappa:
>>>>> Thanks Rene, but you seem to have missed the point.  NOTHING is
>>>>> working.  No matter what I put there, it doesn't seem to get matched.
>>>>
>>>> I'm not so sure about that.  With your example I get this diff without
>>>> setting diff.natvis.xfuncname:
>>>>
>>>> diff --git a/a.natvis b/a.natvis
>>>> index 7f9bdf5..bc3c090 100644
>>>> --- a/a.natvis
>>>> +++ b/a.natvis
>>>> @@ -19,7 +19,7 @@ xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
>>>>
>>>>
>>>>        <!-- Non-blank line -->
>>>> -      <Item Name="added var">added_var</Item>
>>>> +      <Item Name="added var">added_vars</Item>
>>>>
>>>>
>>>>        <Item Name="var2">var2</Item>
>>>>
>>>> Note the XML namespace in the hunk header.  It's put there by the
>>>> default rule because "xmlns" starts at the beginning of the line.  Your
>>>> diff has nothing there, which means the default rule is not used, i.e.
>>>> your user-defined rule is in effect.
>>>>
>>>> Come to think of it, this line break in the middle of the AutoVisualizer
>>>> tab might have been added by your email client unintentionally, so that
>>>> we use different test files, which then of course results in different
>>>> diffs.  Is that the case?
>>>>
>>>> Anyway, if I run the following two commands:
>>>>
>>>> $ git config diff.natvis.xfuncname "^[\t ]*<Type[\t ]+Name=\"([^\"]+)\".*$"
>>>> $ echo '*.natvis diff=natvis' >.gitattributes
>>>>
>>>> ... then I get this, both on Linux (git version 2.11.1) and on Windows
>>>> (git version 2.11.1.windows.1):
>>>>
>>>> diff --git a/a.natvis b/a.natvis
>>>> index 7f9bdf5..bc3c090 100644
>>>> --- a/a.natvis
>>>> +++ b/a.natvis
>>>> @@ -19,7 +19,7 @@ test
>>>>
>>>>
>>>>        <!-- Non-blank line -->
>>>> -      <Item Name="added var">added_var</Item>
>>>> +      <Item Name="added var">added_vars</Item>
>>>>
>>>>
>>>>        <Item Name="var2">var2</Item>
>>>>
>>>>> Just to be sure, I tested your regex and again it didn't work.
>>>>
>>>> At this point I'm out of ideas, sorry. :(  The only way I was able to
>>>> break it was due to mistyping the extension as "netvis" several times
>>>> for some reason.
>>>>
>>>> René

^ permalink raw reply

* Re: [PATCH v4] tag: generate useful reflog message
From: Junio C Hamano @ 2017-02-08 21:28 UTC (permalink / raw)
  To: cornelius.weig; +Cc: git, karthik.188, peff, bitte.keine.werbung.einwerfen
In-Reply-To: <20170206222416.28720-2-cornelius.weig@tngtech.com>

cornelius.weig@tngtech.com writes:

> From: Cornelius Weig <cornelius.weig@tngtech.com>
>
> When tags are created with `--create-reflog` or with the option
> `core.logAllRefUpdates` set to 'always', a reflog is created for them.
> So far, the description of reflog entries for tags was empty, making the
> reflog hard to understand. For example:
> 6e3a7b3 refs/tags/test@{0}:
>
> Now, a reflog message is generated when creating a tag, following the
> pattern "tag: tagging <short-sha1> (<description>)". If
> GIT_REFLOG_ACTION is set, the message becomes "$GIT_REFLOG_ACTION
> (<description>)" instead. If the tag references a commit object, the
> description is set to the subject line of the commit, followed by its
> commit date. For example:
> 6e3a7b3 refs/tags/test@{0}: tag: tagging 6e3a7b3398 (Git 2.12-rc0, 2017-02-03)
>
> If the tag points to a tree/blob/tag objects, the following static
> strings are taken as description:
>
>  - "tree object"
>  - "blob object"
>  - "other tag object"
>
> Signed-off-by: Cornelius Weig <cornelius.weig@tngtech.com>
> Reviewed-by: Junio C Hamano <gitster@pobox.com>

This last line is inappropriate, as I didn't review _THIS_ version,
which is different from the previous one, and I haven't checked if
the way the comments on the previous review were addressed in this
version is agreeable.

> diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
> index 072e6c6..894959f 100755
> --- a/t/t7004-tag.sh
> +++ b/t/t7004-tag.sh
> @@ -80,10 +80,24 @@ test_expect_success 'creating a tag using default HEAD should succeed' '
>  	test_must_fail git reflog exists refs/tags/mytag
>  '
>  
> +git log -1 > expected \
> +	--format="format:tag: tagging %h (%s, %cd)%n" --date=format:%F

We do not want to do this kind of thing outside the
test_expect_success immediately below, unless there is a good
reason, and in this case I do not see any.

Also write redirection operator and redirection target pathname
without SP in between.

>  test_expect_success 'creating a tag with --create-reflog should create reflog' '
>  	test_when_finished "git tag -d tag_with_reflog" &&
>  	git tag --create-reflog tag_with_reflog &&
> -	git reflog exists refs/tags/tag_with_reflog
> +	git reflog exists refs/tags/tag_with_reflog &&
> +	sed -e "s/^.*	//" .git/logs/refs/tags/tag_with_reflog > actual &&
> +	test_cmp expected actual
> +'

In other words, something like:

test_expect_success 'creating a tag with --create-reflog should create reflog' '
	git log -1 \
		--format="format:tag: tagging %h (%s, %cd)%n" \
		--date=format:%Y-%m-%d >expected &&
	test_when_finished "git tag -d tag_with_reflog" &&
	git tag --create-reflog tag_with_reflog &&
	git reflog exists refs/tags/tag_with_reflog &&
	sed -e "s/^.*	//" .git/logs/refs/tags/tag_with_reflog >actual &&
	test_cmp expected actual
'	

Even though %F may be shorter, spelling it out makes what we expect
more explicit, and what is what I did in the above example.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] Document the --no-gui option in difftool
From: Junio C Hamano @ 2017-02-08 21:30 UTC (permalink / raw)
  To: Denton Liu; +Cc: git, davvid
In-Reply-To: <20170207063207.GA12746@arch-attack.localdomain>

Denton Liu <liu.denton@gmail.com> writes:

> Prior to this, the `--no-gui` option was not documented in the manpage.
> This commit introduces this into the manpage
>
> Signed-off-by: Denton Liu <liu.denton@gmail.com>
> ---

Looks good, thanks.

>  Documentation/git-difftool.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/git-difftool.txt b/Documentation/git-difftool.txt
> index 224fb3090..96c26e6aa 100644
> --- a/Documentation/git-difftool.txt
> +++ b/Documentation/git-difftool.txt
> @@ -86,10 +86,11 @@ instead.  `--no-symlinks` is the default on Windows.
>  	Additionally, `$BASE` is set in the environment.
>  
>  -g::
> ---gui::
> +--[no-]gui::
>  	When 'git-difftool' is invoked with the `-g` or `--gui` option
>  	the default diff tool will be read from the configured
> -	`diff.guitool` variable instead of `diff.tool`.
> +	`diff.guitool` variable instead of `diff.tool`. The `--no-gui`
> +	option can be used to override this setting.
>  
>  --[no-]trust-exit-code::
>  	'git-difftool' invokes a diff tool individually on each file.

^ permalink raw reply

* Re: Trying to use xfuncname without success.
From: Jack Adrian Zappa @ 2017-02-08 21:12 UTC (permalink / raw)
  To: Samuel Lijin; +Cc: git-mailing-list
In-Reply-To: <CAJZjrdUcxe_K91CQXz_TgGHgXMsKaddwG5+JEWJ53pv5_GO1zw@mail.gmail.com>

That was it.  I have a .gitattributes file in my home directory.
Ahhh, but it's not in my %userprofile% directory, but in my ~
directory.

A bit confusing having 2 home directories.  I made a link to my
.gitconfig, but forgot to make a link to my .gitattributes.

Thanks.


A

On Wed, Feb 8, 2017 at 4:05 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
> Double check .gitattributes?
>
> On Feb 8, 2017 2:58 PM, "Jack Adrian Zappa" <adrianh.bsc@gmail.com> wrote:
>>
>> Thanks Samuel,
>>
>> That example showed that there must be something wrong in my .git
>> directory, because with it, I'm getting the correct output.  Moving
>> the same lines to my .git/config didn't work.
>>
>> On Wed, Feb 8, 2017 at 3:46 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>> > I just put this togther: https://github.com/sxlijin/xfuncname-test
>> >
>> > Try cloning and then for any of config1 thru 3,
>> >
>> > $ cp configX .git/config
>> > $ git diff HEAD^ -- test.natvis
>> >
>> > On Wed, Feb 8, 2017 at 2:42 PM, Jack Adrian Zappa
>> > <adrianh.bsc@gmail.com> wrote:
>> >> Thanks Samuel,
>> >>
>> >> So, the question is, what is causing this problem on my system?
>> >>
>> >> Anyone have an idea to help diagnose this problem?
>> >>
>> >> On Wed, Feb 8, 2017 at 3:24 PM, Samuel Lijin <sxlijin@gmail.com> wrote:
>> >>> On Windows 7, it works for me in both CMD and Git Bash:
>> >>>
>> >>> $ git --version
>> >>> git version 2.11.0.windows.3
>> >>>
>> >>> $ git diff HEAD^ --word-diff
>> >>> diff --git a/test.natvis b/test.natvis
>> >>> index 93396ad..1233b8c 100644
>> >>> --- a/test.natvis
>> >>> +++ b/test.natvis
>> >>> @@ -18,6 +18,7 @@ test
>> >>>
>> >>>
>> >>>       <!-- Non-blank line -->
>> >>>       {+<Item Name="added var">added_var</Item>+}
>> >>>
>> >>>
>> >>>       <Item Name="var2">var2</Item>
>> >>>
>> >>> On Wed, Feb 8, 2017 at 12:37 PM, René Scharfe <l.s.r@web.de> wrote:
>> >>>> Am 08.02.2017 um 18:11 schrieb Jack Adrian Zappa:
>> >>>>> Thanks Rene, but you seem to have missed the point.  NOTHING is
>> >>>>> working.  No matter what I put there, it doesn't seem to get
>> >>>>> matched.
>> >>>>
>> >>>> I'm not so sure about that.  With your example I get this diff
>> >>>> without
>> >>>> setting diff.natvis.xfuncname:
>> >>>>
>> >>>> diff --git a/a.natvis b/a.natvis
>> >>>> index 7f9bdf5..bc3c090 100644
>> >>>> --- a/a.natvis
>> >>>> +++ b/a.natvis
>> >>>> @@ -19,7 +19,7 @@
>> >>>> xmlns="http://schemas.microsoft.com/vstudio/debugger/natvis/2010">
>> >>>>
>> >>>>
>> >>>>        <!-- Non-blank line -->
>> >>>> -      <Item Name="added var">added_var</Item>
>> >>>> +      <Item Name="added var">added_vars</Item>
>> >>>>
>> >>>>
>> >>>>        <Item Name="var2">var2</Item>
>> >>>>
>> >>>> Note the XML namespace in the hunk header.  It's put there by the
>> >>>> default rule because "xmlns" starts at the beginning of the line.
>> >>>> Your
>> >>>> diff has nothing there, which means the default rule is not used,
>> >>>> i.e.
>> >>>> your user-defined rule is in effect.
>> >>>>
>> >>>> Come to think of it, this line break in the middle of the
>> >>>> AutoVisualizer
>> >>>> tab might have been added by your email client unintentionally, so
>> >>>> that
>> >>>> we use different test files, which then of course results in
>> >>>> different
>> >>>> diffs.  Is that the case?
>> >>>>
>> >>>> Anyway, if I run the following two commands:
>> >>>>
>> >>>> $ git config diff.natvis.xfuncname "^[\t ]*<Type[\t
>> >>>> ]+Name=\"([^\"]+)\".*$"
>> >>>> $ echo '*.natvis diff=natvis' >.gitattributes
>> >>>>
>> >>>> ... then I get this, both on Linux (git version 2.11.1) and on
>> >>>> Windows
>> >>>> (git version 2.11.1.windows.1):
>> >>>>
>> >>>> diff --git a/a.natvis b/a.natvis
>> >>>> index 7f9bdf5..bc3c090 100644
>> >>>> --- a/a.natvis
>> >>>> +++ b/a.natvis
>> >>>> @@ -19,7 +19,7 @@ test
>> >>>>
>> >>>>
>> >>>>        <!-- Non-blank line -->
>> >>>> -      <Item Name="added var">added_var</Item>
>> >>>> +      <Item Name="added var">added_vars</Item>
>> >>>>
>> >>>>
>> >>>>        <Item Name="var2">var2</Item>
>> >>>>
>> >>>>> Just to be sure, I tested your regex and again it didn't work.
>> >>>>
>> >>>> At this point I'm out of ideas, sorry. :(  The only way I was able to
>> >>>> break it was due to mistyping the extension as "netvis" several times
>> >>>> for some reason.
>> >>>>
>> >>>> René

^ permalink raw reply

* [PATCHv2] push options: pass push options to the transport helper
From: Stefan Beller @ 2017-02-08 22:04 UTC (permalink / raw)
  To: gitster; +Cc: git, jrnieder, Stefan Beller
In-Reply-To: <xmqqpoiswo0l.fsf@gitster.mtv.corp.google.com>

When using non-builtin protocols relying on a transport helper
(such as http), push options are not propagated to the helper.

The user could ask for push options and a push would seemingly succeed,
but the push options would never be transported to the server,
misleading the users expectation.

Fix this by propagating the push options to the transport helper.

This is only addressing the first issue of
   (1) the helper protocol does not propagate push-option
   (2) the http helper is not prepared to handle push-option

Once we fix (2), the http transport helper can make use of push options
as well, but that happens as a follow up. (1) is a bug fix, whereas (2)
is a feature, which is why we only do (1) here.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
 Documentation/gitremote-helpers.txt |  4 ++++
 t/t5545-push-options.sh             | 15 +++++++++++++++
 transport-helper.c                  |  7 +++++++
 3 files changed, 26 insertions(+)

diff --git a/Documentation/gitremote-helpers.txt b/Documentation/gitremote-helpers.txt
index 9e8681f9e1..23474b1eab 100644
--- a/Documentation/gitremote-helpers.txt
+++ b/Documentation/gitremote-helpers.txt
@@ -462,6 +462,10 @@ set by Git if the remote helper has the 'option' capability.
 'option pushcert {'true'|'false'}::
 	GPG sign pushes.
 
+'option push-option <string>::
+	Transmit <string> as a push option. As the a push option
+	must not contain LF or NUL characters, the string is not encoded.
+
 SEE ALSO
 --------
 linkgit:git-remote[1]
diff --git a/t/t5545-push-options.sh b/t/t5545-push-options.sh
index ea813b9383..9a57a7d8f2 100755
--- a/t/t5545-push-options.sh
+++ b/t/t5545-push-options.sh
@@ -3,6 +3,8 @@
 test_description='pushing to a repository using push options'
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY"/lib-httpd.sh
+start_httpd
 
 mk_repo_pair () {
 	rm -rf workbench upstream &&
@@ -100,4 +102,17 @@ test_expect_success 'two push options work' '
 	test_cmp expect upstream/.git/hooks/post-receive.push_options
 '
 
+test_expect_success 'push option denied properly by http remote helper' '\
+	mk_repo_pair &&
+	git -C upstream config receive.advertisePushOptions false &&
+	git -C upstream config http.receivepack true &&
+	cp -R upstream/.git "$HTTPD_DOCUMENT_ROOT_PATH"/upstream.git &&
+	git clone "$HTTPD_URL"/smart/upstream test_http_clone &&
+	test_commit -C test_http_clone one &&
+	test_must_fail git -C test_http_clone push --push-option=asdf origin master &&
+	git -C test_http_clone push origin master
+'
+
+stop_httpd
+
 test_done
diff --git a/transport-helper.c b/transport-helper.c
index 91aed35ebb..1258d6aedd 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -826,6 +826,13 @@ static void set_common_push_options(struct transport *transport,
 		if (set_helper_option(transport, TRANS_OPT_PUSH_CERT, "if-asked") != 0)
 			die("helper %s does not support --signed=if-asked", name);
 	}
+
+	if (flags & TRANSPORT_PUSH_OPTIONS) {
+		struct string_list_item *item;
+		for_each_string_list_item(item, transport->push_options)
+			if (set_helper_option(transport, "push-option", item->string) != 0)
+				die("helper %s does not support 'push-option'", name);
+	}
 }
 
 static int push_refs_with_push(struct transport *transport,
-- 
2.12.0.rc0.1.g018cb5e6f4


^ permalink raw reply related

* Re: [PATCH] refs-internal.c: make files_log_ref_write() static
From: Junio C Hamano @ 2017-02-08 22:06 UTC (permalink / raw)
  To: Nguyễn Thái Ngọc Duy; +Cc: git, David Turner
In-Reply-To: <20170208094558.23712-1-pclouds@gmail.com>

Nguyễn Thái Ngọc Duy  <pclouds@gmail.com> writes:

> Created in 5f3c3a4e6f (files_log_ref_write: new function - 2015-11-10)
> but probably never used outside refs-internal.c
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---

It's been more than a year, so let's declare that when somebody
needs it it can easily be turned into extern again and take this
patch.

Thanks.

>  refs/files-backend.c | 3 +++
>  refs/refs-internal.h | 4 ----
>  2 files changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index c041d4ba21..6a0bf94bf1 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -165,6 +165,9 @@ static struct ref_entry *create_dir_entry(struct files_ref_store *ref_store,
>  					  const char *dirname, size_t len,
>  					  int incomplete);
>  static void add_entry_to_dir(struct ref_dir *dir, struct ref_entry *entry);
> +static int files_log_ref_write(const char *refname, const unsigned char *old_sha1,
> +			       const unsigned char *new_sha1, const char *msg,
> +			       int flags, struct strbuf *err);
>  
>  static struct ref_dir *get_ref_dir(struct ref_entry *entry)
>  {
> diff --git a/refs/refs-internal.h b/refs/refs-internal.h
> index 25444cf5b0..4c9215d33f 100644
> --- a/refs/refs-internal.h
> +++ b/refs/refs-internal.h
> @@ -220,10 +220,6 @@ struct ref_transaction {
>  	enum ref_transaction_state state;
>  };
>  
> -int files_log_ref_write(const char *refname, const unsigned char *old_sha1,
> -			const unsigned char *new_sha1, const char *msg,
> -			int flags, struct strbuf *err);
> -
>  /*
>   * Check for entries in extras that are within the specified
>   * directory, where dirname is a reference directory name including

^ permalink raw reply

* Re: [PATCH 2/2] pathspec: don't error out on all-exclusionary pathspec patterns
From: Junio C Hamano @ 2017-02-08 21:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Git Mailing List
In-Reply-To: <alpine.LFD.2.20.1702072113380.25002@i7.lan>

Linus Torvalds <torvalds@linux-foundation.org> writes:

> @@ -546,10 +546,16 @@ void parse_pathspec(struct pathspec *pathspec,
>  		pathspec->magic |= item[i].magic;
>  	}
>  
> -	if (nr_exclude == n)
> -		die(_("There is nothing to exclude from by :(exclude) patterns.\n"
> -		      "Perhaps you forgot to add either ':/' or '.' ?"));
> -
> +	/*
> +	 * If everything is an exclude pattern, add one positive pattern
> +	 * that matches everyting. We allocated an extra one for this.
> +	 */
> +	if (nr_exclude == n) {
> +		if (!(flags & PATHSPEC_PREFER_CWD))
> +			prefixlen = 0;
> +		init_pathspec_item(item + n, 0, prefix, prefixlen, "");
> +		pathspec->nr++;
> +	}
>  
>  	if (pathspec->magic & PATHSPEC_MAXDEPTH) {
>  		if (flags & PATHSPEC_KEEP_ORDER)

Thanks.  Even though the current code does not refer to the original
prefixlen after the added hunk, I'd prefer not to destroy it to
avoid future troubles, so I'll queue with a bit of tweak there,
perhaps like the attached.

Also this has an obvious fallout to the tests, whose (minimum) fix
is rather trivial.

Thanks.

 pathspec.c                  | 7 +++----
 t/t6132-pathspec-exclude.sh | 6 ++++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/pathspec.c b/pathspec.c
index d8f78088c8..b961f00c8c 100644
--- a/pathspec.c
+++ b/pathspec.c
@@ -522,7 +522,7 @@ void parse_pathspec(struct pathspec *pathspec,
 	}
 
 	pathspec->nr = n;
-	ALLOC_ARRAY(pathspec->items, n+1);
+	ALLOC_ARRAY(pathspec->items, n + 1);
 	item = pathspec->items;
 	prefixlen = prefix ? strlen(prefix) : 0;
 
@@ -551,9 +551,8 @@ void parse_pathspec(struct pathspec *pathspec,
 	 * that matches everyting. We allocated an extra one for this.
 	 */
 	if (nr_exclude == n) {
-		if (!(flags & PATHSPEC_PREFER_CWD))
-			prefixlen = 0;
-		init_pathspec_item(item + n, 0, prefix, prefixlen, "");
+		int plen = (!(flags & PATHSPEC_PREFER_CWD)) ? 0 : prefixlen;
+		init_pathspec_item(item + n, 0, prefix, plen, "");
 		pathspec->nr++;
 	}
 
diff --git a/t/t6132-pathspec-exclude.sh b/t/t6132-pathspec-exclude.sh
index d51595cf6b..9dd5cde5fc 100755
--- a/t/t6132-pathspec-exclude.sh
+++ b/t/t6132-pathspec-exclude.sh
@@ -25,8 +25,10 @@ EOF
 	test_cmp expect actual
 '
 
-test_expect_success 'exclude only should error out' '
-	test_must_fail git log --oneline --format=%s -- ":(exclude)sub"
+test_expect_success 'exclude only no longer errors out' '
+	git log --oneline --format=%s -- . ":(exclude)sub" >expect &&
+	git log --oneline --format=%s -- ":(exclude)sub" >actual &&
+	test_cmp expect actual
 '
 
 test_expect_success 't_e_i() exclude sub' '

^ permalink raw reply related

* Re: "disabling bitmap writing, as some objects are not being packed"?
From: David Turner @ 2017-02-08 22:14 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Duy Nguyen, Git Mailing List
In-Reply-To: <20170208190858.rjoqehbhyizlwg5q@sigill.intra.peff.net>

On Wed, 2017-02-08 at 14:08 -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 02:05:42PM -0500, David Turner wrote:
> 
> > On Wed, 2017-02-08 at 09:44 -0800, Junio C Hamano wrote:
> > > Duy Nguyen <pclouds@gmail.com> writes:
> > > 
> > > > On second thought, perhaps gc.autoDetach should default to false if
> > > > there's no tty, since its main point it to stop breaking interactive
> > > > usage. That would make the server side happy (no tty there).
> > > 
> > > Sounds like an idea, but wouldn't that keep the end-user coming over
> > > the network waiting after accepting a push until the GC completes, I
> > > wonder.  If an impatient user disconnects, would that end up killing
> > > an ongoing GC?  etc.
> > 
> > Regardless, it's impolite to keep the user waiting. So, I think we
> > should just not write the "too many unreachable loose objects" message
> > if auto-gc is on.  Does that sound OK?
> 
> I thought the point of that message was to prevent auto-gc from kicking
> in over and over again due to objects that won't actually get pruned.
> 
> I wonder if you'd want to either bump the auto-gc object limit, or
> possibly reduce the gc.pruneExpire limit to keep this situation from
> coming up in the first place (or at least mitigating the amount of time
> it's the case).

Auto-gc might not succeed in pruning objects, but it will at least
reduce the number of packs, which is super-important for performance.

I think the intent of automatic gc is to have a git repository be
relatively low-maintenance from a server-operator perspective.  (Side
note: it's fairly trivial for a user with push access to mess with the
check simply by pushing a bunch of objects whose shas start with 17).
It seems odd that git gets itself into a state where it refuses to do
any maintenance just because at some point some piece of the maintenance
didn't make progress.

Sure, I could change my configuration, but that doesn't help the other
folks (e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813084 )
who run into this.

I have three thoughts on this:

Idea 1: when gc --auto would issue this message, instead it could create
a file named gc.too-much-garbage (instead of gc.log), with this message.
If that file exists, and it is less than one day (?) old, then we don't
attempt to do a full gc; instead we just run git repack -A -d.  (If it's
more than one day old, we just delete it and continue anyway).

Idea 2 : Like idea 1, but instead of repacking, just smash the existing
packs together into one big pack.  In other words, don't consider
dangling objects, or recompute deltas.  Twitter has a tool called "git
combine-pack" that does this:
https://github.com/dturner-tw/git/blob/dturner/journal/builtin/combine-pack.c

That's less space-efficient than a true repack, but it's no worse than
having the packs separate, and it's a win for read performance because
there's no need to do a linear search over N packs to find an object.

Idea 3: As I suggested last time, separate fatal and non-fatal errors.
If gc fails because of EIO or something, we probably don't want to touch
the disk anymore. But here, the worst consequence is that we waste some
processing power. And it's better to occasionally waste processing power
in a non-interactive setting than it is to do so when a user will be
blocked on it.  So non-fatal warnings should go to gc.log, and fatal
errors should go to gc.fatal.  gc.log won't block gc from running. I
think this is my preferred option.

^ permalink raw reply

* Automatically Add .gitignore Files
From: Thangalin @ 2017-02-08 19:05 UTC (permalink / raw)
  To: git

I frequently forget to add .gitignore files when creating new .gitignore files.

I'd like to request a command-line option to always add .gitignore
files (or, more generally, always add files that match a given file
specification).

Replicate

0. git init ...
1. echo "*.bak" >> .gitignore
2. touch file.txt
3. git add file.txt
4. git commit -a -m "..."
5. git push origin master

Expected Results

The .gitignore file is also added to the repository. (This is probably
the 80% use case.)

Actual Results

The .gitignore file is not added to the repository.

Additional Details

At step 4, there should be a prompt, warning, or (preferably) either a
command-line or configuration option to add the .gitignore file prior
to commit, automatically. Such as:

git commit --include-all-gitignore-files -a -m "..."

Or:

echo "**/.gitignore" >> .git/config/add-before-commit

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox