Git development
 help / color / mirror / Atom feed
* Re: [DEMO][PATCH v2 6/5] compat: add a qsort_s() implementation based on GNU's qsort_r(1)
From: René Scharfe @ 2017-01-24 18:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jeff King, Johannes Schindelin
In-Reply-To: <xmqq60l5sihz.fsf@gitster.mtv.corp.google.com>

Am 23.01.2017 um 20:07 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
> 
>> Implement qsort_s() as a wrapper to the GNU version of qsort_r(1) and
>> use it on Linux.  Performance increases slightly:
>>
>> Test                         HEAD^             HEAD
>> --------------------------------------------------------------------
>> 0071.2: sort(1)              0.10(0.20+0.02)   0.10(0.21+0.01) +0.0%
>> 0071.3: string_list_sort()   0.17(0.15+0.01)   0.16(0.15+0.01) -5.9%
>>
>> Additionally the unstripped size of compat/qsort_s.o falls from 24576
>> to 16544 bytes in my build.
>>
>> IMHO these savings aren't worth the increased complexity of having to
>> support two implementations.
> 
> I do worry about having to support more implementations in the
> future that have different function signature for the comparison
> callbacks, which will make things ugly, but this addition alone
> doesn't look too bad to me.

It is unfair of me to show a 5% speedup and then recommend to not
include it. ;-)  That difference won't be measurable in real use cases
and the patch is not necessary.  This patch is simple, but the overall
complexity (incl. #ifdefs etc.) will be lower without it.

But here's another one, with even higher performance and with an even
bigger recommendation to not include it! :)  It veers off into another
direction: Parallel execution.  It requires thread-safe comparison
functions, which might surprise callers.  The value 1000 for the minimum
number of items before threading kicks in is just a guess, not based on
measurements.  So it's quite raw -- and I'm not sure why it's still a
bit slower than sort(1).

Test                         HEAD^             HEAD
---------------------------------------------------------------------
0071.2: sort(1)              0.10(0.18+0.03)   0.10(0.20+0.02) +0.0%
0071.3: string_list_sort()   0.17(0.14+0.02)   0.11(0.18+0.02) -35.3%

---
 compat/qsort_s.c | 76 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 58 insertions(+), 18 deletions(-)

diff --git a/compat/qsort_s.c b/compat/qsort_s.c
index 52d1f0a73d..1304a089af 100644
--- a/compat/qsort_s.c
+++ b/compat/qsort_s.c
@@ -1,4 +1,5 @@
 #include "../git-compat-util.h"
+#include "../thread-utils.h"
 
 /*
  * A merge sort implementation, simplified from the qsort implementation
@@ -6,29 +7,58 @@
  * Added context pointer, safety checks and return value.
  */
 
-static void msort_with_tmp(void *b, size_t n, size_t s,
-			   int (*cmp)(const void *, const void *, void *),
-			   char *t, void *ctx)
+#define MIN_ITEMS_FOR_THREAD 1000
+
+struct work {
+	int nr_threads;
+	void *base;
+	size_t nmemb;
+	size_t size;
+	char *tmp;
+	int (*cmp)(const void *, const void *, void *);
+	void *ctx;
+};
+
+static void *msort_with_tmp(void *work)
 {
+	struct work one, two, *w = work;
 	char *tmp;
 	char *b1, *b2;
 	size_t n1, n2;
+	size_t s, n;
 
-	if (n <= 1)
-		return;
+	if (w->nmemb <= 1)
+		return NULL;
 
-	n1 = n / 2;
-	n2 = n - n1;
-	b1 = b;
-	b2 = (char *)b + (n1 * s);
+	one = two = *w;
+	one.nr_threads /= 2;
+	two.nr_threads -= one.nr_threads;
+	n = one.nmemb;
+	s = one.size;
+	n1 = one.nmemb = n / 2;
+	n2 = two.nmemb = n - n1;
+	b1 = one.base;
+	b2 = two.base = b1 + n1 * s;
+	two.tmp += n1 * s;
 
-	msort_with_tmp(b1, n1, s, cmp, t, ctx);
-	msort_with_tmp(b2, n2, s, cmp, t, ctx);
+#ifndef NO_PTHREADS
+	if (one.nr_threads && n > MIN_ITEMS_FOR_THREAD) {
+		pthread_t thread;
+		int err = pthread_create(&thread, NULL, msort_with_tmp, &one);
+		msort_with_tmp(&two);
+		if (err || pthread_join(thread, NULL))
+			msort_with_tmp(&one);
+	} else
+#endif
+	{
+		msort_with_tmp(&one);
+		msort_with_tmp(&two);
+	}
 
-	tmp = t;
+	tmp = one.tmp;
 
 	while (n1 > 0 && n2 > 0) {
-		if (cmp(b1, b2, ctx) <= 0) {
+		if (one.cmp(b1, b2, one.ctx) <= 0) {
 			memcpy(tmp, b1, s);
 			tmp += s;
 			b1 += s;
@@ -42,7 +72,8 @@ static void msort_with_tmp(void *b, size_t n, size_t s,
 	}
 	if (n1 > 0)
 		memcpy(tmp, b1, n1 * s);
-	memcpy(b, t, (n - n2) * s);
+	memcpy(one.base, one.tmp, (n - n2) * s);
+	return NULL;
 }
 
 int git_qsort_s(void *b, size_t n, size_t s,
@@ -50,20 +81,29 @@ int git_qsort_s(void *b, size_t n, size_t s,
 {
 	const size_t size = st_mult(n, s);
 	char buf[1024];
+	struct work w;
 
 	if (!n)
 		return 0;
 	if (!b || !cmp)
 		return -1;
 
+	w.nr_threads = online_cpus();
+	w.base = b;
+	w.nmemb = n;
+	w.size = s;
+	w.cmp = cmp;
+	w.ctx = ctx;
+
 	if (size < sizeof(buf)) {
 		/* The temporary array fits on the small on-stack buffer. */
-		msort_with_tmp(b, n, s, cmp, buf, ctx);
+		w.tmp = buf;
 	} else {
 		/* It's somewhat large, so malloc it.  */
-		char *tmp = xmalloc(size);
-		msort_with_tmp(b, n, s, cmp, tmp, ctx);
-		free(tmp);
+		w.tmp = xmalloc(size);
 	}
+	msort_with_tmp(&w);
+	if (w.tmp != buf)
+		free(w.tmp);
 	return 0;
 }
-- 
2.11.0


^ permalink raw reply related

* Re: [PATCH v2 0/5] string-list: make string_list_sort() reentrant
From: René Scharfe @ 2017-01-24 18:00 UTC (permalink / raw)
  To: Jeff King; +Cc: Git List, Junio C Hamano, Johannes Schindelin
In-Reply-To: <20170123235445.qsejumltutd2vrhd@sigill.intra.peff.net>

Am 24.01.2017 um 00:54 schrieb Jeff King:
> The speed looks like a reasonable outcome. I'm torn on the qsort_r()
> demo patch. I don't think it looks too bad. OTOH, I don't think I would
> want to deal with the opposite-argument-order versions.

The code itself may look OK, but it's not really necessary and the 
special implementation for Linux makes increases maintenance costs.  Can 
we save it for later and first give the common implemention a chance to 
prove itself?

> Is there any interest in people adding the ISO qsort_s() to their libc
> implementations? It seems like it's been a fair number of years by now.

https://sourceware.org/ml/libc-alpha/2014-12/msg00513.html is the last 
post mentioning qsort_s on the glibc mailing list, but it didn't even 
make it into https://sourceware.org/glibc/wiki/Development_Todo/Master.
Not sure what's planned in BSD land, didn't find anything (but didn't 
look too hard).

René

^ permalink raw reply

* Re: [PATCH v2 4/4] urlmatch: allow globbing for the URL host part
From: Philip Oakley @ 2017-01-24 17:52 UTC (permalink / raw)
  To: Patrick Steinhardt, git
  Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170124170031.18069-5-patrick.steinhardt@elego.de>

From: "Patrick Steinhardt" <patrick.steinhardt@elego.de>

a quick comment on the documentation part ..

> The URL matching function computes for two URLs whether they match not.
> The match is performed by splitting up the URL into different parts and
> then doing an exact comparison with the to-be-matched URL.
>
> The main user of `urlmatch` is the configuration subsystem. It allows to
> set certain configurations based on the URL which is being connected to
> via keys like `http.<url>.*`. A common use case for this is to set
> proxies for only some remotes which match the given URL. Unfortunately,
> having exact matches for all parts of the URL can become quite tedious
> in some setups. Imagine for example a corporate network where there are
> dozens or even hundreds of subdomains, which would have to be configured
> individually.
>
> This commit introduces the ability to use globbing in the host-part of
> the URLs. A user can simply specify a `*` as part of the host name to
> match all subdomains at this level. For example adding a configuration
> key `http.https://*.example.com.proxy` will match all subdomains of
> `https://example.com`.
>
> Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
> ---
> Documentation/config.txt |  5 ++++-
> t/t1300-repo-config.sh   | 36 ++++++++++++++++++++++++++++++++++++
> urlmatch.c               | 38 ++++++++++++++++++++++++++++++++++----
> 3 files changed, 74 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 506431267..a78921c2b 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1914,7 +1914,10 @@ http.<url>.*::
>   must match exactly between the config key and the URL.
>
> . Host/domain name (e.g., `example.com` in `https://example.com/`).
> -  This field must match exactly between the config key and the URL.
> +  This field must match between the config key and the URL. It is
> +  possible to use globs in the config key to match all subdomains, e.g.
> +  `https://*.example.com/` to match all subdomains of `example.com`. Note
> +  that a glob only every matches a single part of the hostname.

[s/every/ever/ ?]

the "match all subdomains" appears to contradict the "a glob only ever 
matches a single part ".

Maybe borrow the example from the 0/4 cover letter
"so for example `https://foo.bar.example.com` would not match in the case of 
`http.https://*.example.com` " (If I understood it correctly.

A simple example often clarifies much better than more words.
--
Philip

>
> . Port number (e.g., `8080` in `http://example.com:8080/`).
>   This field must match exactly between the config key and the URL.
> diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
> index 923bfc5a2..ec545e092 100755
> --- a/t/t1300-repo-config.sh
> +++ b/t/t1300-repo-config.sh
> @@ -1177,6 +1177,42 @@ test_expect_success 'urlmatch' '
>  test_cmp expect actual
> '
>
> +test_expect_success 'glob-based urlmatch' '
> + cat >.git/config <<-\EOF &&
> + [http]
> + sslVerify
> + [http "https://*.example.com"]
> + sslVerify = false
> + cookieFile = /tmp/cookie.txt
> + EOF
> +
> + test_expect_code 1 git config --bool --get-urlmatch doesnt.exist 
> https://good.example.com >actual &&
> + test_must_be_empty actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.SSLverify https://example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.SSLverify https://good-example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + echo true >expect &&
> + git config --bool --get-urlmatch http.sslverify 
> https://deep.nested.example.com >actual &&
> + test_cmp expect actual &&
> +
> + echo false >expect &&
> + git config --bool --get-urlmatch http.sslverify https://good.example.com 
>  >actual &&
> + test_cmp expect actual &&
> +
> + {
> + echo http.cookiefile /tmp/cookie.txt &&
> + echo http.sslverify false
> + } >expect &&
> + git config --get-urlmatch HTTP https://good.example.com >actual &&
> + test_cmp expect actual
> +'
> +
> # good section hygiene
> test_expect_failure 'unsetting the last key in a section removes header' '
>  cat >.git/config <<-\EOF &&
> diff --git a/urlmatch.c b/urlmatch.c
> index e328905eb..53ff972a6 100644
> --- a/urlmatch.c
> +++ b/urlmatch.c
> @@ -63,6 +63,38 @@ static int append_normalized_escapes(struct strbuf 
> *buf,
>  return 1;
> }
>
> +static int match_host(const struct url_info *url_info,
> +       const struct url_info *pattern_info)
> +{
> + char *url = xmemdupz(url_info->url + url_info->host_off, 
> url_info->host_len);
> + char *pat = xmemdupz(pattern_info->url + pattern_info->host_off, 
> pattern_info->host_len);
> + char *url_tok, *pat_tok, *url_save, *pat_save;
> + int matching;
> +
> + url_tok = strtok_r(url, ".", &url_save);
> + pat_tok = strtok_r(pat, ".", &pat_save);
> +
> + for (; url_tok && pat_tok; url_tok = strtok_r(NULL, ".", &url_save),
> +    pat_tok = strtok_r(NULL, ".", &pat_save)) {
> + if (!strcmp(pat_tok, "*"))
> + continue; /* a simple glob matches everything */
> +
> + if (strcmp(url_tok, pat_tok)) {
> + /* subdomains do not match */
> + matching = 0;
> + break;
> + }
> + }
> +
> + /* matching if both URL and pattern are at their ends */
> + matching = (url_tok == NULL && pat_tok == NULL);
> +
> + free(url);
> + free(pat);
> +
> + return matching;
> +}
> +
> static char *url_normalize_1(const char *url, struct url_info *out_info, 
> char allow_globs)
> {
>  /*
> @@ -467,9 +499,7 @@ static int match_urls(const struct url_info *url,
>  }
>
>  /* check the host */
> - if (url_prefix->host_len != url->host_len ||
> -     strncmp(url->url + url->host_off,
> -     url_prefix->url + url_prefix->host_off, url->host_len))
> + if (!match_host(url, url_prefix))
>  return 0; /* host names do not match */
>
>  /* check the port */
> @@ -512,7 +542,7 @@ int urlmatch_config_entry(const char *var, const char 
> *value, void *cb)
>  struct url_info norm_info;
>
>  config_url = xmemdupz(key, dot - key);
> - norm_url = url_normalize(config_url, &norm_info);
> + norm_url = url_normalize_1(config_url, &norm_info, 1);
>  free(config_url);
>  if (!norm_url)
>  return 0;
> -- 
> 2.11.0
>
> 


^ permalink raw reply

* diff --color-words breaks spacing
From: Phil Hord @ 2017-01-24 17:39 UTC (permalink / raw)
  To: Git

I noticed some weird spacing when comparing files with git diff
--color-words.  The space before a colored word disappears sometimes.

$ git --version
git version 2.11.0.485.g4e59582ff

echo "FOO foo; foo = bar" > a
echo "FOO foo = baz" > b
git diff --color-words --no-index a b
FOOfoo; foo = barbaz

There should be a space after FOO in the diff, even if git doesn't
think "foo" and "foo;" are the same word.

If I remove the semicolon, it looks better, but in fact it only moves
the error later. The missing space is now between the two "foo" words.

echo "FOO foo foo = bar" > a
echo "FOO foo = baz" > b
git diff --color-words --no-index a b
FOO foofoo = barbaz

Here's the same with the color codes changed to text for purposes of this email:

echo "FOO foo; foo = bar" > a
echo "FOO foo = baz" > b
git diff --color-words --no-index a b | tr  \\033 E
FOOE[31mfoo;E[m foo = E[31mbarE[mE[32mbazE[m

echo "FOO foo foo = bar" > a
echo "FOO foo = baz" > b
git diff --color-words --no-index a b | tr  \\033 E
FOO fooE[31mfooE[m = E[31mbarE[mE[32mbazE[m

^ permalink raw reply

* Re: [RFC 2/2] grep: use '/' delimiter for paths
From: Phil Hord @ 2017-01-24 17:18 UTC (permalink / raw)
  To: Stefan Hajnoczi, Junio C Hamano; +Cc: Jeff King, Git
In-Reply-To: <20170123132918.GM29186@stefanha-x1.localdomain>

On Tue, Jan 24, 2017 at 1:54 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > The use of "git show" you are demonstrating is still about showing
> > the commit object, whose behaviour is defined to show the log
> > message and the diff relative to its sole parent, limited to the
> > paths that match the pathspec.
> >
> > It is perfectly fine and desirable that "git show <commit>:<path>"
> > and "git show <commit> -- <path>" behaves differently.  These are
> > two completely different features.
>
> Thanks for explaining guys.  It all makes logical sense.  I just hope I
> remember the distinctions in that table for everyday git use.


Familiar itch; previous discussion here:

https://public-inbox.org/git/1377528372-31206-1-git-send-email-hordp@cisco.com/

Phil

^ permalink raw reply

* [PATCH v2 4/4] urlmatch: allow globbing for the URL host part
From: Patrick Steinhardt @ 2017-01-24 17:00 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

The URL matching function computes for two URLs whether they match not.
The match is performed by splitting up the URL into different parts and
then doing an exact comparison with the to-be-matched URL.

The main user of `urlmatch` is the configuration subsystem. It allows to
set certain configurations based on the URL which is being connected to
via keys like `http.<url>.*`. A common use case for this is to set
proxies for only some remotes which match the given URL. Unfortunately,
having exact matches for all parts of the URL can become quite tedious
in some setups. Imagine for example a corporate network where there are
dozens or even hundreds of subdomains, which would have to be configured
individually.

This commit introduces the ability to use globbing in the host-part of
the URLs. A user can simply specify a `*` as part of the host name to
match all subdomains at this level. For example adding a configuration
key `http.https://*.example.com.proxy` will match all subdomains of
`https://example.com`.

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
---
 Documentation/config.txt |  5 ++++-
 t/t1300-repo-config.sh   | 36 ++++++++++++++++++++++++++++++++++++
 urlmatch.c               | 38 ++++++++++++++++++++++++++++++++++----
 3 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 506431267..a78921c2b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1914,7 +1914,10 @@ http.<url>.*::
   must match exactly between the config key and the URL.
 
 . Host/domain name (e.g., `example.com` in `https://example.com/`).
-  This field must match exactly between the config key and the URL.
+  This field must match between the config key and the URL. It is
+  possible to use globs in the config key to match all subdomains, e.g.
+  `https://*.example.com/` to match all subdomains of `example.com`. Note
+  that a glob only every matches a single part of the hostname.
 
 . Port number (e.g., `8080` in `http://example.com:8080/`).
   This field must match exactly between the config key and the URL.
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 923bfc5a2..ec545e092 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1177,6 +1177,42 @@ test_expect_success 'urlmatch' '
 	test_cmp expect actual
 '
 
+test_expect_success 'glob-based urlmatch' '
+	cat >.git/config <<-\EOF &&
+	[http]
+		sslVerify
+	[http "https://*.example.com"]
+		sslVerify = false
+		cookieFile = /tmp/cookie.txt
+	EOF
+
+	test_expect_code 1 git config --bool --get-urlmatch doesnt.exist https://good.example.com >actual &&
+	test_must_be_empty actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.SSLverify https://example.com >actual &&
+	test_cmp expect actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.SSLverify https://good-example.com >actual &&
+	test_cmp expect actual &&
+
+	echo true >expect &&
+	git config --bool --get-urlmatch http.sslverify https://deep.nested.example.com >actual &&
+	test_cmp expect actual &&
+
+	echo false >expect &&
+	git config --bool --get-urlmatch http.sslverify https://good.example.com >actual &&
+	test_cmp expect actual &&
+
+	{
+		echo http.cookiefile /tmp/cookie.txt &&
+		echo http.sslverify false
+	} >expect &&
+	git config --get-urlmatch HTTP https://good.example.com >actual &&
+	test_cmp expect actual
+'
+
 # good section hygiene
 test_expect_failure 'unsetting the last key in a section removes header' '
 	cat >.git/config <<-\EOF &&
diff --git a/urlmatch.c b/urlmatch.c
index e328905eb..53ff972a6 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -63,6 +63,38 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
+static int match_host(const struct url_info *url_info,
+		      const struct url_info *pattern_info)
+{
+	char *url = xmemdupz(url_info->url + url_info->host_off, url_info->host_len);
+	char *pat = xmemdupz(pattern_info->url + pattern_info->host_off, pattern_info->host_len);
+	char *url_tok, *pat_tok, *url_save, *pat_save;
+	int matching;
+
+	url_tok = strtok_r(url, ".", &url_save);
+	pat_tok = strtok_r(pat, ".", &pat_save);
+
+	for (; url_tok && pat_tok; url_tok = strtok_r(NULL, ".", &url_save),
+				   pat_tok = strtok_r(NULL, ".", &pat_save)) {
+		if (!strcmp(pat_tok, "*"))
+			continue; /* a simple glob matches everything */
+
+		if (strcmp(url_tok, pat_tok)) {
+			/* subdomains do not match */
+			matching = 0;
+			break;
+		}
+	}
+
+	/* matching if both URL and pattern are at their ends */
+	matching = (url_tok == NULL && pat_tok == NULL);
+
+	free(url);
+	free(pat);
+
+	return matching;
+}
+
 static char *url_normalize_1(const char *url, struct url_info *out_info, char allow_globs)
 {
 	/*
@@ -467,9 +499,7 @@ static int match_urls(const struct url_info *url,
 	}
 
 	/* check the host */
-	if (url_prefix->host_len != url->host_len ||
-	    strncmp(url->url + url->host_off,
-		    url_prefix->url + url_prefix->host_off, url->host_len))
+	if (!match_host(url, url_prefix))
 		return 0; /* host names do not match */
 
 	/* check the port */
@@ -512,7 +542,7 @@ int urlmatch_config_entry(const char *var, const char *value, void *cb)
 		struct url_info norm_info;
 
 		config_url = xmemdupz(key, dot - key);
-		norm_url = url_normalize(config_url, &norm_info);
+		norm_url = url_normalize_1(config_url, &norm_info, 1);
 		free(config_url);
 		if (!norm_url)
 			return 0;
-- 
2.11.0


^ permalink raw reply related

* [PATCH v2 3/4] urlmatch: split host and port fields in `struct url_info`
From: Patrick Steinhardt @ 2017-01-24 17:00 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

The `url_info` structure contains information about a normalized URL
with the URL's components being represented by different fields. The
host and port part though are to be accessed by the same `host` field,
so that getting the host and/or port separately becomes more involved
than really necessary.

To make the port more readily accessible, split up the host and port
fields. Namely, the `host_len` will not include the port length anymore
and a new `port_off` field has been added which includes the offset to
the port, if available.

The only user of these fields is `url_normalize_1`. This change makes it
easier later on to treat host and port differently when introducing
globs for domains.

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
---
 urlmatch.c | 16 ++++++++++++----
 urlmatch.h |  9 +++++----
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index d350478c0..e328905eb 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -104,7 +104,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 	struct strbuf norm;
 	size_t spanned;
 	size_t scheme_len, user_off=0, user_len=0, passwd_off=0, passwd_len=0;
-	size_t host_off=0, host_len=0, port_len=0, path_off, path_len, result_len;
+	size_t host_off=0, host_len=0, port_off=0, port_len=0, path_off, path_len, result_len;
 	const char *slash_ptr, *at_ptr, *colon_ptr, *path_start;
 	char *result;
 
@@ -263,6 +263,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 				return NULL;
 			}
 			strbuf_addch(&norm, ':');
+			port_off = norm.len;
 			strbuf_add(&norm, url, slash_ptr - url);
 			port_len = slash_ptr - url;
 		}
@@ -270,7 +271,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 		url = slash_ptr;
 	}
 	if (host_off)
-		host_len = norm.len - host_off;
+		host_len = norm.len - host_off - (port_len ? port_len + 1 : 0);
 
 
 	/*
@@ -378,6 +379,7 @@ static char *url_normalize_1(const char *url, struct url_info *out_info, char al
 		out_info->passwd_len = passwd_len;
 		out_info->host_off = host_off;
 		out_info->host_len = host_len;
+		out_info->port_off = port_off;
 		out_info->port_len = port_len;
 		out_info->path_off = path_off;
 		out_info->path_len = path_len;
@@ -464,11 +466,17 @@ static int match_urls(const struct url_info *url,
 		usermatched = 1;
 	}
 
-	/* check the host and port */
+	/* check the host */
 	if (url_prefix->host_len != url->host_len ||
 	    strncmp(url->url + url->host_off,
 		    url_prefix->url + url_prefix->host_off, url->host_len))
-		return 0; /* host names and/or ports do not match */
+		return 0; /* host names do not match */
+
+	/* check the port */
+	if (url_prefix->port_len != url->port_len ||
+	    strncmp(url->url + url->port_off,
+		    url_prefix->url + url_prefix->port_off, url->port_len))
+		return 0; /* ports do not match */
 
 	/* check the path */
 	pathmatchlen = url_match_prefix(
diff --git a/urlmatch.h b/urlmatch.h
index 528862adc..0ea812b03 100644
--- a/urlmatch.h
+++ b/urlmatch.h
@@ -18,11 +18,12 @@ struct url_info {
 	size_t passwd_len;	/* length of passwd; if passwd_off != 0 but
 				   passwd_len == 0, an empty passwd was given */
 	size_t host_off;	/* offset into url to start of host name (0 => none) */
-	size_t host_len;	/* length of host name; this INCLUDES any ':portnum';
+	size_t host_len;	/* length of host name;
 				 * file urls may have host_len == 0 */
-	size_t port_len;	/* if a portnum is present (port_len != 0), it has
-				 * this length (excluding the leading ':') at the
-				 * end of the host name (always 0 for file urls) */
+	size_t port_off;	/* offset into url to start of port number (0 => none) */
+	size_t port_len;	/* if a portnum is present (port_off != 0), it has
+				 * this length (excluding the leading ':') starting
+				 * from port_off (always 0 for file urls) */
 	size_t path_off;	/* offset into url to the start of the url path;
 				 * this will always point to a '/' character
 				 * after the url has been normalized */
-- 
2.11.0


^ permalink raw reply related

* [PATCH v2 2/4] urlmatch: enable normalization of URLs with globs
From: Patrick Steinhardt @ 2017-01-24 17:00 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

The `url_normalize` function is used to validate and normalize URLs. As
such, it does not allow for some special characters to be part of the
URLs that are to be normalized. As we want to allow using globs in some
configuration keys making use of URLs, namely `http.<url>.<key>`, but
still normalize them, we need to somehow enable some additional allowed
characters.

To do this without having to change all callers of `url_normalize`,
where most do not actually want globbing at all, we split off another
function `url_normalize_1`. This function accepts an additional
parameter `allow_globs`, which is subsequently called by `url_normalize`
with `allow_globs=0`.

As of now, this function is not used with globbing enabled. A caller
will be added in the following commit.

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
---
 urlmatch.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/urlmatch.c b/urlmatch.c
index 132d342bc..d350478c0 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -63,7 +63,7 @@ static int append_normalized_escapes(struct strbuf *buf,
 	return 1;
 }
 
-char *url_normalize(const char *url, struct url_info *out_info)
+static char *url_normalize_1(const char *url, struct url_info *out_info, char allow_globs)
 {
 	/*
 	 * Normalize NUL-terminated url using the following rules:
@@ -191,7 +191,12 @@ char *url_normalize(const char *url, struct url_info *out_info)
 		strbuf_release(&norm);
 		return NULL;
 	}
-	spanned = strspn(url, URL_HOST_CHARS);
+
+	if (allow_globs)
+		spanned = strspn(url, URL_HOST_CHARS "*");
+	else
+		spanned = strspn(url, URL_HOST_CHARS);
+
 	if (spanned < colon_ptr - url) {
 		/* Host name has invalid characters */
 		if (out_info) {
@@ -380,6 +385,11 @@ char *url_normalize(const char *url, struct url_info *out_info)
 	return result;
 }
 
+char *url_normalize(const char *url, struct url_info *out_info)
+{
+	return url_normalize_1(url, out_info, 0);
+}
+
 static size_t url_match_prefix(const char *url,
 			       const char *url_prefix,
 			       size_t url_prefix_len)
-- 
2.11.0


^ permalink raw reply related

* [PATCH v2 1/4] mailmap: add Patrick Steinhardt's work address
From: Patrick Steinhardt @ 2017-01-24 17:00 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

Signed-off-by: Patrick Steinhardt <patrick.steinhardt@elego.de>
---
 .mailmap | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.mailmap b/.mailmap
index 9c87a3840..ea59205b9 100644
--- a/.mailmap
+++ b/.mailmap
@@ -177,6 +177,7 @@ Paolo Bonzini <bonzini@gnu.org> <paolo.bonzini@lu.unisi.ch>
 Pascal Obry <pascal@obry.net> <pascal.obry@gmail.com>
 Pascal Obry <pascal@obry.net> <pascal.obry@wanadoo.fr>
 Pat Notz <patnotz@gmail.com> <pknotz@sandia.gov>
+Patrick Steinhardt <ps@pks.im> <patrick.steinhardt@elego.de>
 Paul Mackerras <paulus@samba.org> <paulus@dorrigo.(none)>
 Paul Mackerras <paulus@samba.org> <paulus@pogo.(none)>
 Peter Baumann <waste.manager@gmx.de> <Peter.B.Baumann@stud.informatik.uni-erlangen.de>
-- 
2.11.0


^ permalink raw reply related

* [PATCH v2 0/4] urlmatch: allow regexp-based matches
From: Patrick Steinhardt @ 2017-01-24 17:00 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <20170123130635.29577-1-patrick.steinhardt@elego.de>

Hi,

This is version two of my patch series.

The use case is to be able to configure an HTTP proxy for all
subdomains of a domain where there are hundreds of subdomains.

Previously, I have been using complete regular expressions with
an escape-mechanism to match the configuration key's URLs.
According to Junio's comments, I changed this mechanism to a much
simpler one, where the user is only allowed to use globbing for
the host part of the URL. That is a user can now specify a key
`http.https://*.example.com` to match all sub-domains of
`example.com`. For now I've decided to implement it such that a
single `*` matches a single subdomain only, so for example
`https://foo.bar.example.com` would not match in this case. This
is similar to how shell-globbing works usually, so it should not
be of much surprise. It's also highlighted in the documentation.

I did not include an interdiff as too much has changed between
the two versions.

Regards
Patrick

Patrick Steinhardt (4):
  mailmap: add Patrick Steinhardt's work address
  urlmatch: enable normalization of URLs with globs
  urlmatch: split host and port fields in `struct url_info`
  urlmatch: allow globbing for the URL host part

 .mailmap                 |  1 +
 Documentation/config.txt |  5 +++-
 t/t1300-repo-config.sh   | 36 +++++++++++++++++++++++++
 urlmatch.c               | 68 +++++++++++++++++++++++++++++++++++++++++-------
 urlmatch.h               |  9 ++++---
 5 files changed, 104 insertions(+), 15 deletions(-)

-- 
2.11.0


^ permalink raw reply

* Re: What's cooking in git.git (Jan 2017, #04; Mon, 23)
From: Lars Schneider @ 2017-01-24 14:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git
In-Reply-To: <20170124132749.l3ezupyitvxe4t2l@sigill.intra.peff.net>


> On 24 Jan 2017, at 14:27, Jeff King <peff@peff.net> wrote:
> 
> On Tue, Jan 24, 2017 at 11:04:21AM +0100, Lars Schneider wrote:
> 
>> "fsck: prepare dummy objects for --connectivity-check" seems to
>> make t1450-fsck.sh fail on macOS with TravisCI:
>> 
>> Initialized empty Git repository in /Users/travis/build/git/git/t/trash directory.t1450-fsck/connectivity-only/.git/
>> [master (root-commit) 86520b7] empty
>> Author: A U Thor <author@example.com>
>> 2 files changed, 1 insertion(+)
>> create mode 100644 empty
>> create mode 100644 empty.t
>> override r--r--r--  travis/staff for .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391? (y/n [n]) not overwritten
>> dangling blob c21c9352f7526e9576892a6631e0e8cf1fccd34d
> 
> Looks like "mv" prompts and then fails to move the file (so we get the
> dangling blob for the source blob, and fsck doesn't report failure because
> we didn't actually corrupt the destination blob).
> 
> If I'm understanding the behavior correctly, this violates POSIX, which
> says:
> 
>  1. If the destination path exists, the −f option is not specified, and
>     either of the following conditions is true:
> 
>     a. The permissions of the destination path do not permit writing
>     and the standard input is a terminal.
> 
>     b. The −i option is specified.
> 
>     the mv utility shall write a prompt to standard error and read a
>     line from standard input. If the response is not affirmative,
>     mv shall do nothing more with the current source_file and go on
>     to any remaining source_files.
> 
> Our stdin isn't a terminal, and we do not specify "-i", so I think it
> shouldn't prompt.  It also exits with code 0 after failing to move,
> which is another surprise.
> 
> Here's a patch (tested by me that it works on Linux, but I don't know
> for sure that it fixes the Travis problem).
> 
> -- >8 --
> Subject: [PATCH] t1450: use "mv -f" within loose object directory
> 
> The loose objects are created with mode 0444. That doesn't
> prevent them being overwritten by rename(), but some
> versions of "mv" will be extra careful and prompt the user,
> even without "-i".
> 
> Reportedly macOS does this, at least in the Travis builds.
> The prompt reads from /dev/null, defaulting to "no", and the
> object isn't moved. Then to make matters even more
> interesting, it still returns "0" and the rest of the test
> proceeds, but with a broken setup.
> 
> We can work around it by using "mv -f" to override the
> prompt. This should work as it's already used in t5504 for
> the same purpose.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> t/t1450-fsck.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index efaf41b68..33a51c9a6 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh
> @@ -536,7 +536,7 @@ test_expect_success 'fsck --connectivity-only' '
> 		# free to examine the type if it chooses.
> 		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
> 		blob=$(echo unrelated | git hash-object -w --stdin) &&
> -		mv $(sha1_file $blob) $empty &&
> +		mv -f $(sha1_file $blob) $empty &&
> 
> 		test_must_fail git fsck --strict &&
> 		git fsck --strict --connectivity-only &&
> -- 
> 2.11.0.840.gd37c5973a

Ack. This patch fixes the issue:
https://travis-ci.org/larsxschneider/git/builds/194819605

Thanks,
Lars

^ permalink raw reply

* Re: [PATCH] difftool.c: mark a file-local symbol with static
From: Jeff King @ 2017-01-24 14:23 UTC (permalink / raw)
  To: David Aguilar
  Cc: Junio C Hamano, Ramsay Jones, Johannes Schindelin,
	GIT Mailing-list
In-Reply-To: <20170122052608.tpr5pihfgafhoynj@gmail.com>

mn Sat, Jan 21, 2017 at 09:26:08PM -0800, David Aguilar wrote:

> > > An obvious second
> > > best option would be to drop -Wall from the "everybody" CFLAGS and
> > > move it to DEVELOPER_CFLAGS instead.
> > 
> > Yeah, though that doesn't help the example above.
> > 
> > As ugly as warning("%s", "") is, I think it may be the thing that annoys
> > the smallest number of people.
> > 
> > -Peff
> 
> How about using warning(" ") instead?
> 
> For difftool.c specifically, the following is a fine solution,
> and doesn't require that we change our warning flags just for
> this one file.

I dunno. As ugly as the "%s" thing is in the source, at least it doesn't
change the output. Not that an extra space is the end of the world, but
it seems like it's letting the problem escape from the source code.

Do people still care about resolving this? -Wno-format-zero-length is in
the DEVELOPER options. It wasn't clear to me if that was sufficient, or
if we're going to get a bunch of reports from people that need to be
directed to the right compiler options.

The problematic code just hit 'next', so I suppose we'll see soon (OTOH,
the real test is when it get shipped as part of a release).

-Peff

^ permalink raw reply

* Re: [PATCH v2 0/5] string-list: make string_list_sort() reentrant
From: Jeff King @ 2017-01-24 13:44 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: René Scharfe, Git List, Junio C Hamano
In-Reply-To: <alpine.DEB.2.20.1701241233390.3469@virtualbox>

On Tue, Jan 24, 2017 at 12:44:10PM +0100, Johannes Schindelin wrote:

> Hi Peff,
> 
> On Mon, 23 Jan 2017, Jeff King wrote:
> 
> > Is there any interest in people adding the ISO qsort_s() to their libc
> > implementations? It seems like it's been a fair number of years by now.
> 
> Visual C supports it *at least* since Visual Studio 2005:
> 
> https://msdn.microsoft.com/en-us/library/4xc60xas(v=vs.80).aspx
> 
> With René's patch, we have an adapter for GNU libc, and if anybody comes
> up with the (equally trivial) adapter for BSD libc's qsort_r(), we have a
> lot of bases covered.

Sadly, no. Microsoft's qsort_s() is not compatible with the ISO C one.
And BSD's qsort_r() has a similar problem acting as a wrapper, because
the order of arguments in the callback functions is different (so you'd
have to actually wrap the callback, too, and rearrange the arguments for
each call, or do some macro trickery).

Gory details are in:

  https://stackoverflow.com/questions/39560773/different-declarations-of-qsort-r-on-mac-and-linux/39561369

and the original thread:

  http://public-inbox.org/git/3083fbf7-d67e-77e4-e05f-94a7e7e15eba@web.de/

-Peff

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Christian Couder @ 2017-01-24 13:32 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Matthieu Moy, git, Pranit Bauva, Lars Schneider, Jeff King,
	Carlos Martín Nieto, Thomas Gummerer
In-Reply-To: <alpine.DEB.2.20.1701241228020.3469@virtualbox>

On Tue, Jan 24, 2017 at 12:28 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi Matthieu,
>
> On Mon, 23 Jan 2017, Matthieu Moy wrote:
>
>> * Who's willing to mentor?
>
> As in the years before, I am willing to mentor.

I am also willing to mentor.

Thanks!

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2017, #04; Mon, 23)
From: Jeff King @ 2017-01-24 13:27 UTC (permalink / raw)
  To: Lars Schneider; +Cc: Junio C Hamano, git
In-Reply-To: <0D956B23-E655-4C28-A205-14CCC0A7DEA2@gmail.com>

On Tue, Jan 24, 2017 at 11:04:21AM +0100, Lars Schneider wrote:

> "fsck: prepare dummy objects for --connectivity-check" seems to
> make t1450-fsck.sh fail on macOS with TravisCI:
> 
> Initialized empty Git repository in /Users/travis/build/git/git/t/trash directory.t1450-fsck/connectivity-only/.git/
> [master (root-commit) 86520b7] empty
>  Author: A U Thor <author@example.com>
>  2 files changed, 1 insertion(+)
>  create mode 100644 empty
>  create mode 100644 empty.t
> override r--r--r--  travis/staff for .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391? (y/n [n]) not overwritten
> dangling blob c21c9352f7526e9576892a6631e0e8cf1fccd34d

Looks like "mv" prompts and then fails to move the file (so we get the
dangling blob for the source blob, and fsck doesn't report failure because
we didn't actually corrupt the destination blob).

If I'm understanding the behavior correctly, this violates POSIX, which
says:

  1. If the destination path exists, the −f option is not specified, and
     either of the following conditions is true:

     a. The permissions of the destination path do not permit writing
     and the standard input is a terminal.

     b. The −i option is specified.

     the mv utility shall write a prompt to standard error and read a
     line from standard input. If the response is not affirmative,
     mv shall do nothing more with the current source_file and go on
     to any remaining source_files.

Our stdin isn't a terminal, and we do not specify "-i", so I think it
shouldn't prompt.  It also exits with code 0 after failing to move,
which is another surprise.

Here's a patch (tested by me that it works on Linux, but I don't know
for sure that it fixes the Travis problem).

-- >8 --
Subject: [PATCH] t1450: use "mv -f" within loose object directory

The loose objects are created with mode 0444. That doesn't
prevent them being overwritten by rename(), but some
versions of "mv" will be extra careful and prompt the user,
even without "-i".

Reportedly macOS does this, at least in the Travis builds.
The prompt reads from /dev/null, defaulting to "no", and the
object isn't moved. Then to make matters even more
interesting, it still returns "0" and the rest of the test
proceeds, but with a broken setup.

We can work around it by using "mv -f" to override the
prompt. This should work as it's already used in t5504 for
the same purpose.

Signed-off-by: Jeff King <peff@peff.net>
---
 t/t1450-fsck.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index efaf41b68..33a51c9a6 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -536,7 +536,7 @@ test_expect_success 'fsck --connectivity-only' '
 		# free to examine the type if it chooses.
 		empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
 		blob=$(echo unrelated | git hash-object -w --stdin) &&
-		mv $(sha1_file $blob) $empty &&
+		mv -f $(sha1_file $blob) $empty &&
 
 		test_must_fail git fsck --strict &&
 		git fsck --strict --connectivity-only &&
-- 
2.11.0.840.gd37c5973a


^ permalink raw reply related

* Re: [PATCH v2 0/5] string-list: make string_list_sort() reentrant
From: Johannes Schindelin @ 2017-01-24 11:44 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Git List, Junio C Hamano
In-Reply-To: <20170123235445.qsejumltutd2vrhd@sigill.intra.peff.net>

[-- Attachment #1: Type: text/plain, Size: 522 bytes --]

Hi Peff,

On Mon, 23 Jan 2017, Jeff King wrote:

> Is there any interest in people adding the ISO qsort_s() to their libc
> implementations? It seems like it's been a fair number of years by now.

Visual C supports it *at least* since Visual Studio 2005:

https://msdn.microsoft.com/en-us/library/4xc60xas(v=vs.80).aspx

With René's patch, we have an adapter for GNU libc, and if anybody comes
up with the (equally trivial) adapter for BSD libc's qsort_r(), we have a
lot of bases covered.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH] blame: add option to print tips (--tips)
From: Pranit Bauva @ 2017-01-24 11:36 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Edmundo Carmona Antoranz, Git List
In-Reply-To: <xmqqr33tsjwx.fsf@gitster.mtv.corp.google.com>

Hey Junio,

On Tue, Jan 24, 2017 at 12:06 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Pranit Bauva <pranit.bauva@gmail.com> writes:
>
>> We can probably make it useful with some extended efforts. I use
>> git-blame and I sometimes find that I don't need things like the name
>> of the author, time, timezone and not even the file name and I have to
>> use a bigger terminal. If we could somehow remove those fields then
>> maybe this would be a useful feature.
>
> I admit that I didn't recall the option until somebody else told me,
> but I think "blame -s" or something like that for that purpose ;-)

Ah! Thanks a lot!

Regards,
Pranit Bauva

^ permalink raw reply

* Re: [PATCH v1 0/2] urlmatch: allow regexp-based matches
From: Patrick Steinhardt @ 2017-01-24 11:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Patrick Steinhardt
In-Reply-To: <xmqqsho9r1rs.fsf@gitster.mtv.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 2423 bytes --]

On Mon, Jan 23, 2017 at 11:53:43AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <patrick.steinhardt@elego.de> writes:
> 
> > This patch is mostly a request for comments. The use case is to
> > be able to configure an HTTP proxy for all subdomains of a
> > certain domain where there are hundreds of subdomains. The most
> > flexible way I could imagine was by using regular expressions for
> > the matching, which is how I implemented it for now. So users can
> > now create a configuration key like
> > `http.?http://.*\\.example\\.com.*` to apply settings for all
> > subdomains of `example.com`.
> 
> While reading 2/2, I got an impression that this is "too" flexible
> and possibly operates at a wrong level.  I would have expected that
> the wildcarding to be limited to the host part only and hook into
> match_urls(), allowing the users of the new feature to still take
> advantage of the existing support of "http://me@example.com" that
> limits the match to the case that the connection is authenticated
> for a user, for example, by newly allowing "http://me@*.example.com"
> or something like that.
> 
> Because you cannot have a literal '*' in your hostname, I would
> imagine that supporting a match pattern "http://me@*.example.com"
> would be already backward compatible without requiring a leading
> question-mark.
> 
> I also personally would prefer these textual matching to be done
> with glob not with regexp, by the way, as the above description of
> mine shows.
> 
> Thanks.

Thanks for your feedback. Using globs in the hostname only was my
first intent, as well. I later on took regular expressions
instead so as to allow further flexibility for the user. The
reasoning was that there might be other use cases which cannot
actually be solved with using globs only, even if I myself wasn't
aware of different ones. So this might be indeed over-engineered
when using regular expressions.

There are several questions though regarding semantics with
globs, where I'd like to have additional opinions on.

- should a glob only be allowed for actual subdomains, allowing
  "http://*.example.com" but not "http://*example.com"?

- should a glob also match multiple nestings of subdomains? E.g.
  "http://*.example.com" would match "http://foo.example.com" but
  not "http://foo.bar.example.com"?

I'll send a version 2 soon-ish.

Regards
Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Johannes Schindelin @ 2017-01-24 11:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: git, Pranit Bauva, Lars Schneider, Christian Couder, Jeff King,
	Carlos Martín Nieto, Thomas Gummerer
In-Reply-To: <vpq1svtstud.fsf@anie.imag.fr>

Hi Matthieu,

On Mon, 23 Jan 2017, Matthieu Moy wrote:

> * Who's willing to mentor?

As in the years before, I am willing to mentor.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH 0/3] stash: support filename argument
From: Johannes Schindelin @ 2017-01-24 10:56 UTC (permalink / raw)
  To: Thomas Gummerer
  Cc: git, Stephan Beyer, Junio C Hamano, Marc Strapetz, Jeff King
In-Reply-To: <20170121200804.19009-1-t.gummerer@gmail.com>

Hi Thomas,

On Sat, 21 Jan 2017, Thomas Gummerer wrote:

> This is the first try to implement the RFC I posted a week ago [1].  It
> introduces a new push verb for git stash.  I couldn't come up with
> any better name that wasn't already taken.  If anyone has ideas I'd be
> very happy to hear them.

I would have preferred a series of patches that essentially adds a new and
improved `save` syntax:

git stash [save] [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
          [-u|--include-untracked] [-a|--all] [-m <message>]]
          [-- <path>...]

and keeps the legacy syntax, but deprecates it:

git stash [save [-p|--patch] [-k|--[no-]keep-index] [-q|--quiet]
          [-u|--include-untracked] [-a|--all] [<message>]]

The problem with that is, of course, that 3c2eb80fe3 (stash: simplify
defaulting to "save" and reject unknown options, 2009-08-18) in its
infinite wisdom *already* introduced the `--` separator to drop out of
option parsing.

On a positive note, it is a thorn in Git's CUI that `git stash` implies
the `save` command, and that `save` is not at all the opposite of `apply`
or `pop`. Your introduction of the `push` command will fix that flaw, and
we can *still* deprecate the `save` command.

Ciao,
Johannes

^ permalink raw reply

* Re: What's cooking in git.git (Jan 2017, #04; Mon, 23)
From: Lars Schneider @ 2017-01-24 10:04 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git
In-Reply-To: <xmqqo9yxpaxk.fsf@gitster.mtv.corp.google.com>


> On 24 Jan 2017, at 01:18, Junio C Hamano <gitster@pobox.com> wrote:
> 
> * jk/fsck-connectivity-check-fix (2017-01-17) 6 commits
>  (merged to 'next' on 2017-01-23 at e8e9b76b84)
> + fsck: check HAS_OBJ more consistently
> + fsck: do not fallback "git fsck <bogus>" to "git fsck"
> + fsck: tighten error-checks of "git fsck <head>"
> + fsck: prepare dummy objects for --connectivity-check
> + fsck: report trees as dangling
> + t1450: clean up sub-objects in duplicate-entry test
> 
> "git fsck --connectivity-check" was not working at all.
> 
> Will merge to 'master'.

"fsck: prepare dummy objects for --connectivity-check" seems to
make t1450-fsck.sh fail on macOS with TravisCI:

Initialized empty Git repository in /Users/travis/build/git/git/t/trash directory.t1450-fsck/connectivity-only/.git/
[master (root-commit) 86520b7] empty
 Author: A U Thor <author@example.com>
 2 files changed, 1 insertion(+)
 create mode 100644 empty
 create mode 100644 empty.t
override r--r--r--  travis/staff for .git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391? (y/n [n]) not overwritten
dangling blob c21c9352f7526e9576892a6631e0e8cf1fccd34d
test_must_fail: command succeeded: git fsck --strict
not ok 58 - fsck --connectivity-only

More test output: https://travis-ci.org/git/git/jobs/194663620

For some reason I am not able to replicate that behavior on my local
macOS machine. I found the commit using bisect on TravisCI:
https://api.travis-ci.org/jobs/194746454/log.txt?deansi=true

Any idea what might be wrong?

- Lars

^ permalink raw reply

* Re: [RFC 2/2] grep: use '/' delimiter for paths
From: Stefan Hajnoczi @ 2017-01-23 13:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git
In-Reply-To: <xmqq60l9wdb9.fsf@gitster.mtv.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 1531 bytes --]

On Fri, Jan 20, 2017 at 02:56:26PM -0800, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > It's not ignored; just as with git-log, it's a pathspec to limit the
> > diff. E.g.:
> >
> >   $ git show --name-status v2.9.3
> >   ...
> >   M       Documentation/RelNotes/2.9.3.txt
> >   M       Documentation/git.txt
> >   M       GIT-VERSION-GEN
> >
> >   $ git show --name-status v2.9.3 -- Documentation
> >   M       Documentation/RelNotes/2.9.3.txt
> >   M       Documentation/git.txt
> >
> > That's typically less useful than it is with log (where limiting the
> > diff also kicks in history simplification and omits some commits
> > entirely). But it does do something.
> 
> I think Stefan is missing the fact that the argument to "git show
> <tree-ish>:<path>" actually is naming a blob that sits at the <path>
> in the <tree-ish>.  In other words, "show" is acting as a glorified
> "git -p cat-file blob", in that use.
> 
> The use of "git show" you are demonstrating is still about showing
> the commit object, whose behaviour is defined to show the log
> message and the diff relative to its sole parent, limited to the
> paths that match the pathspec.
> 
> It is perfectly fine and desirable that "git show <commit>:<path>"
> and "git show <commit> -- <path>" behaves differently.  These are
> two completely different features.

Thanks for explaining guys.  It all makes logical sense.  I just hope I
remember the distinctions in that table for everyday git use.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply

* Re: [RFC 1/2] grep: only add delimiter if there isn't one already
From: Stefan Hajnoczi @ 2017-01-23 13:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqlgu5y4u8.fsf@gitster.mtv.corp.google.com>

[-- Attachment #1: Type: text/plain, Size: 2866 bytes --]

On Fri, Jan 20, 2017 at 10:16:31AM -0800, Junio C Hamano wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > v2.9.3::Makefile may convey that the user originally provided v2.9.3:
> > but is that actually useful information?
> 
> You are either asking a wrong question, or asking a wrong person
> (i.e. Git) the question.  The real question is why the user added a
> colon at the end, when "v2.9.3" alone would have sufficed.  What did
> the user want to get out of giving an extra colon like "v2.9.3:"?
> 
> One answer I can think of is that it is a degenerate case of [2/2],
> i.e. if "v2.9.3:t" were an appropriate way to look in the subtree
> "t" of "v2.9.3", "v2.9.3:" would be the appropriate way to look in
> the whole tree of "v2.9.3".
> 
> I understand, from your response to my comment in the thread for
> [2/2], that the reason why "v2.9.3:t" was used in the example was
> because you felt uncomfortable with using pathspec.  
> 
> That's superstition.
> 
> My only piece of advice to folks who feel that way is to learn Git
> more and get comfortable.  You can do neat things like
> 
>    $ git grep -e pattern rev -- t ':!t/helper/'
> 
> that you cannot do with "rev:t", for example ;-)

Neat, thanks for showing the path exclusion syntax.  I wasn't aware of
it.

> All examples we saw so far are the ones that the user used the colon
> syntax when it is more natural to express the command without it.  I
> am hesitant to see the behaviour of the command changed to appease
> such suboptimal use cases if it requires us to discard a bit of
> information, when we haven't established it is OK to lose.
> 
> There may be a case
> 
>  (1) where the colon syntax is the most natural thing to use, AND
>      script reading the output that used that syntax is forced to do
>      unnecessary work because "git grep" parrots the colon
>      literally, instread of being more intelligent about it
>      (i.e. omitting or substituting with slash when the input is a
>      tree object, not a tree-ish, as discussed in the thread).
> 
>  (2) where the colon syntax is the most natural thing, AND script
>      reading the output WANTS to see the distinction in the input
>      (remember, "git grep" can take more than one input tree).
> 
> We haven't seen either of the above two in the discussion, so the
> discussion so far is not sufficient to support the change being
> proposed in this RFC, which requires that it is safe to assume that
> (1) is the only case where the input is a raw tree (or the input
> contains a colon) and (2) will never happen.
> 
> So I am still mildly negative on the whole thing.

Avoiding the colon syntax avoids the whole issue for my use case.

I still think git-grep(1)'s output would be more useful if it used valid
git rev:path syntax in all cases.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply

* Re: [RFC] Case insensitive Git attributes
From: Lars Schneider @ 2017-01-24  9:49 UTC (permalink / raw)
  To: Junio C Hamano, git
  Cc: Dakota Hawkins, Duy Nguyen, Johannes Schindelin, Stefan Beller,
	drafnel, bmwill
In-Reply-To: <xmqqwpdlr2ht.fsf@gitster.mtv.corp.google.com>


> On 23 Jan 2017, at 20:38, Junio C Hamano <gitster@pobox.com> wrote:
> 
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> So you are worried about the case where somebody on a case
>> insensitive but case preserving system would do
>> 
>>   $ edit file.txt
>>   $ edit .gitattributes
>>   $ git add file.txt .gitattributes
>> 
>> and adds "*.TXT	someattr=true" to the attributes file, which
>> would set someattr to true on his system for file.txt, but when the
>> result is checked out on a case sensitive system, it would behave
>> differently because "*.TXT" does not match "file.txt"?

Correct!


>> How do other systems address it?  Your Java, Ruby, etc. sources may
>> refer to another file with "import" and the derivation of the file
>> names from class names or package names would have the same issue,
>> isn't it?  Do they have an option that lets you say
>> 
>>   Even though the import statements may say "import a.b.C", we
>>   know that the source tarball was prepared on a case insensitive
>>   system, and I want you to look for a/b/C.java and a/b/c.java and
>>   use what was found.
>> 
>> or something like that?  Same for anything that records other
>> filenames in the content to refer to them, like "Makefile".
>> 
>> My knee jerk reaction to that is that .gitattributes and .gitignore
>> should not be instructed to go case insensitive on case sensitive
>> systems.  If the system is case insensitive but case preserving,
>> it probably would make sense not to do case insensitive matching,
>> which would prevent the issue from happening in the first place.
> 
> Sorry, but there is a slight leap in the above that makes it hard to
> track my thought, so let me clarify a bit.  
> 
> In the above, I am guessing the answer to the "How do other systems
> address it?" question to be "nothing".  And that leads to the
> conclusion that it is better to do "nothing on case sensitive
> systems, and probably become evem more strict on case insensitive
> but case preserving systems", because that will give us a chance to
> expose the problem earlier, hopefully even on the originating
> system.

I agree: Git attributes should behave the same on all platforms independent
of the file system type. I dug a bit deeper and realized that this is actually
already the case. However, the default (?) core.ignorecase=1 config on Win/Mac
generates the behavior explained above. I wonder if 6eba621 ("attr.c: respect 
core.ignorecase when matching attribute patterns", 2011-10-11) was a good idea.

AFAIK disabling core.ignorecase entirely on Win/Mac is no solution as this would
generate other trouble.

Git users can already create case insensitive gitattributes pattern. E.g.:
*.[tT][xX][tT]

However, based on my dayjob experience no Win/Mac developer does that as it
makes the gitattributes file unreadable. Consequently, Linux developers are 
screwed. Therefore, I wonder if it would make sense to introduce a shortcut
for the case insensitive glob pattern. E.g.:

*.txt ignorecase

If Git detects the ignorecase attribute then it could generate *.[tT][xX][tT]
automatically.


^ permalink raw reply

* Re: [PATCH 7/7] completion: recognize more long-options
From: Cornelius Weig @ 2017-01-24  8:14 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: bitte.keine.werbung.einwerfen, git, thomas.braun, john
In-Reply-To: <74ecd09c-55da-3858-5187-52c286a6bf62@kdbg.org>

On 01/24/2017 08:15 AM, Johannes Sixt wrote:
> If at all possible, please use your real email address as the From
> address. It is pointless to hide behind a fake address because as Git
> contributor you will have to reveal your identity anyway.

These are both real addresses, but for send-mail I would not want to use
my work account. I hope this is not a problem.

> Please study item (5) "Sign your work" in
> Documentation/SubmittingPatches and sign off your work.

I followed the recommendations to submitting work, and in the first
round signing is discouraged.

> AFAIR, it was a deliberate decision that potentially destructive command
> line options are not included in command completions. In the list given,
> I find these:
>
>>  - reset: --merge --mixed --hard --soft --patch --keep

My bad, I only added --keep, which should be fine. As to these options

>>  - apply: --unsafe-paths
>>  - rm: --force

let's wait for further comments, but I won't cling to it.

> Additionally, these options are only for internal purposes, but I may be
> wrong:
>
>>  - archive: --remote= --exec=

These may in fact be too exotic and just clutter the command line. Best
they are removed.

-- Cornelius

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox