Git development


* [PATCH v4 1/3] update-unicode.sh: automatically download newer definition files
From: Beat Bolli @ 2016-12-03 21:00 UTC
  To: git; +Cc: Beat Bolli
In-Reply-To: <835c0328-e812-1cb7-c49e-714ff0e9ffb3@drbeat.li>

Checking just for the unicode data files' existence is not sufficient;
we should also download them if a newer version exists on the Unicode
consortium's servers. Option -N of wget does this nicely for us.

Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
Diff to v3:
  - change the Cc: into Reviewed-by: on Torsten's request
  - include the old reroll diffs

Diff to v2:
  - reorder the commits: fix all of update-unicode.sh first, then
    regenerate unicode_width.h only once

Diff to v1:
  - reword the commit message
  - add Torsten's Cc:

 update_unicode.sh | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 27af77c..3c84270 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -10,12 +10,8 @@ if ! test -d unicode; then
 	mkdir unicode
 fi &&
 ( cd unicode &&
-	if ! test -f UnicodeData.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
-	fi &&
-	if ! test -f EastAsianWidth.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
-	fi &&
+	wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \
+		http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt &&
 	if ! test -d uniset; then
 		git clone https://github.com/depp/uniset.git
 	fi &&
-- 
2.7.2


* [PATCH v4 3/3] unicode_width.h: update the tables to Unicode 9.0
From: Beat Bolli @ 2016-12-03 21:00 UTC
  To: git; +Cc: Beat Bolli
In-Reply-To: <1480798849-13907-1-git-send-email-dev+git@drbeat.li>

Rerunning update-unicode.sh that we fixed in the two previous commits
produces these new tables.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
 unicode_width.h | 131 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 107 insertions(+), 24 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 47cdd23..02207be 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -25,7 +25,7 @@ static const struct interval zero_width[] = {
 { 0x0825, 0x0827 },
 { 0x0829, 0x082D },
 { 0x0859, 0x085B },
-{ 0x08E4, 0x0902 },
+{ 0x08D4, 0x0902 },
 { 0x093A, 0x093A },
 { 0x093C, 0x093C },
 { 0x0941, 0x0948 },
@@ -120,6 +120,7 @@ static const struct interval zero_width[] = {
 { 0x17C9, 0x17D3 },
 { 0x17DD, 0x17DD },
 { 0x180B, 0x180E },
+{ 0x1885, 0x1886 },
 { 0x18A9, 0x18A9 },
 { 0x1920, 0x1922 },
 { 0x1927, 0x1928 },
@@ -158,7 +159,7 @@ static const struct interval zero_width[] = {
 { 0x1CF4, 0x1CF4 },
 { 0x1CF8, 0x1CF9 },
 { 0x1DC0, 0x1DF5 },
-{ 0x1DFC, 0x1DFF },
+{ 0x1DFB, 0x1DFF },
 { 0x200B, 0x200F },
 { 0x202A, 0x202E },
 { 0x2060, 0x2064 },
@@ -171,13 +172,13 @@ static const struct interval zero_width[] = {
 { 0x3099, 0x309A },
 { 0xA66F, 0xA672 },
 { 0xA674, 0xA67D },
-{ 0xA69F, 0xA69F },
+{ 0xA69E, 0xA69F },
 { 0xA6F0, 0xA6F1 },
 { 0xA802, 0xA802 },
 { 0xA806, 0xA806 },
 { 0xA80B, 0xA80B },
 { 0xA825, 0xA826 },
-{ 0xA8C4, 0xA8C4 },
+{ 0xA8C4, 0xA8C5 },
 { 0xA8E0, 0xA8F1 },
 { 0xA926, 0xA92D },
 { 0xA947, 0xA951 },
@@ -204,7 +205,7 @@ static const struct interval zero_width[] = {
 { 0xABED, 0xABED },
 { 0xFB1E, 0xFB1E },
 { 0xFE00, 0xFE0F },
-{ 0xFE20, 0xFE2D },
+{ 0xFE20, 0xFE2F },
 { 0xFEFF, 0xFEFF },
 { 0xFFF9, 0xFFFB },
 { 0x101FD, 0x101FD },
@@ -228,16 +229,21 @@ static const struct interval zero_width[] = {
 { 0x11173, 0x11173 },
 { 0x11180, 0x11181 },
 { 0x111B6, 0x111BE },
+{ 0x111CA, 0x111CC },
 { 0x1122F, 0x11231 },
 { 0x11234, 0x11234 },
 { 0x11236, 0x11237 },
+{ 0x1123E, 0x1123E },
 { 0x112DF, 0x112DF },
 { 0x112E3, 0x112EA },
-{ 0x11301, 0x11301 },
+{ 0x11300, 0x11301 },
 { 0x1133C, 0x1133C },
 { 0x11340, 0x11340 },
 { 0x11366, 0x1136C },
 { 0x11370, 0x11374 },
+{ 0x11438, 0x1143F },
+{ 0x11442, 0x11444 },
+{ 0x11446, 0x11446 },
 { 0x114B3, 0x114B8 },
 { 0x114BA, 0x114BA },
 { 0x114BF, 0x114C0 },
@@ -245,6 +251,7 @@ static const struct interval zero_width[] = {
 { 0x115B2, 0x115B5 },
 { 0x115BC, 0x115BD },
 { 0x115BF, 0x115C0 },
+{ 0x115DC, 0x115DD },
 { 0x11633, 0x1163A },
 { 0x1163D, 0x1163D },
 { 0x1163F, 0x11640 },
@@ -252,6 +259,16 @@ static const struct interval zero_width[] = {
 { 0x116AD, 0x116AD },
 { 0x116B0, 0x116B5 },
 { 0x116B7, 0x116B7 },
+{ 0x1171D, 0x1171F },
+{ 0x11722, 0x11725 },
+{ 0x11727, 0x1172B },
+{ 0x11C30, 0x11C36 },
+{ 0x11C38, 0x11C3D },
+{ 0x11C3F, 0x11C3F },
+{ 0x11C92, 0x11CA7 },
+{ 0x11CAA, 0x11CB0 },
+{ 0x11CB2, 0x11CB3 },
+{ 0x11CB5, 0x11CB6 },
 { 0x16AF0, 0x16AF4 },
 { 0x16B30, 0x16B36 },
 { 0x16F8F, 0x16F92 },
@@ -262,31 +279,59 @@ static const struct interval zero_width[] = {
 { 0x1D185, 0x1D18B },
 { 0x1D1AA, 0x1D1AD },
 { 0x1D242, 0x1D244 },
+{ 0x1DA00, 0x1DA36 },
+{ 0x1DA3B, 0x1DA6C },
+{ 0x1DA75, 0x1DA75 },
+{ 0x1DA84, 0x1DA84 },
+{ 0x1DA9B, 0x1DA9F },
+{ 0x1DAA1, 0x1DAAF },
+{ 0x1E000, 0x1E006 },
+{ 0x1E008, 0x1E018 },
+{ 0x1E01B, 0x1E021 },
+{ 0x1E023, 0x1E024 },
+{ 0x1E026, 0x1E02A },
 { 0x1E8D0, 0x1E8D6 },
+{ 0x1E944, 0x1E94A },
 { 0xE0001, 0xE0001 },
 { 0xE0020, 0xE007F },
 { 0xE0100, 0xE01EF }
 };
 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x1C },
-{ /* plane */ 0x1C, 0x21 },
-{ /* plane */ 0x21, 0x22 },
-{ /* plane */ 0x22, 0x23 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
+{ 0x231A, 0x231B },
 { 0x2329, 0x232A },
+{ 0x23E9, 0x23EC },
+{ 0x23F0, 0x23F0 },
+{ 0x23F3, 0x23F3 },
+{ 0x25FD, 0x25FE },
+{ 0x2614, 0x2615 },
+{ 0x2648, 0x2653 },
+{ 0x267F, 0x267F },
+{ 0x2693, 0x2693 },
+{ 0x26A1, 0x26A1 },
+{ 0x26AA, 0x26AB },
+{ 0x26BD, 0x26BE },
+{ 0x26C4, 0x26C5 },
+{ 0x26CE, 0x26CE },
+{ 0x26D4, 0x26D4 },
+{ 0x26EA, 0x26EA },
+{ 0x26F2, 0x26F3 },
+{ 0x26F5, 0x26F5 },
+{ 0x26FA, 0x26FA },
+{ 0x26FD, 0x26FD },
+{ 0x2705, 0x2705 },
+{ 0x270A, 0x270B },
+{ 0x2728, 0x2728 },
+{ 0x274C, 0x274C },
+{ 0x274E, 0x274E },
+{ 0x2753, 0x2755 },
+{ 0x2757, 0x2757 },
+{ 0x2795, 0x2797 },
+{ 0x27B0, 0x27B0 },
+{ 0x27BF, 0x27BF },
+{ 0x2B1B, 0x2B1C },
+{ 0x2B50, 0x2B50 },
+{ 0x2B55, 0x2B55 },
 { 0x2E80, 0x2E99 },
 { 0x2E9B, 0x2EF3 },
 { 0x2F00, 0x2FD5 },
@@ -313,11 +358,49 @@ static const struct interval double_width[] = {
 { 0xFE68, 0xFE6B },
 { 0xFF01, 0xFF60 },
 { 0xFFE0, 0xFFE6 },
+{ 0x16FE0, 0x16FE0 },
+{ 0x17000, 0x187EC },
+{ 0x18800, 0x18AF2 },
 { 0x1B000, 0x1B001 },
+{ 0x1F004, 0x1F004 },
+{ 0x1F0CF, 0x1F0CF },
+{ 0x1F18E, 0x1F18E },
+{ 0x1F191, 0x1F19A },
 { 0x1F200, 0x1F202 },
-{ 0x1F210, 0x1F23A },
+{ 0x1F210, 0x1F23B },
 { 0x1F240, 0x1F248 },
 { 0x1F250, 0x1F251 },
+{ 0x1F300, 0x1F320 },
+{ 0x1F32D, 0x1F335 },
+{ 0x1F337, 0x1F37C },
+{ 0x1F37E, 0x1F393 },
+{ 0x1F3A0, 0x1F3CA },
+{ 0x1F3CF, 0x1F3D3 },
+{ 0x1F3E0, 0x1F3F0 },
+{ 0x1F3F4, 0x1F3F4 },
+{ 0x1F3F8, 0x1F43E },
+{ 0x1F440, 0x1F440 },
+{ 0x1F442, 0x1F4FC },
+{ 0x1F4FF, 0x1F53D },
+{ 0x1F54B, 0x1F54E },
+{ 0x1F550, 0x1F567 },
+{ 0x1F57A, 0x1F57A },
+{ 0x1F595, 0x1F596 },
+{ 0x1F5A4, 0x1F5A4 },
+{ 0x1F5FB, 0x1F64F },
+{ 0x1F680, 0x1F6C5 },
+{ 0x1F6CC, 0x1F6CC },
+{ 0x1F6D0, 0x1F6D2 },
+{ 0x1F6EB, 0x1F6EC },
+{ 0x1F6F4, 0x1F6F6 },
+{ 0x1F910, 0x1F91E },
+{ 0x1F920, 0x1F927 },
+{ 0x1F930, 0x1F930 },
+{ 0x1F933, 0x1F93E },
+{ 0x1F940, 0x1F94B },
+{ 0x1F950, 0x1F95E },
+{ 0x1F980, 0x1F991 },
+{ 0x1F9C0, 0x1F9C0 },
 { 0x20000, 0x2FFFD },
 { 0x30000, 0x3FFFD }
 };
-- 
2.7.2


* [PATCH v4 2/3] update-unicode.sh: strip the plane offsets from the double_width[] table
From: Beat Bolli @ 2016-12-03 21:00 UTC
  To: git; +Cc: Beat Bolli
In-Reply-To: <1480798849-13907-1-git-send-email-dev+git@drbeat.li>

The function bisearch() in utf8.c does a pure binary search in
double_width. It does not care about the 17 plane offsets which
unicode/uniset/uniset prepends. Leaving the plane offsets in the table
may cause wrong results.

Filter out the plane offsets in update-unicode.sh.

Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
 update_unicode.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 3c84270..4c1ec8d 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -30,7 +30,7 @@ fi &&
 		  grep -v plane)
 	};
 	static const struct interval double_width[] = {
-		$(uniset/uniset --32 eaw:F,W)
+		$(uniset/uniset --32 eaw:F,W | grep -v plane)
 	};
 	EOF
 )
-- 
2.7.2
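
Editorial illustration, not part of the patch: a minimal shell sketch of
what the added "grep -v plane" filter does to the generated table. The
sample rows are copied from the unicode_width.h diff in this series. The
function names sample_uniset_output and strip_plane_offsets are made up
for this example, and the real uniset output is only assumed to have this
row shape.

```shell
#!/bin/sh
# Stand-in for "uniset/uniset --32 eaw:F,W": two plane-offset rows followed
# by two real intervals. (Hypothetical helper; the rows come from the diff.)
sample_uniset_output () {
	cat <<\EOF
{ /* plane */ 0x0, 0x1C },
{ /* plane */ 0x1C, 0x21 },
{ 0x1100, 0x115F },
{ 0x2329, 0x232A },
EOF
}

# Same filter as in the patch: drop every row that mentions "plane".
strip_plane_offsets () {
	grep -v plane
}

# Only the real interval rows should remain for double_width[].
sample_uniset_output | strip_plane_offsets
```

Without the filter, the plane rows would sit in double_width[] as if they
were code point ranges, which is what the commit message warns about for
the binary search in bisearch().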


* [PATCH] docs: warn about possible '=' in clean/smudge filter process values
From: larsxschneider @ 2016-12-03 19:45 UTC
  To: git; +Cc: Lars Schneider

From: Lars Schneider <larsxschneider@gmail.com>

A pathname value in a clean/smudge filter process "key=value" pair can
contain the '=' character (introduced in edcc858). Make the user aware
of this issue in the docs, add a corresponding test case, and fix the
issue in filter process value parser of the example implementation in
contrib.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
---
 Documentation/gitattributes.txt        |  4 +++-
 contrib/long-running-filter/example.pl |  8 ++++++--
 t/t0021-conversion.sh                  | 20 ++++++++++----------
 t/t0021/rot13-filter.pl                |  8 ++++++--
 4 files changed, 25 insertions(+), 15 deletions(-)

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 976243a63e..e0b66c1220 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -435,7 +435,9 @@ to filter relative to the repository root. Right after the flush packet
 Git sends the content split in zero or more pkt-line packets and a
 flush packet to terminate content. Please note, that the filter
 must not send any response before it received the content and the
-final flush packet.
+final flush packet. Also note that the "value" of a "key=value" pair
+can contain the "=" character whereas the key would never contain
+that character.
 ------------------------
 packet:          git> command=smudge
 packet:          git> pathname=path/testfile.dat
diff --git a/contrib/long-running-filter/example.pl b/contrib/long-running-filter/example.pl
index 39457055a5..a677569ddd 100755
--- a/contrib/long-running-filter/example.pl
+++ b/contrib/long-running-filter/example.pl
@@ -81,8 +81,12 @@ packet_txt_write("capability=smudge");
 packet_flush();

 while (1) {
-	my ($command)  = packet_txt_read() =~ /^command=([^=]+)$/;
-	my ($pathname) = packet_txt_read() =~ /^pathname=([^=]+)$/;
+	my ($command)  = packet_txt_read() =~ /^command=(.+)$/;
+	my ($pathname) = packet_txt_read() =~ /^pathname=(.+)$/;
+
+	if ( $pathname eq "" ) {
+		die "bad pathname '$pathname'";
+	}

 	packet_bin_read();

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 4ea534e9fa..f3a0df2add 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -93,7 +93,7 @@ test_expect_success setup '
 	git checkout -- test test.t test.i &&

 	echo "content-test2" >test2.o &&
-	echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x.o"
+	echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x=.o"
 '

 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -359,12 +359,12 @@ test_expect_success PERL 'required process filter should filter data' '
 		cp "$TEST_ROOT/test.o" test.r &&
 		cp "$TEST_ROOT/test2.o" test2.r &&
 		mkdir testsubdir &&
-		cp "$TEST_ROOT/test3 '\''sq'\'',\$x.o" "testsubdir/test3 '\''sq'\'',\$x.r" &&
+		cp "$TEST_ROOT/test3 '\''sq'\'',\$x=.o" "testsubdir/test3 '\''sq'\'',\$x=.r" &&
 		>test4-empty.r &&

 		S=$(file_size test.r) &&
 		S2=$(file_size test2.r) &&
-		S3=$(file_size "testsubdir/test3 '\''sq'\'',\$x.r") &&
+		S3=$(file_size "testsubdir/test3 '\''sq'\'',\$x=.r") &&

 		filter_git add . &&
 		cat >expected.log <<-EOF &&
@@ -373,7 +373,7 @@ test_expect_success PERL 'required process filter should filter data' '
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0  [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
 		test_cmp_count expected.log rot13-filter.log &&
@@ -385,23 +385,23 @@ test_expect_success PERL 'required process filter should filter data' '
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0  [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			IN: clean test.r $S [OK] -- OUT: $S . [OK]
 			IN: clean test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: clean test4-empty.r 0 [OK] -- OUT: 0  [OK]
-			IN: clean testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: clean testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
 		test_cmp_count expected.log rot13-filter.log &&

-		rm -f test2.r "testsubdir/test3 '\''sq'\'',\$x.r" &&
+		rm -f test2.r "testsubdir/test3 '\''sq'\'',\$x=.r" &&

 		filter_git checkout --quiet --no-progress . &&
 		cat >expected.log <<-EOF &&
 			START
 			init handshake complete
 			IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
-			IN: smudge testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
 		test_cmp_exclude_clean expected.log rot13-filter.log &&
@@ -422,14 +422,14 @@ test_expect_success PERL 'required process filter should filter data' '
 			IN: smudge test.r $S [OK] -- OUT: $S . [OK]
 			IN: smudge test2.r $S2 [OK] -- OUT: $S2 . [OK]
 			IN: smudge test4-empty.r 0 [OK] -- OUT: 0  [OK]
-			IN: smudge testsubdir/test3 '\''sq'\'',\$x.r $S3 [OK] -- OUT: $S3 . [OK]
+			IN: smudge testsubdir/test3 '\''sq'\'',\$x=.r $S3 [OK] -- OUT: $S3 . [OK]
 			STOP
 		EOF
 		test_cmp_exclude_clean expected.log rot13-filter.log &&

 		test_cmp_committed_rot13 "$TEST_ROOT/test.o" test.r &&
 		test_cmp_committed_rot13 "$TEST_ROOT/test2.o" test2.r &&
-		test_cmp_committed_rot13 "$TEST_ROOT/test3 '\''sq'\'',\$x.o" "testsubdir/test3 '\''sq'\'',\$x.r"
+		test_cmp_committed_rot13 "$TEST_ROOT/test3 '\''sq'\'',\$x=.o" "testsubdir/test3 '\''sq'\'',\$x=.r"
 	)
 '

diff --git a/t/t0021/rot13-filter.pl b/t/t0021/rot13-filter.pl
index 4d5697ee51..617f581e56 100644
--- a/t/t0021/rot13-filter.pl
+++ b/t/t0021/rot13-filter.pl
@@ -109,14 +109,18 @@ print $debug "init handshake complete\n";
 $debug->flush();

 while (1) {
-	my ($command) = packet_txt_read() =~ /^command=([^=]+)$/;
+	my ($command) = packet_txt_read() =~ /^command=(.+)$/;
 	print $debug "IN: $command";
 	$debug->flush();

-	my ($pathname) = packet_txt_read() =~ /^pathname=([^=]+)$/;
+	my ($pathname) = packet_txt_read() =~ /^pathname=(.+)$/;
 	print $debug " $pathname";
 	$debug->flush();

+	if ( $pathname eq "" ) {
+		die "bad pathname '$pathname'";
+	}
+
 	# Flush
 	packet_bin_read();

--
2.11.0
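
Editorial illustration, not part of the patch: a minimal shell sketch of
the parsing rule the documentation change describes, namely that a
"key=value" line is split at the first "=" only, so the value may contain
further "=" characters. The function names pair_key and pair_value are
hypothetical, and the sample pathname is simplified from the test file
name used in t0021.

```shell
#!/bin/sh
# Key: everything before the FIRST '=' (strip the longest "=*" suffix).
pair_key () {
	printf '%s\n' "${1%%=*}"
}

# Value: everything after the FIRST '=' (strip the shortest "*=" prefix),
# so any later '=' stays part of the value.
pair_value () {
	printf '%s\n' "${1#*=}"
}

line='pathname=testsubdir/test3 x=.r'
echo "key:   $(pair_key "$line")"
echo "value: $(pair_value "$line")"
```

This mirrors the change from ([^=]+) to (.+) in the Perl filters: the key
is fixed by the protocol, and everything after the first "=" is taken as
the value.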



* Re: [RFC/PATCH v3 00/16] Add initial experimental external ODB support
From: Lars Schneider @ 2016-12-03 18:47 UTC
  To: Christian Couder
  Cc: git, Junio C Hamano, Jeff King, Nguyen Thai Ngoc Duy, Mike Hommey,
	Eric Wong, Christian Couder
In-Reply-To: <20161130210420.15982-1-chriscool@tuxfamily.org>


> On 30 Nov 2016, at 22:04, Christian Couder <christian.couder@gmail.com> wrote:
> 
> Goal
> ~~~~
> 
> Git can store its objects only in the form of loose objects in
> separate files or packed objects in a pack file.
> 
> To be able to better handle some kind of objects, for example big
> blobs, it would be nice if Git could store its objects in other object
> databases (ODB).

This is a great goal. I really hope we can use that to solve the
pain points in the current Git <--> GitLFS integration!
Thanks for working on this!

Minor nit: I feel the term "other" could be more expressive. Plus
"database" might confuse people. What do you think about
"External Object Storage" or something?


> Design
> ~~~~~~
> 
>  - "<command> have": the command should output the sha1, size and
> type of all the objects the external ODB contains, one object per
> line.

This looks impractical. If a repo has 10k external files with
100 versions each then you need to read/transfer 1m hashes (this is
not made up - I am working with Git repos that contain >>10k files
in GitLFS).

Wouldn't it be better if Git collects all hashes that it currently 
needs and then asks the external ODBs if they have them?


>  - "<command> get <sha1>": the command should then read from the
> external ODB the content of the object corresponding to <sha1> and
> output it on stdout.
> 
>  - "<command> put <sha1> <size> <type>": the command should then read
> from stdin an object and store it in the external ODB.

Based on my experience with Git clean/smudge filters I think this kind 
of single shot protocol will be a performance bottleneck as soon as 
people store more than 1000 files in the external ODB.
Maybe you can reuse my "filter process protocol" (edcc858) here?


> * Transfer
> 
> To transfer information about the blobs stored in an external ODB, some
> special refs, called "odb refs", similar to replace refs, are used.
> 
> For now there should be one odb ref per blob. Each ref name should be
> refs/odbs/<odbname>/<sha1> where <sha1> is the sha1 of the blob stored
> in the external odb named <odbname>.
> 
> These odb refs should all point to a blob that should be stored in the
> Git repository and contain information about the blob stored in the
> external odb. This information can be specific to the external odb.
> The repos can then share this information using commands like:
> 
> `git fetch origin "refs/odbs/<odbname>/*:refs/odbs/<odbname>/*"`

The "odbref" would point to a blob and the blob could contain anything,
right? E.g. it could contain an existing GitLFS pointer, right?

version https://git-lfs.github.com/spec/v1
oid sha256:4d7a214614ab2935c943f9e0ff69d22eadbb8f32b1258daaa5e2ca24d17e2393
size 12345


> Design discussion about performance
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Yeah, it is not efficient to fork/exec a command to just read or write
> one object to or from the external ODB. Batch calls and/or using a
> daemon and/or RPC should be used instead to be able to store regular
> objects in an external ODB. But for now the external ODB would be all
> about really big files, where the cost of a fork+exec should not
> matter much. If we later want to extend usage of external ODBs, yeah
> we will probably need to design other mechanisms.

I think we should leverage the learnings from GitLFS as much as possible.
My learnings are:

(1) Fork/exec per object won't work. People have lots and lots of content
    that is not suited for Git (e.g. integration test data, images, ...).

(2) We need a good UI. I think it would be great if the average user would 
    not even need to know about ODB. Moving files explicitly with a "put"
    command seems impractical to me. GitLFS tracks files via filename and
    that has a number of drawbacks, too. Do you see a way to define a 
    customizable metric such as "move all files to ODB X that are gzip 
    compressed larger than Y"?


> Future work
> ~~~~~~~~~~~
> 
> I think that the odb refs don't prevent a regular fetch or push from
> wanting to send the objects that are managed by an external odb. So I
> am interested in suggestions about this problem. I will take a look at
> previous discussions and how other mechanisms (shallow clone, bundle
> v3, ...) handle this.

If the ODB configuration is stored in the Git repo similar to
.gitmodules then every client that clones ODB references would be able
to resolve them, right?

Cheers,
Lars



* Re: [PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
From: Beat Bolli @ 2016-12-03 16:41 UTC
  To: Torsten Bögershausen; +Cc: git
In-Reply-To: <20161203164049.GA31244@tb-raspi>

On 03.12.16 17:40, Torsten Bögershausen wrote:
> On Sat, Dec 03, 2016 at 02:19:31PM +0100, Beat Bolli wrote:
>> Checking just for the unicode data files' existence is not sufficient;
>> we should also download them if a newer version exists on the Unicode
>> consortium's servers. Option -N of wget does this nicely for us.
>>
>> Cc: Torsten Bögershausen <tboegi@web.de>
> 
> The V3 series makes perfect sense, thanks for cleaning up my mess.
Yeah, it took me three tries, too :-)

> (And can we remove the Cc: line, or replace it with Reviewed-by?)
If you prefer, sure.

Do you have any other comments?

Beat


* Re: [PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
From: Torsten Bögershausen @ 2016-12-03 16:40 UTC
  To: Beat Bolli; +Cc: git
In-Reply-To: <1480771173-731-1-git-send-email-dev+git@drbeat.li>

On Sat, Dec 03, 2016 at 02:19:31PM +0100, Beat Bolli wrote:
> Checking just for the unicode data files' existence is not sufficient;
> we should also download them if a newer version exists on the Unicode
> consortium's servers. Option -N of wget does this nicely for us.
> 
> Cc: Torsten Bögershausen <tboegi@web.de>

The V3 series makes perfect sense, thanks for cleaning up my mess.
(And can we remove the Cc: line, or replace it with Reviewed-by?)


* Re: [PATCH] commit: make --only --allow-empty work without paths
From: Jeff King @ 2016-12-03 16:23 UTC
  To: Andreas Krey; +Cc: git, Junio C Hamano
In-Reply-To: <20161203065949.GG19570@inner.h.apk.li>

On Sat, Dec 03, 2016 at 07:59:49AM +0100, Andreas Krey wrote:

> > OK. I'm not sure why you would want to create an empty commit in such a
> > case.
> 
> User: Ok tool, make me a pullreq.
> 
> Tool: But you haven't mentioned any issue
>       in your commit messages. Which are they?
> 
> User: Ok, that would be A-123.
> 
> Tool: git commit --allow-empty -m 'FIX: A-123'

OK. I think "tool" is slightly funny here, but I get that this is part of
how the real world works. Thanks for illustrating.

> > Yes, I think --run is a misfeature (I actually had to look it up, as I
> ...
> > implicit. If a single test script is annoyingly long to run, I'd argue
> 
> It wasn't about runtime but about output. I would have
> liked to see only the output of my still-failing test;
> a 'stop after test X' would be helpful there.

You can do --verbose-only=<n>, but if the test is failing, I typically
use "-v -i". That makes everything verbose, and then stops at the
failing test, so you can see the output easily.

-Peff


* [PATCH v3 3/3] unicode_width.h: update the tables to Unicode 9.0
From: Beat Bolli @ 2016-12-03 13:19 UTC
  To: git; +Cc: Beat Bolli
In-Reply-To: <1480771173-731-1-git-send-email-dev+git@drbeat.li>

Rerunning update-unicode.sh that we fixed in the two previous commits
produces these new tables.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
 unicode_width.h | 131 +++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 107 insertions(+), 24 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 47cdd23..02207be 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -25,7 +25,7 @@ static const struct interval zero_width[] = {
 { 0x0825, 0x0827 },
 { 0x0829, 0x082D },
 { 0x0859, 0x085B },
-{ 0x08E4, 0x0902 },
+{ 0x08D4, 0x0902 },
 { 0x093A, 0x093A },
 { 0x093C, 0x093C },
 { 0x0941, 0x0948 },
@@ -120,6 +120,7 @@ static const struct interval zero_width[] = {
 { 0x17C9, 0x17D3 },
 { 0x17DD, 0x17DD },
 { 0x180B, 0x180E },
+{ 0x1885, 0x1886 },
 { 0x18A9, 0x18A9 },
 { 0x1920, 0x1922 },
 { 0x1927, 0x1928 },
@@ -158,7 +159,7 @@ static const struct interval zero_width[] = {
 { 0x1CF4, 0x1CF4 },
 { 0x1CF8, 0x1CF9 },
 { 0x1DC0, 0x1DF5 },
-{ 0x1DFC, 0x1DFF },
+{ 0x1DFB, 0x1DFF },
 { 0x200B, 0x200F },
 { 0x202A, 0x202E },
 { 0x2060, 0x2064 },
@@ -171,13 +172,13 @@ static const struct interval zero_width[] = {
 { 0x3099, 0x309A },
 { 0xA66F, 0xA672 },
 { 0xA674, 0xA67D },
-{ 0xA69F, 0xA69F },
+{ 0xA69E, 0xA69F },
 { 0xA6F0, 0xA6F1 },
 { 0xA802, 0xA802 },
 { 0xA806, 0xA806 },
 { 0xA80B, 0xA80B },
 { 0xA825, 0xA826 },
-{ 0xA8C4, 0xA8C4 },
+{ 0xA8C4, 0xA8C5 },
 { 0xA8E0, 0xA8F1 },
 { 0xA926, 0xA92D },
 { 0xA947, 0xA951 },
@@ -204,7 +205,7 @@ static const struct interval zero_width[] = {
 { 0xABED, 0xABED },
 { 0xFB1E, 0xFB1E },
 { 0xFE00, 0xFE0F },
-{ 0xFE20, 0xFE2D },
+{ 0xFE20, 0xFE2F },
 { 0xFEFF, 0xFEFF },
 { 0xFFF9, 0xFFFB },
 { 0x101FD, 0x101FD },
@@ -228,16 +229,21 @@ static const struct interval zero_width[] = {
 { 0x11173, 0x11173 },
 { 0x11180, 0x11181 },
 { 0x111B6, 0x111BE },
+{ 0x111CA, 0x111CC },
 { 0x1122F, 0x11231 },
 { 0x11234, 0x11234 },
 { 0x11236, 0x11237 },
+{ 0x1123E, 0x1123E },
 { 0x112DF, 0x112DF },
 { 0x112E3, 0x112EA },
-{ 0x11301, 0x11301 },
+{ 0x11300, 0x11301 },
 { 0x1133C, 0x1133C },
 { 0x11340, 0x11340 },
 { 0x11366, 0x1136C },
 { 0x11370, 0x11374 },
+{ 0x11438, 0x1143F },
+{ 0x11442, 0x11444 },
+{ 0x11446, 0x11446 },
 { 0x114B3, 0x114B8 },
 { 0x114BA, 0x114BA },
 { 0x114BF, 0x114C0 },
@@ -245,6 +251,7 @@ static const struct interval zero_width[] = {
 { 0x115B2, 0x115B5 },
 { 0x115BC, 0x115BD },
 { 0x115BF, 0x115C0 },
+{ 0x115DC, 0x115DD },
 { 0x11633, 0x1163A },
 { 0x1163D, 0x1163D },
 { 0x1163F, 0x11640 },
@@ -252,6 +259,16 @@ static const struct interval zero_width[] = {
 { 0x116AD, 0x116AD },
 { 0x116B0, 0x116B5 },
 { 0x116B7, 0x116B7 },
+{ 0x1171D, 0x1171F },
+{ 0x11722, 0x11725 },
+{ 0x11727, 0x1172B },
+{ 0x11C30, 0x11C36 },
+{ 0x11C38, 0x11C3D },
+{ 0x11C3F, 0x11C3F },
+{ 0x11C92, 0x11CA7 },
+{ 0x11CAA, 0x11CB0 },
+{ 0x11CB2, 0x11CB3 },
+{ 0x11CB5, 0x11CB6 },
 { 0x16AF0, 0x16AF4 },
 { 0x16B30, 0x16B36 },
 { 0x16F8F, 0x16F92 },
@@ -262,31 +279,59 @@ static const struct interval zero_width[] = {
 { 0x1D185, 0x1D18B },
 { 0x1D1AA, 0x1D1AD },
 { 0x1D242, 0x1D244 },
+{ 0x1DA00, 0x1DA36 },
+{ 0x1DA3B, 0x1DA6C },
+{ 0x1DA75, 0x1DA75 },
+{ 0x1DA84, 0x1DA84 },
+{ 0x1DA9B, 0x1DA9F },
+{ 0x1DAA1, 0x1DAAF },
+{ 0x1E000, 0x1E006 },
+{ 0x1E008, 0x1E018 },
+{ 0x1E01B, 0x1E021 },
+{ 0x1E023, 0x1E024 },
+{ 0x1E026, 0x1E02A },
 { 0x1E8D0, 0x1E8D6 },
+{ 0x1E944, 0x1E94A },
 { 0xE0001, 0xE0001 },
 { 0xE0020, 0xE007F },
 { 0xE0100, 0xE01EF }
 };
 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x1C },
-{ /* plane */ 0x1C, 0x21 },
-{ /* plane */ 0x21, 0x22 },
-{ /* plane */ 0x22, 0x23 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
+{ 0x231A, 0x231B },
 { 0x2329, 0x232A },
+{ 0x23E9, 0x23EC },
+{ 0x23F0, 0x23F0 },
+{ 0x23F3, 0x23F3 },
+{ 0x25FD, 0x25FE },
+{ 0x2614, 0x2615 },
+{ 0x2648, 0x2653 },
+{ 0x267F, 0x267F },
+{ 0x2693, 0x2693 },
+{ 0x26A1, 0x26A1 },
+{ 0x26AA, 0x26AB },
+{ 0x26BD, 0x26BE },
+{ 0x26C4, 0x26C5 },
+{ 0x26CE, 0x26CE },
+{ 0x26D4, 0x26D4 },
+{ 0x26EA, 0x26EA },
+{ 0x26F2, 0x26F3 },
+{ 0x26F5, 0x26F5 },
+{ 0x26FA, 0x26FA },
+{ 0x26FD, 0x26FD },
+{ 0x2705, 0x2705 },
+{ 0x270A, 0x270B },
+{ 0x2728, 0x2728 },
+{ 0x274C, 0x274C },
+{ 0x274E, 0x274E },
+{ 0x2753, 0x2755 },
+{ 0x2757, 0x2757 },
+{ 0x2795, 0x2797 },
+{ 0x27B0, 0x27B0 },
+{ 0x27BF, 0x27BF },
+{ 0x2B1B, 0x2B1C },
+{ 0x2B50, 0x2B50 },
+{ 0x2B55, 0x2B55 },
 { 0x2E80, 0x2E99 },
 { 0x2E9B, 0x2EF3 },
 { 0x2F00, 0x2FD5 },
@@ -313,11 +358,49 @@ static const struct interval double_width[] = {
 { 0xFE68, 0xFE6B },
 { 0xFF01, 0xFF60 },
 { 0xFFE0, 0xFFE6 },
+{ 0x16FE0, 0x16FE0 },
+{ 0x17000, 0x187EC },
+{ 0x18800, 0x18AF2 },
 { 0x1B000, 0x1B001 },
+{ 0x1F004, 0x1F004 },
+{ 0x1F0CF, 0x1F0CF },
+{ 0x1F18E, 0x1F18E },
+{ 0x1F191, 0x1F19A },
 { 0x1F200, 0x1F202 },
-{ 0x1F210, 0x1F23A },
+{ 0x1F210, 0x1F23B },
 { 0x1F240, 0x1F248 },
 { 0x1F250, 0x1F251 },
+{ 0x1F300, 0x1F320 },
+{ 0x1F32D, 0x1F335 },
+{ 0x1F337, 0x1F37C },
+{ 0x1F37E, 0x1F393 },
+{ 0x1F3A0, 0x1F3CA },
+{ 0x1F3CF, 0x1F3D3 },
+{ 0x1F3E0, 0x1F3F0 },
+{ 0x1F3F4, 0x1F3F4 },
+{ 0x1F3F8, 0x1F43E },
+{ 0x1F440, 0x1F440 },
+{ 0x1F442, 0x1F4FC },
+{ 0x1F4FF, 0x1F53D },
+{ 0x1F54B, 0x1F54E },
+{ 0x1F550, 0x1F567 },
+{ 0x1F57A, 0x1F57A },
+{ 0x1F595, 0x1F596 },
+{ 0x1F5A4, 0x1F5A4 },
+{ 0x1F5FB, 0x1F64F },
+{ 0x1F680, 0x1F6C5 },
+{ 0x1F6CC, 0x1F6CC },
+{ 0x1F6D0, 0x1F6D2 },
+{ 0x1F6EB, 0x1F6EC },
+{ 0x1F6F4, 0x1F6F6 },
+{ 0x1F910, 0x1F91E },
+{ 0x1F920, 0x1F927 },
+{ 0x1F930, 0x1F930 },
+{ 0x1F933, 0x1F93E },
+{ 0x1F940, 0x1F94B },
+{ 0x1F950, 0x1F95E },
+{ 0x1F980, 0x1F991 },
+{ 0x1F9C0, 0x1F9C0 },
 { 0x20000, 0x2FFFD },
 { 0x30000, 0x3FFFD }
 };
-- 
2.7.2


* [PATCH v3 1/3] update-unicode.sh: automatically download newer definition files
From: Beat Bolli @ 2016-12-03 13:19 UTC
  To: git; +Cc: Beat Bolli, Torsten Bögershausen
In-Reply-To: <1480762392-28731-3-git-send-email-dev+git@drbeat.li>

Checking just for the unicode data files' existence is not sufficient;
we should also download them if a newer version exists on the Unicode
consortium's servers. Option -N of wget does this nicely for us.

Cc: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
Diff to v2:
  - reorder the commits: fix all of update-unicode.sh first, then
    regenerate unicode_width.h only once

 update_unicode.sh | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 27af77c..3c84270 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -10,12 +10,8 @@ if ! test -d unicode; then
 	mkdir unicode
 fi &&
 ( cd unicode &&
-	if ! test -f UnicodeData.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
-	fi &&
-	if ! test -f EastAsianWidth.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
-	fi &&
+	wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \
+		http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt &&
 	if ! test -d uniset; then
 		git clone https://github.com/depp/uniset.git
 	fi &&
-- 
2.7.2


* [PATCH v3 2/3] update-unicode.sh: strip the plane offsets from the double_width[] table
From: Beat Bolli @ 2016-12-03 13:19 UTC
  To: git; +Cc: Beat Bolli, Torsten Bögershausen
In-Reply-To: <1480771173-731-1-git-send-email-dev+git@drbeat.li>

The function bisearch() in utf8.c does a pure binary search in
double_width. It does not care about the 17 plane offsets which
unicode/uniset/uniset prepends. Leaving the plane offsets in the table
may cause wrong results.

Filter out the plane offsets in update-unicode.sh.

Cc: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
 update_unicode.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 3c84270..4c1ec8d 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -30,7 +30,7 @@ fi &&
 		  grep -v plane)
 	};
 	static const struct interval double_width[] = {
-		$(uniset/uniset --32 eaw:F,W)
+		$(uniset/uniset --32 eaw:F,W | grep -v plane)
 	};
 	EOF
 )
-- 
2.7.2

^ permalink raw reply related

* [PATCH v2 2/3] unicode_width.h: update the tables to Unicode 9.0
From: Beat Bolli @ 2016-12-03 10:53 UTC (permalink / raw)
  To: git; +Cc: Beat Bolli
In-Reply-To: <1480762392-28731-1-git-send-email-dev+git@drbeat.li>

Rerunning update-unicode.sh, which was fixed in the previous commit,
produces these new tables.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
Diff to v1:
  - reword the commit message

 unicode_width.h | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 111 insertions(+), 11 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 47cdd23..73b5fd6 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -25,7 +25,7 @@ static const struct interval zero_width[] = {
 { 0x0825, 0x0827 },
 { 0x0829, 0x082D },
 { 0x0859, 0x085B },
-{ 0x08E4, 0x0902 },
+{ 0x08D4, 0x0902 },
 { 0x093A, 0x093A },
 { 0x093C, 0x093C },
 { 0x0941, 0x0948 },
@@ -120,6 +120,7 @@ static const struct interval zero_width[] = {
 { 0x17C9, 0x17D3 },
 { 0x17DD, 0x17DD },
 { 0x180B, 0x180E },
+{ 0x1885, 0x1886 },
 { 0x18A9, 0x18A9 },
 { 0x1920, 0x1922 },
 { 0x1927, 0x1928 },
@@ -158,7 +159,7 @@ static const struct interval zero_width[] = {
 { 0x1CF4, 0x1CF4 },
 { 0x1CF8, 0x1CF9 },
 { 0x1DC0, 0x1DF5 },
-{ 0x1DFC, 0x1DFF },
+{ 0x1DFB, 0x1DFF },
 { 0x200B, 0x200F },
 { 0x202A, 0x202E },
 { 0x2060, 0x2064 },
@@ -171,13 +172,13 @@ static const struct interval zero_width[] = {
 { 0x3099, 0x309A },
 { 0xA66F, 0xA672 },
 { 0xA674, 0xA67D },
-{ 0xA69F, 0xA69F },
+{ 0xA69E, 0xA69F },
 { 0xA6F0, 0xA6F1 },
 { 0xA802, 0xA802 },
 { 0xA806, 0xA806 },
 { 0xA80B, 0xA80B },
 { 0xA825, 0xA826 },
-{ 0xA8C4, 0xA8C4 },
+{ 0xA8C4, 0xA8C5 },
 { 0xA8E0, 0xA8F1 },
 { 0xA926, 0xA92D },
 { 0xA947, 0xA951 },
@@ -204,7 +205,7 @@ static const struct interval zero_width[] = {
 { 0xABED, 0xABED },
 { 0xFB1E, 0xFB1E },
 { 0xFE00, 0xFE0F },
-{ 0xFE20, 0xFE2D },
+{ 0xFE20, 0xFE2F },
 { 0xFEFF, 0xFEFF },
 { 0xFFF9, 0xFFFB },
 { 0x101FD, 0x101FD },
@@ -228,16 +229,21 @@ static const struct interval zero_width[] = {
 { 0x11173, 0x11173 },
 { 0x11180, 0x11181 },
 { 0x111B6, 0x111BE },
+{ 0x111CA, 0x111CC },
 { 0x1122F, 0x11231 },
 { 0x11234, 0x11234 },
 { 0x11236, 0x11237 },
+{ 0x1123E, 0x1123E },
 { 0x112DF, 0x112DF },
 { 0x112E3, 0x112EA },
-{ 0x11301, 0x11301 },
+{ 0x11300, 0x11301 },
 { 0x1133C, 0x1133C },
 { 0x11340, 0x11340 },
 { 0x11366, 0x1136C },
 { 0x11370, 0x11374 },
+{ 0x11438, 0x1143F },
+{ 0x11442, 0x11444 },
+{ 0x11446, 0x11446 },
 { 0x114B3, 0x114B8 },
 { 0x114BA, 0x114BA },
 { 0x114BF, 0x114C0 },
@@ -245,6 +251,7 @@ static const struct interval zero_width[] = {
 { 0x115B2, 0x115B5 },
 { 0x115BC, 0x115BD },
 { 0x115BF, 0x115C0 },
+{ 0x115DC, 0x115DD },
 { 0x11633, 0x1163A },
 { 0x1163D, 0x1163D },
 { 0x1163F, 0x11640 },
@@ -252,6 +259,16 @@ static const struct interval zero_width[] = {
 { 0x116AD, 0x116AD },
 { 0x116B0, 0x116B5 },
 { 0x116B7, 0x116B7 },
+{ 0x1171D, 0x1171F },
+{ 0x11722, 0x11725 },
+{ 0x11727, 0x1172B },
+{ 0x11C30, 0x11C36 },
+{ 0x11C38, 0x11C3D },
+{ 0x11C3F, 0x11C3F },
+{ 0x11C92, 0x11CA7 },
+{ 0x11CAA, 0x11CB0 },
+{ 0x11CB2, 0x11CB3 },
+{ 0x11CB5, 0x11CB6 },
 { 0x16AF0, 0x16AF4 },
 { 0x16B30, 0x16B36 },
 { 0x16F8F, 0x16F92 },
@@ -262,16 +279,28 @@ static const struct interval zero_width[] = {
 { 0x1D185, 0x1D18B },
 { 0x1D1AA, 0x1D1AD },
 { 0x1D242, 0x1D244 },
+{ 0x1DA00, 0x1DA36 },
+{ 0x1DA3B, 0x1DA6C },
+{ 0x1DA75, 0x1DA75 },
+{ 0x1DA84, 0x1DA84 },
+{ 0x1DA9B, 0x1DA9F },
+{ 0x1DAA1, 0x1DAAF },
+{ 0x1E000, 0x1E006 },
+{ 0x1E008, 0x1E018 },
+{ 0x1E01B, 0x1E021 },
+{ 0x1E023, 0x1E024 },
+{ 0x1E026, 0x1E02A },
 { 0x1E8D0, 0x1E8D6 },
+{ 0x1E944, 0x1E94A },
 { 0xE0001, 0xE0001 },
 { 0xE0020, 0xE007F },
 { 0xE0100, 0xE01EF }
 };
 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x1C },
-{ /* plane */ 0x1C, 0x21 },
-{ /* plane */ 0x21, 0x22 },
-{ /* plane */ 0x22, 0x23 },
+{ /* plane */ 0x0, 0x3D },
+{ /* plane */ 0x3D, 0x68 },
+{ /* plane */ 0x68, 0x69 },
+{ /* plane */ 0x69, 0x6A },
 { /* plane */ 0x0, 0x0 },
 { /* plane */ 0x0, 0x0 },
 { /* plane */ 0x0, 0x0 },
@@ -286,7 +315,40 @@ static const struct interval double_width[] = {
 { /* plane */ 0x0, 0x0 },
 { /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
+{ 0x231A, 0x231B },
 { 0x2329, 0x232A },
+{ 0x23E9, 0x23EC },
+{ 0x23F0, 0x23F0 },
+{ 0x23F3, 0x23F3 },
+{ 0x25FD, 0x25FE },
+{ 0x2614, 0x2615 },
+{ 0x2648, 0x2653 },
+{ 0x267F, 0x267F },
+{ 0x2693, 0x2693 },
+{ 0x26A1, 0x26A1 },
+{ 0x26AA, 0x26AB },
+{ 0x26BD, 0x26BE },
+{ 0x26C4, 0x26C5 },
+{ 0x26CE, 0x26CE },
+{ 0x26D4, 0x26D4 },
+{ 0x26EA, 0x26EA },
+{ 0x26F2, 0x26F3 },
+{ 0x26F5, 0x26F5 },
+{ 0x26FA, 0x26FA },
+{ 0x26FD, 0x26FD },
+{ 0x2705, 0x2705 },
+{ 0x270A, 0x270B },
+{ 0x2728, 0x2728 },
+{ 0x274C, 0x274C },
+{ 0x274E, 0x274E },
+{ 0x2753, 0x2755 },
+{ 0x2757, 0x2757 },
+{ 0x2795, 0x2797 },
+{ 0x27B0, 0x27B0 },
+{ 0x27BF, 0x27BF },
+{ 0x2B1B, 0x2B1C },
+{ 0x2B50, 0x2B50 },
+{ 0x2B55, 0x2B55 },
 { 0x2E80, 0x2E99 },
 { 0x2E9B, 0x2EF3 },
 { 0x2F00, 0x2FD5 },
@@ -313,11 +375,49 @@ static const struct interval double_width[] = {
 { 0xFE68, 0xFE6B },
 { 0xFF01, 0xFF60 },
 { 0xFFE0, 0xFFE6 },
+{ 0x16FE0, 0x16FE0 },
+{ 0x17000, 0x187EC },
+{ 0x18800, 0x18AF2 },
 { 0x1B000, 0x1B001 },
+{ 0x1F004, 0x1F004 },
+{ 0x1F0CF, 0x1F0CF },
+{ 0x1F18E, 0x1F18E },
+{ 0x1F191, 0x1F19A },
 { 0x1F200, 0x1F202 },
-{ 0x1F210, 0x1F23A },
+{ 0x1F210, 0x1F23B },
 { 0x1F240, 0x1F248 },
 { 0x1F250, 0x1F251 },
+{ 0x1F300, 0x1F320 },
+{ 0x1F32D, 0x1F335 },
+{ 0x1F337, 0x1F37C },
+{ 0x1F37E, 0x1F393 },
+{ 0x1F3A0, 0x1F3CA },
+{ 0x1F3CF, 0x1F3D3 },
+{ 0x1F3E0, 0x1F3F0 },
+{ 0x1F3F4, 0x1F3F4 },
+{ 0x1F3F8, 0x1F43E },
+{ 0x1F440, 0x1F440 },
+{ 0x1F442, 0x1F4FC },
+{ 0x1F4FF, 0x1F53D },
+{ 0x1F54B, 0x1F54E },
+{ 0x1F550, 0x1F567 },
+{ 0x1F57A, 0x1F57A },
+{ 0x1F595, 0x1F596 },
+{ 0x1F5A4, 0x1F5A4 },
+{ 0x1F5FB, 0x1F64F },
+{ 0x1F680, 0x1F6C5 },
+{ 0x1F6CC, 0x1F6CC },
+{ 0x1F6D0, 0x1F6D2 },
+{ 0x1F6EB, 0x1F6EC },
+{ 0x1F6F4, 0x1F6F6 },
+{ 0x1F910, 0x1F91E },
+{ 0x1F920, 0x1F927 },
+{ 0x1F930, 0x1F930 },
+{ 0x1F933, 0x1F93E },
+{ 0x1F940, 0x1F94B },
+{ 0x1F950, 0x1F95E },
+{ 0x1F980, 0x1F991 },
+{ 0x1F9C0, 0x1F9C0 },
 { 0x20000, 0x2FFFD },
 { 0x30000, 0x3FFFD }
 };
-- 
2.7.2

^ permalink raw reply related

* [PATCH v2 3/3] unicode_width.h: fix the double_width[] table
From: Beat Bolli @ 2016-12-03 10:53 UTC (permalink / raw)
  To: git; +Cc: Beat Bolli, Torsten Bögershausen
In-Reply-To: <1480762392-28731-1-git-send-email-dev+git@drbeat.li>

The function bisearch() in utf8.c does a pure binary search in
double_width. It does not care about the 17 plane offsets which
unicode/uniset/uniset prepends. Leaving the plane offsets in the table
may cause wrong results.

Filter out the plane offsets in update-unicode.sh and regenerate the
table.

Cc: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
Diff to v1:
  - add Torsten's Cc:

 unicode_width.h   | 17 -----------------
 update_unicode.sh |  2 +-
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 73b5fd6..02207be 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -297,23 +297,6 @@ static const struct interval zero_width[] = {
 { 0xE0100, 0xE01EF }
 };
 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x3D },
-{ /* plane */ 0x3D, 0x68 },
-{ /* plane */ 0x68, 0x69 },
-{ /* plane */ 0x69, 0x6A },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
 { 0x231A, 0x231B },
 { 0x2329, 0x232A },
diff --git a/update_unicode.sh b/update_unicode.sh
index 3c84270..4c1ec8d 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -30,7 +30,7 @@ fi &&
 		  grep -v plane)
 	};
 	static const struct interval double_width[] = {
-		$(uniset/uniset --32 eaw:F,W)
+		$(uniset/uniset --32 eaw:F,W | grep -v plane)
 	};
 	EOF
 )
-- 
2.7.2

^ permalink raw reply related

* [PATCH v2 1/3] update-unicode.sh: automatically download newer definition files
From: Beat Bolli @ 2016-12-03 10:53 UTC (permalink / raw)
  To: git; +Cc: Beat Bolli, Torsten Bögershausen
In-Reply-To: <1480713995-16157-1-git-send-email-dev+git@drbeat.li>

Checking just for the unicode data files' existence is not sufficient;
we should also download them if a newer version exists on the Unicode
consortium's servers. Option -N of wget does this nicely for us.

Cc: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
Diff to v1:
  - reword the commit message
  - add Torsten's Cc:

 update_unicode.sh | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/update_unicode.sh b/update_unicode.sh
index 27af77c..3c84270 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -10,12 +10,8 @@ if ! test -d unicode; then
 	mkdir unicode
 fi &&
 ( cd unicode &&
-	if ! test -f UnicodeData.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
-	fi &&
-	if ! test -f EastAsianWidth.txt; then
-		wget http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt
-	fi &&
+	wget -N http://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt \
+		http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt &&
 	if ! test -d uniset; then
 		git clone https://github.com/depp/uniset.git
 	fi &&
-- 
2.7.2

^ permalink raw reply related

* [PATCH 3/3] unicode_width.h: fix the double_width[] table
From: Beat Bolli @ 2016-12-03 10:35 UTC (permalink / raw)
  To: git; +Cc: Beat Bolli
In-Reply-To: <1480713995-16157-1-git-send-email-dev+git@drbeat.li>

The function bisearch() in utf8.c does a pure binary search in
double_width. It does not care about the 17 plane offsets which
unicode/uniset/uniset prepends. Leaving the plane offsets in the table
may cause wrong results.

Filter out the plane offsets in update-unicode.sh and regenerate
the table.

Signed-off-by: Beat Bolli <dev+git@drbeat.li>
---
 unicode_width.h   | 17 -----------------
 update_unicode.sh |  2 +-
 2 files changed, 1 insertion(+), 18 deletions(-)

diff --git a/unicode_width.h b/unicode_width.h
index 73b5fd6..02207be 100644
--- a/unicode_width.h
+++ b/unicode_width.h
@@ -297,23 +297,6 @@ static const struct interval zero_width[] = {
 { 0xE0100, 0xE01EF }
 };
 static const struct interval double_width[] = {
-{ /* plane */ 0x0, 0x3D },
-{ /* plane */ 0x3D, 0x68 },
-{ /* plane */ 0x68, 0x69 },
-{ /* plane */ 0x69, 0x6A },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
-{ /* plane */ 0x0, 0x0 },
 { 0x1100, 0x115F },
 { 0x231A, 0x231B },
 { 0x2329, 0x232A },
diff --git a/update_unicode.sh b/update_unicode.sh
index 3c84270..4c1ec8d 100755
--- a/update_unicode.sh
+++ b/update_unicode.sh
@@ -30,7 +30,7 @@ fi &&
 		  grep -v plane)
 	};
 	static const struct interval double_width[] = {
-		$(uniset/uniset --32 eaw:F,W)
+		$(uniset/uniset --32 eaw:F,W | grep -v plane)
 	};
 	EOF
 )
-- 
2.7.2

^ permalink raw reply related

* Re: git reset --hard should not irretrievably destroy new files
From: Christian Couder @ 2016-12-03  8:11 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git
In-Reply-To: <CAJZCeG1Eu+5DfaxavX_WGUCa+SY+yepDWZhPXxiFcV__h0xjrw@mail.gmail.com>

On Sat, Dec 3, 2016 at 6:04 AM, Julian de Bhal <julian.debhal@gmail.com> wrote:
> If you `git add new_file; git reset --hard`, new_file is gone forever.
>
> This is totally what git says it will do on the box, but it caught me out.

Yeah, you are not the first one, and probably not the last
unfortunately, to be caught by it, see for example the last discussion
about it:

https://public-inbox.org/git/loom.20160523T023140-975@post.gmane.org/

which itself refers to this previous discussion:

https://public-inbox.org/git/CANWD=rX-MEiS4cNzDWr2wwkshz2zu8-L31UrKwbZrJSBcJX-nQ@mail.gmail.com/

> It might seem a little less stupid if I explain what I was doing: I was
> breaking apart a chunk of work into smaller changes:
>
> git commit -a -m 'tmp'           # You feel pretty safe now, right?
> git checkout -b backup/my-stuff  # Not necessary, just a convenience
> git checkout -
> git reset HEAD^                  # mixed
> git add new_file
> git add -p                       # also not necessary, but distracting
> git reset --hard                 # decided to copy from backed up diff
> # boom. new_file is gone forever
>
>
> Now, again, this is totally what git says it's going to do, and that was
> pretty stupid, but that file is gone for good, and it feels bad.

Yeah, I agree that it feels bad even if there are often ways to get
back your data as you can see from the links in Yotam's email above.

> Everything that was committed is safe, and the other untracked files in
> my local directory are also fine, but that particular file is
> permanently destroyed. This is the first time I've lost something since I
> discovered the reflog a year or two ago.
>
> The behaviour that would make the most sense to me (personally) would be
> for a hard reset to unstage new files,

This has already been proposed last time...

> but I'd be nearly as happy if a
> commit was added to the reflog when the reset happens (I can probably make
> that happen with some configuration now that I've been bitten).

Not sure if this has been proposed. Perhaps it would be simpler to
just output the sha1, and maybe the filenames too, of the blobs that
are no longer referenced from the trees, somewhere (in a bloblog?).

> If there's support for this idea but no-one is keen to write the code, let
> me know and I could have a crack at it.

Not sure if your report and your offer will make us more likely to
agree to do something, but thanks for trying!

^ permalink raw reply

* Re: git reset --hard should not irretrievably destroy new files
From: Johannes Sixt @ 2016-12-03  7:49 UTC (permalink / raw)
  To: Julian de Bhal; +Cc: git
In-Reply-To: <CAJZCeG1Eu+5DfaxavX_WGUCa+SY+yepDWZhPXxiFcV__h0xjrw@mail.gmail.com>

Am 03.12.2016 um 06:04 schrieb Julian de Bhal:
> If you `git add new_file; git reset --hard`, new_file is gone forever.

AFAIC, this is a feature ;-) I occasionally use it to remove a file when 
I already have git-gui in front of me. Then it's often less convenient 
to type the path in a shell, or to pointy-click around in a file browser.

> git add new_file

Because of this ...

> git add -p                       # also not necessary, but distracting
> git reset --hard                 # decided to copy from backed up diff
> # boom. new_file is gone forever

... it is not. The file is still among the dangling blobs in the 
repository until you clean it up with 'git gc'. Use 'git fsck --lost-found':

--lost-found

     Write dangling objects into .git/lost-found/commit/ or 
.git/lost-found/other/, depending on type. If the object is a blob, the 
contents are written into the file, rather than its object name.

-- Hannes

^ permalink raw reply

* Re: [PATCH] commit: make --only --allow-empty work without paths
From: Andreas Krey @ 2016-12-03  6:59 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Junio C Hamano
In-Reply-To: <20161203043254.7ozjyucfn6uivnsh@sigill.intra.peff.net>

On Fri, 02 Dec 2016 23:32:55 +0000, Jeff King wrote:
> On Fri, Dec 02, 2016 at 11:15:13PM +0100, Andreas Krey wrote:
> 
> > --only is implied when paths are present, and requires
> > them unless --amend. But with --allow-empty it should
> > be allowed as well - it is the only way to create an
> > empty commit in the presence of staged changes.
> 
> OK. I'm not sure why you would want to create an empty commit in such a
> case.

User: Ok tool, make me a pullreq.

Tool: But you haven't mentioned any issue
      in your commit messages. Which are they?

User: Ok, that would be A-123.

Tool: git commit --allow-empty -m 'FIX: A-123'

Originally we checked that the status output was
empty, and later added an option for 'yes, I know
that there are uncommitted changes; I don't want
them included'.

And then someone had staged changes, which led me here,
because there is no way now to create an empty commit
(just for the commit message) in that situation.
Amending the previous commit wouldn't fly with us
because of a local ban on non-fast-forward pushes.

...
> > (The interdependence of the tests is a strange thing;
> > making --run=N somewhat pointless.)
> 
> Yes, I think --run is a misfeature (I actually had to look it up, as I
...
> implicit. If a single test script is annoyingly long to run, I'd argue

It wasn't about runtime but about output. I would have
liked to see only the output of my still-failing test;
a 'stop after test X' would be helpful there.

Andreas

-- 
"Totally trivial. Famous last words."
From: Linus Torvalds <torvalds@*.org>
Date: Fri, 22 Jan 2010 07:29:21 -0800

^ permalink raw reply

* git add -p doesn't honor diff.noprefix config
From: paddor @ 2016-12-03  6:45 UTC (permalink / raw)
  To: git

Hi all

I set the config diff.noprefix = true because I don't like the a/ and b/ prefixes, which nicely changed the output of `git diff`. Unfortunately, the filenames in the output of `git add --patch` are still prefixed.

To me, this seems like a bug. Or there's a config option missing.

Best regards,
Patrik

^ permalink raw reply

* Re: [RFC PATCH 00/16] Checkout aware of Submodules!
From: Xiaodong Qi @ 2016-12-03  6:13 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, bmwill, gitster, jrnieder, mogulguy10, David.Turner
In-Reply-To: <20161115230651.23953-1-sbeller@google.com>

I found this patch on Reddit and personally support the idea of
simplifying the submodule update and checkout process. I don't know how
other users handle the submodule update process, but I sometimes forget
to check out in superprojects with submodules and run into a lot of
trouble using submodules. The patch seems to aim at making the process
easier than before, although I am not qualified to review the code in
detail. I suggest that experts in this area review the code promptly and
work out the nuts and bolts toward the goal of this patch. Thank you for
listening.

Regards,
Xiaodong Qi

On 11/15/2016 04:06 PM, Stefan Beller wrote:
> When working with submodules, nearly any time you check out a
> different state of the project in which submodules have changed,
> you'd run "git submodule update" with a current version of Git.
> 
> There are two problems with this approach:
> 
> * The "submodule update" command is dangerous as it
>   doesn't check for work that may be lost in the submodule
>   (e.g. a dangling commit).
> * you may forget to run the command as checkout is supposed
>   to do all the work for you.
> 
> Integrate updating the submodules into git checkout, with the same
> safety promises that git-checkout has, i.e. not throw away data unless
> asked to. This is done by first checking if the submodule is at the same
> sha1 as it is recorded in the superproject. If there are changes we stop
> the checkout, just as we do when checking out a file that has local
> changes.
> 
> The integration happens in the code that is also used in other commands
> such that it will be easier in the future to make other commands aware
> of submodule.
> 
> This also solves d/f conflicts in case you replace a file/directory
> with a submodule or vice versa.
> 
> The patches are still a bit rough, but the overall series seems
> promising enough to me that I want to put it out here.
> 
> Any review, specifically on the design level, is welcome!
> 
> Thanks,
> Stefan
> 
> Stefan Beller (16):
>   submodule.h: add extern keyword to functions, break line before 80
>   submodule: modernize ok_to_remove_submodule to use argv_array
>   submodule: use absolute path for computing relative path connecting
>   update submodules: add is_submodule_populated
>   update submodules: add submodule config parsing
>   update submodules: add a config option to determine if submodules are
>     updated
>   update submodules: introduce submodule_is_interesting
>   update submodules: add depopulate_submodule
>   update submodules: add scheduling to update submodules
>   update submodules: is_submodule_checkout_safe
>   teach unpack_trees() to remove submodule contents
>   entry: write_entry to write populate submodules
>   submodule: teach unpack_trees() to update submodules
>   checkout: recurse into submodules if asked to
>   completion: add '--recurse-submodules' to checkout
>   checkout: add config option to recurse into submodules by default
> 
>  Documentation/config.txt               |   6 +
>  Documentation/git-checkout.txt         |   8 +
>  builtin/checkout.c                     |  31 ++-
>  cache.h                                |   2 +
>  contrib/completion/git-completion.bash |   2 +-
>  entry.c                                |  17 +-
>  submodule-config.c                     |  22 +++
>  submodule-config.h                     |  17 +-
>  submodule.c                            | 246 +++++++++++++++++++++--
>  submodule.h                            |  77 +++++---
>  t/lib-submodule-update.sh              |  10 +-
>  t/t2013-checkout-submodule.sh          | 344 ++++++++++++++++++++++++++++++++-
>  t/t9902-completion.sh                  |   1 +
>  unpack-trees.c                         | 103 ++++++++--
>  unpack-trees.h                         |   1 +
>  wrapper.c                              |   4 +
>  16 files changed, 806 insertions(+), 85 deletions(-)
> 

^ permalink raw reply

* Re: [PATCH 4/4] shallow.c: remove useless test
From: Jeff King @ 2016-12-03  5:24 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy
In-Reply-To: <1480710664-26290-4-git-send-email-rv@rasmusvillemoes.dk>

On Fri, Dec 02, 2016 at 09:31:04PM +0100, Rasmus Villemoes wrote:

> It seems to be odd to do x=y if x==y. Maybe there's a bug somewhere near
> this, but as is this is somewhat confusing.

Yeah, this code is definitely wrong, but I'm not sure what it's trying
to do. This is the first time I've looked at it.

-Peff

^ permalink raw reply

* Re: [PATCH 3/4] shallow.c: bit manipulation tweaks
From: Jeff King @ 2016-12-03  5:21 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy
In-Reply-To: <1480710664-26290-3-git-send-email-rv@rasmusvillemoes.dk>

On Fri, Dec 02, 2016 at 09:31:03PM +0100, Rasmus Villemoes wrote:

> First of all, 1 << 31 is technically undefined behaviour, so let's just
> use an unsigned literal.

It took me a second to realize that you weren't talking about the
unsigned parameter here. You mean using "1U". It might be worth saying:

   ...use an unsigned literal, "1U".

to make it more obvious.

> If i is 'signed int' and gcc doesn't know that i is positive, gcc
> generates code to compute the C99-mandated values of "i / 32" and "i %
> 32", which is a lot more complicated than a simple shift/mask.

Right, that makes sense (though it is a separate issue).
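
To make both points concrete, here is a minimal sketch (not taken from
shallow.c) of setting and testing a bit in an array of 32-bit words.
The function names bitmap_set and bitmap_test are made up for this
example, and unsigned int is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/*
 * Set bit number i in a bitmap stored as an array of 32-bit words.
 * Because i is unsigned, "i / 32" and "i % 32" can be compiled to a
 * shift and a mask.  "1U << 31" is well defined, whereas "1 << 31"
 * overflows a 32-bit signed int.
 */
void bitmap_set(uint32_t *words, unsigned int i)
{
	words[i / 32] |= 1U << (i % 32);
}

/* Return 1 if bit number i is set, 0 otherwise. */
int bitmap_test(const uint32_t *words, unsigned int i)
{
	return (words[i / 32] >> (i % 32)) & 1U;
}
```

Calling bitmap_set(words, 31) on a zeroed array sets only the top bit of
words[0], which is exactly the case where the signed literal would be a
problem.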

-Peff

^ permalink raw reply

* Re: [PATCH 2/4] shallow.c: avoid theoretical pointer wrap-around
From: Jeff King @ 2016-12-03  5:17 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy
In-Reply-To: <1480710664-26290-2-git-send-email-rv@rasmusvillemoes.dk>

On Fri, Dec 02, 2016 at 09:31:02PM +0100, Rasmus Villemoes wrote:

> The expression info->free+size is technically undefined behaviour in
> exactly the case we want to test for. Moreover, the compiler is likely
> to translate the expression to
> 
>   (unsigned long)info->free + size > (unsigned long)info->end
> 
> where there's at least a theoretical chance that the LHS could wrap
> around 0, giving a false negative.
> 
> This might as well be written using pointer subtraction avoiding these
> issues.
> [...]
>
> -	if (!info->slab_count || info->free + size > info->end) {
> +	if (!info->slab_count || size > info->end - info->free) {

Yeah, I agree the correct way to write this is to compare the sizes
directly. That is how overflow checks _must_ be written. This one is
less likely to overflow, but even computing the value more than one past
the end of the array is technically undefined.
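
As a minimal sketch of that pattern, assuming a hypothetical struct
arena whose "free" and "end" pointers refer to the same buffer with
free <= end (the struct and the function name arena_has_room are
invented for this example, not taken from shallow.c):

```c
#include <stddef.h>

/*
 * Hypothetical arena: "free" points at the next unused byte and
 * "end" points one past the last usable byte of the same buffer.
 */
struct arena {
	char *free;
	char *end;
};

/*
 * Return 1 if "size" more bytes fit, 0 otherwise.  The remaining
 * space is computed by pointer subtraction, so no pointer beyond
 * "end" is ever formed, no matter how large "size" is.
 */
int arena_has_room(const struct arena *a, size_t size)
{
	return size <= (size_t)(a->end - a->free);
}
```

Even a request of (size_t)-1 bytes is rejected cleanly here, whereas
"a->free + size" would already be undefined for such a value.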

-Peff

^ permalink raw reply

* Re: [PATCH 1/4] shallow.c: make paint_alloc slightly more robust
From: Jeff King @ 2016-12-03  5:14 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: git, Nguyễn Thái Ngọc Duy
In-Reply-To: <1480710664-26290-1-git-send-email-rv@rasmusvillemoes.dk>

On Fri, Dec 02, 2016 at 09:31:01PM +0100, Rasmus Villemoes wrote:

> I have no idea if this is a real issue, but it's not obvious to me that
> paint_alloc cannot be called with info->nr_bits greater than about
> 4M (\approx 8*COMMIT_SLAB_SIZE). In that case the new slab would be too
> small. So just round up the allocation to the maximum of
> COMMIT_SLAB_SIZE and size.

I had trouble understanding what the problem is from this description,
but I think I figured it out from the code.

Let me try to restate it to make sure I understand.

paint_alloc() may be asked to allocate a certain number of bits,
which it does across a series of independently allocated slabs. Each
slab holds a fixed size, but we only allocate a single slab. If the
number we need to allocate is larger than fits in a single slab, then at
the end we'll have under-allocated.

Your solution is to make the slab we allocate bigger. But that seems
odd to me. Usually when we are using COMMIT_SLAB_SIZE, we are allocating
a series of slabs that make up a virtual array, and we know that each
slab has the same size. So if you need to find the k-th item, and each
slab has length n, then you'd look at slab (k / n), and then at item (k
% n) within that slab.

In other words, I think the solution isn't to make the one slab bigger,
but to allocate slabs until we have enough of them to meet the request.
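
To spell out the arithmetic, here is a minimal sketch of addressing a
virtual array made of equally sized slabs. ITEMS_PER_SLAB, struct
slab_pos, locate_item and slabs_needed are hypothetical names chosen
for this example; this is not the commit-slab.h API.

```c
#include <stddef.h>

/* Small hypothetical slab length, chosen to keep the example readable. */
#define ITEMS_PER_SLAB 4

/* Position of one item: which slab it lives in, and where inside it. */
struct slab_pos {
	size_t slab;
	size_t offset;
};

/* Map the k-th item of the virtual array to its slab and offset. */
struct slab_pos locate_item(size_t k)
{
	struct slab_pos pos;

	pos.slab = k / ITEMS_PER_SLAB;
	pos.offset = k % ITEMS_PER_SLAB;
	return pos;
}

/* Number of slabs required to hold nr_items items (rounds up). */
size_t slabs_needed(size_t nr_items)
{
	return nr_items / ITEMS_PER_SLAB + (nr_items % ITEMS_PER_SLAB != 0);
}
```

With four items per slab, item 9 lands in slab 2 at offset 1, and nine
items need three slabs rather than one oversized slab.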

But I don't really know how this code is used, or why it is using
COMMIT_SLAB_SIZE in the first place. That's generally supposed to be an
internal detail of the commit-slab.h infrastructure. Why is it being
used directly, instead of just using the functions that commit-slab
defines?

-Peff

^ permalink raw reply

* git reset --hard should not irretrievably destroy new files
From: Julian de Bhal @ 2016-12-03  5:04 UTC (permalink / raw)
  To: git

If you `git add new_file; git reset --hard`, new_file is gone forever.

This is totally what git says it will do on the box, but it caught me out.

It might seem a little less stupid if I explain what I was doing: I was
breaking apart a chunk of work into smaller changes:

git commit -a -m 'tmp'           # You feel pretty safe now, right?
git checkout -b backup/my-stuff  # Not necessary, just a convenience
git checkout -
git reset HEAD^                  # mixed
git add new_file
git add -p                       # also not necessary, but distracting
git reset --hard                 # decided to copy from backed up diff
# boom. new_file is gone forever

Now, again, this is totally what git says it's going to do, and that was
pretty stupid, but that file is gone for good, and it feels bad.

Everything that was committed is safe, and the other untracked files in
my local directory are also fine, but that particular file is
permanently destroyed. This is the first time I've lost something since I
discovered the reflog a year or two ago.

The behaviour that would make the most sense to me (personally) would be
for a hard reset to unstage new files, but I'd be nearly as happy if a
commit was added to the reflog when the reset happens (I can probably make
that happen with some configuration now that I've been bitten).

If there's support for this idea but no-one is keen to write the code, let
me know and I could have a crack at it.

Cheers,

Julian de Bhál

^ permalink raw reply
