Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH/RFC] sparse-checkout: take care of "--end-of-options" in set/add
From: Junio C Hamano @ 2023-12-21 22:04 UTC (permalink / raw)
  To: Jeff King; +Cc: Josh Steadmon, git
In-Reply-To: <20231221214550.GD1446091@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> Right, that is the "gotcha" I mentioned in my other email. Though that
> is the way it has behaved historically, my argument is that users are
> unreasonable to expect it to work:
>
>   1. It is not consistent with the rest of Git commands.
>
>   2. It is inconsistent with respect to existing options (and is an
>      accident waiting to happen when new options are added).
>
> So I'd consider it a bug-fix.

So the counter-proposal here is just to drop KEEP_UNKNOWN_OPT and
deliberately break them^W^W^Wrealign their expectations?

I do not have much stake in sparse-checkout, so I am fine with that
direction.  But I suspect other folks on the list would have users
of their own who would rather loudly complain to them if we broke
them ;-) 

I'll see how bad the fallout would be if we go that route, using the
test modification I made for the previous rounds as yardsticks.

Thanks.

^ permalink raw reply

* [PATCH] mem-pool: fix big allocations
From: René Scharfe @ 2023-12-21 23:13 UTC (permalink / raw)
  To: Git List; +Cc: Jameson Miller

Memory pool allocations that require a new block and would fill at
least half of it are handled specially.  Before 158dfeff3d (mem-pool:
add life cycle management functions, 2018-07-02) they used to be
allocated outside of the pool.  This patch made mem_pool_alloc() create
a bespoke block instead, to allow releasing it when the pool gets
discarded.

Unfortunately mem_pool_alloc() returns a pointer to the start of such a
bespoke block, i.e. to the struct mp_block at its top.  When the caller
writes to it, the management information gets corrupted.  This affects
mem_pool_discard() and -- if there are no other blocks in the pool --
also mem_pool_alloc().

Return the payload pointer of bespoke blocks, just like for smaller
allocations, to protect the management struct.

Also update next_free to mark the block as full.  This is only strictly
necessary for the first allocated block, because subsequent ones are
inserted after the current block and never considered for further
allocations, but it's easier to just do it in all cases.

Add a basic unit test to demonstate the issue by using mem_pool_calloc()
with a tiny block size, which forces the creation of a bespoke block.
Without the mem_pool_alloc() fix it reports zeroed pointers:

   ok 1 - mem_pool_calloc returns 100 zeroed bytes with big block
   # check "((pool->mp_block->next_free) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:22
   #    left: 0
   #   right: 1
   # check "((pool->mp_block->end) != (((void*)0))) == 1" failed at t/unit-tests/t-mem-pool.c:23
   #    left: 0
   #   right: 1
   not ok 2 - mem_pool_calloc returns 100 zeroed bytes with tiny block
   1..2

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 Makefile                  |  1 +
 mem-pool.c                |  6 +++---
 t/unit-tests/t-mem-pool.c | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 t/unit-tests/t-mem-pool.c

diff --git a/Makefile b/Makefile
index 88ba7a3c51..15990ff312 100644
--- a/Makefile
+++ b/Makefile
@@ -1340,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1collisiondetection/%
 THIRD_PARTY_SOURCES += sha1dc/%

 UNIT_TEST_PROGRAMS += t-basic
+UNIT_TEST_PROGRAMS += t-mem-pool
 UNIT_TEST_PROGRAMS += t-strbuf
 UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
diff --git a/mem-pool.c b/mem-pool.c
index c34846d176..e8d976c3ee 100644
--- a/mem-pool.c
+++ b/mem-pool.c
@@ -99,9 +99,9 @@ void *mem_pool_alloc(struct mem_pool *pool, size_t len)

 	if (!p) {
 		if (len >= (pool->block_alloc / 2))
-			return mem_pool_alloc_block(pool, len, pool->mp_block);
-
-		p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
+			p = mem_pool_alloc_block(pool, len, pool->mp_block);
+		else
+			p = mem_pool_alloc_block(pool, pool->block_alloc, NULL);
 	}

 	r = p->next_free;
diff --git a/t/unit-tests/t-mem-pool.c b/t/unit-tests/t-mem-pool.c
new file mode 100644
index 0000000000..2295779b0b
--- /dev/null
+++ b/t/unit-tests/t-mem-pool.c
@@ -0,0 +1,34 @@
+#include "test-lib.h"
+#include "mem-pool.h"
+
+#define check_ptr(a, op, b) check_int(((a) op (b)), ==, 1)
+
+static void setup_static(void (*f)(struct mem_pool *), size_t block_alloc)
+{
+	struct mem_pool pool = { .block_alloc = block_alloc };
+	f(&pool);
+	mem_pool_discard(&pool, 0);
+}
+
+static void t_calloc_100(struct mem_pool *pool)
+{
+	size_t size = 100;
+	char *buffer = mem_pool_calloc(pool, 1, size);
+	for (size_t i = 0; i < size; i++)
+		check_int(buffer[i], ==, 0);
+	if (!check_ptr(pool->mp_block, !=, NULL))
+		return;
+	check_ptr(pool->mp_block->next_free, <=, pool->mp_block->end);
+	check_ptr(pool->mp_block->next_free, !=, NULL);
+	check_ptr(pool->mp_block->end, !=, NULL);
+}
+
+int cmd_main(int argc, const char **argv)
+{
+	TEST(setup_static(t_calloc_100, 1024 * 1024),
+	     "mem_pool_calloc returns 100 zeroed bytes with big block");
+	TEST(setup_static(t_calloc_100, 1),
+	     "mem_pool_calloc returns 100 zeroed bytes with tiny block");
+
+	return test_done();
+}
--
2.43.0

^ permalink raw reply related

* Re: [PATCH] t1006: add tests for %(objectsize:disk)
From: René Scharfe @ 2023-12-21 23:13 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Ondrej Pohorelsky, brian m . carlson, Junio C Hamano
In-Reply-To: <20231221213034.GB1446091@coredump.intra.peff.net>

Am 21.12.23 um 22:30 schrieb Jeff King:
> On Thu, Dec 21, 2023 at 01:19:53PM +0100, René Scharfe wrote:
>
>> I think we can do it even in shell, especially if...
>> [...]
>
> Yeah, your conversion looks accurate. I do wonder if it is worth golfing
> further, though. If it were a process invocation per object, I'd
> definitely say the efficiency gain is worth it. But dropping one process
> from the whole test isn't that exciting either way.

Fair enough.

>
>> (sort -r), then we don't need to carry the oid forward:
>>
>> 			sort -nr <idx.raw >idx.sorted &&
>> 			packsz=$(test_file_size "${idx%.idx}.pack") &&
>> 			end=$((packsz - rawsz)) &&
>> 			awk -v end="$end" "
>> 			  { print \$2, end - \$1; end = \$1 }
>> 			" idx.sorted ||
>>
>> And at that point it should be easy to use a shell loop instead of awk:
>>
>> 			while read start oid rest
>> 			do
>> 				size=$((end - start)) &&
>> 				end=$start &&
>> 				echo "$oid $size" ||
>> 				return 1
>> 			done <idx.sorted
>
> The one thing I do like is that we don't have to escape anything inside
> an awk program that is forced to use double-quotes. ;)

For me it's processing the data in the "correct" order (descending, i.e.
starting at the end, which we have to calculate first anyway based on the
size).

>> Should we deduplicate here, like cat-file does (i.e. use "sort -u")?
>> Having the same object in multiple places for whatever reason would not
>> be a cause for reporting an error in this test, I would think.
>
> No, for the reasons I said in the commit message: if an object exists in
> multiple places the test is already potentially invalid, as Git does not
> promise which version it will use. So it might work racily, or it might
> work for now but be fragile. By not de-duplicating, we make sure the
> test's assumption holds.

Oh, skipped that paragraph.  Still I don't see how a duplicate object
would necessarily invalidate t1006.  The comment for the test "cat-file
--batch-all-objects shows all objects" a few lines above indicates that
it's picky about the provenance of objects, but it uses a separate
repository.  I can't infer the same requirement for the root repo, but
we already established that I can't read.

Anyway, if someone finds a use for git repack without -d or
git unpack-objects or whatever else causes duplicates in the root
repository of t1006 then they can try to reverse your ban with concrete
arguments.

René

^ permalink raw reply

* [PATCH] Port helper/test-ctype.c to unit-tests/t-ctype.c
From: Achu Luma @ 2023-12-21 23:15 UTC (permalink / raw)
  To: christian.couder; +Cc: git, Achu Luma, Christian Couder

In the recent codebase update (8bf6fbd00d (Merge branch
'js/doc-unit-tests', 2023-12-09)), a new unit testing framework was
merged, providing a standardized approach for testing C code. Prior to
this update, some unit tests relied on the test helper mechanism,
lacking a dedicated unit testing framework. It's more natural to perform
these unit tests using the new unit test framework.

This commit migrates the unit tests for C character classification
functions (isdigit(), isspace(), etc) from the legacy approach
using the test-tool command `test-tool ctype` in t/helper/test-ctype.c
to the new unit testing framework (t/unit-tests/test-lib.h).

The migration involves refactoring the tests to utilize the testing
macros provided by the framework (TEST() and check_*()).

Mentored-by: Christian Couder <chriscool@tuxfamily.org> 
Signed-off-by: Achu Luma <ach.lumap@gmail.com>
---
 Makefile               |  2 +-
 t/helper/test-ctype.c  | 70 --------------------------------------
 t/helper/test-tool.c   |  1 -
 t/helper/test-tool.h   |  1 -
 t/t0070-fundamental.sh |  4 ---
 t/unit-tests/t-ctype.c | 76 ++++++++++++++++++++++++++++++++++++++++++
 6 files changed, 77 insertions(+), 77 deletions(-)
 delete mode 100644 t/helper/test-ctype.c
 create mode 100644 t/unit-tests/t-ctype.c

diff --git a/Makefile b/Makefile
index 88ba7a3c51..a4df48ba65 100644
--- a/Makefile
+++ b/Makefile
@@ -792,7 +792,6 @@ TEST_BUILTINS_OBJS += test-chmtime.o
 TEST_BUILTINS_OBJS += test-config.o
 TEST_BUILTINS_OBJS += test-crontab.o
 TEST_BUILTINS_OBJS += test-csprng.o
-TEST_BUILTINS_OBJS += test-ctype.o
 TEST_BUILTINS_OBJS += test-date.o
 TEST_BUILTINS_OBJS += test-delta.o
 TEST_BUILTINS_OBJS += test-dir-iterator.o
@@ -1341,6 +1340,7 @@ THIRD_PARTY_SOURCES += sha1dc/%
 
 UNIT_TEST_PROGRAMS += t-basic
 UNIT_TEST_PROGRAMS += t-strbuf
+UNIT_TEST_PROGRAMS += t-ctype
 UNIT_TEST_PROGS = $(patsubst %,$(UNIT_TEST_BIN)/%$X,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS = $(patsubst %,$(UNIT_TEST_DIR)/%.o,$(UNIT_TEST_PROGRAMS))
 UNIT_TEST_OBJS += $(UNIT_TEST_DIR)/test-lib.o
diff --git a/t/helper/test-ctype.c b/t/helper/test-ctype.c
deleted file mode 100644
index e5659df40b..0000000000
--- a/t/helper/test-ctype.c
+++ /dev/null
@@ -1,70 +0,0 @@
-#include "test-tool.h"
-
-static int rc;
-
-static void report_error(const char *class, int ch)
-{
-	printf("%s classifies char %d (0x%02x) wrongly\n", class, ch, ch);
-	rc = 1;
-}
-
-static int is_in(const char *s, int ch)
-{
-	/*
-	 * We can't find NUL using strchr. Accept it as the first
-	 * character in the spec -- there are no empty classes.
-	 */
-	if (ch == '\0')
-		return ch == *s;
-	if (*s == '\0')
-		s++;
-	return !!strchr(s, ch);
-}
-
-#define TEST_CLASS(t,s) {			\
-	int i;					\
-	for (i = 0; i < 256; i++) {		\
-		if (is_in(s, i) != t(i))	\
-			report_error(#t, i);	\
-	}					\
-	if (t(EOF))				\
-		report_error(#t, EOF);		\
-}
-
-#define DIGIT "0123456789"
-#define LOWER "abcdefghijklmnopqrstuvwxyz"
-#define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
-#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
-#define ASCII \
-	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
-	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
-	"\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \
-	"\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \
-	"\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \
-	"\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \
-	"\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \
-	"\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"
-#define CNTRL \
-	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
-	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
-	"\x7f"
-
-int cmd__ctype(int argc UNUSED, const char **argv UNUSED)
-{
-	TEST_CLASS(isdigit, DIGIT);
-	TEST_CLASS(isspace, " \n\r\t");
-	TEST_CLASS(isalpha, LOWER UPPER);
-	TEST_CLASS(isalnum, LOWER UPPER DIGIT);
-	TEST_CLASS(is_glob_special, "*?[\\");
-	TEST_CLASS(is_regex_special, "$()*+.?[\\^{|");
-	TEST_CLASS(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~");
-	TEST_CLASS(isascii, ASCII);
-	TEST_CLASS(islower, LOWER);
-	TEST_CLASS(isupper, UPPER);
-	TEST_CLASS(iscntrl, CNTRL);
-	TEST_CLASS(ispunct, PUNCT);
-	TEST_CLASS(isxdigit, DIGIT "abcdefABCDEF");
-	TEST_CLASS(isprint, LOWER UPPER DIGIT PUNCT " ");
-
-	return rc;
-}
diff --git a/t/helper/test-tool.c b/t/helper/test-tool.c
index 37ba996539..33b9501c21 100644
--- a/t/helper/test-tool.c
+++ b/t/helper/test-tool.c
@@ -19,7 +19,6 @@ static struct test_cmd cmds[] = {
 	{ "config", cmd__config },
 	{ "crontab", cmd__crontab },
 	{ "csprng", cmd__csprng },
-	{ "ctype", cmd__ctype },
 	{ "date", cmd__date },
 	{ "delta", cmd__delta },
 	{ "dir-iterator", cmd__dir_iterator },
diff --git a/t/helper/test-tool.h b/t/helper/test-tool.h
index 8a1a7c63da..b72f07ded9 100644
--- a/t/helper/test-tool.h
+++ b/t/helper/test-tool.h
@@ -12,7 +12,6 @@ int cmd__chmtime(int argc, const char **argv);
 int cmd__config(int argc, const char **argv);
 int cmd__crontab(int argc, const char **argv);
 int cmd__csprng(int argc, const char **argv);
-int cmd__ctype(int argc, const char **argv);
 int cmd__date(int argc, const char **argv);
 int cmd__delta(int argc, const char **argv);
 int cmd__dir_iterator(int argc, const char **argv);
diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 487bc8d905..a4756fbab9 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -9,10 +9,6 @@ Verify wrappers and compatibility functions.
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
-test_expect_success 'character classes (isspace, isalpha etc.)' '
-	test-tool ctype
-'
-
 test_expect_success 'mktemp to nonexistent directory prints filename' '
 	test_must_fail test-tool mktemp doesnotexist/testXXXXXX 2>err &&
 	grep "doesnotexist/test" err
diff --git a/t/unit-tests/t-ctype.c b/t/unit-tests/t-ctype.c
new file mode 100644
index 0000000000..41189ba9f9
--- /dev/null
+++ b/t/unit-tests/t-ctype.c
@@ -0,0 +1,76 @@
+#include "test-lib.h"
+
+static int is_in(const char *s, int ch)
+{
+	/*
+	 * We can't find NUL using strchr. Accept it as the first
+	 * character in the spec -- there are no empty classes.
+	 */
+	if (ch == '\0')
+		return ch == *s;
+	if (*s == '\0')
+		s++;
+	return !!strchr(s, ch);
+}
+
+/* Macro to test a character type */
+#define TEST_CTYPE_FUNC(func, string)			\
+static void test_ctype_##func(void)				\
+{								\
+	int i;                                     	 	\
+	for (i = 0; i < 256; i++)                 		\
+		check_int(func(i), ==, is_in(string, i)); 	\
+}
+
+#define DIGIT "0123456789"
+#define LOWER "abcdefghijklmnopqrstuvwxyz"
+#define UPPER "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
+#define PUNCT "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~"
+#define ASCII \
+	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
+	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
+	"\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f" \
+	"\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f" \
+	"\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f" \
+	"\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f" \
+	"\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f" \
+	"\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f"
+#define CNTRL \
+	"\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" \
+	"\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \
+	"\x7f"
+
+TEST_CTYPE_FUNC(isdigit, DIGIT)
+TEST_CTYPE_FUNC(isspace, " \n\r\t")
+TEST_CTYPE_FUNC(isalpha, LOWER UPPER)
+TEST_CTYPE_FUNC(isalnum, LOWER UPPER DIGIT)
+TEST_CTYPE_FUNC(is_glob_special, "*?[\\")
+TEST_CTYPE_FUNC(is_regex_special, "$()*+.?[\\^{|")
+TEST_CTYPE_FUNC(is_pathspec_magic, "!\"#%&',-/:;<=>@_`~")
+TEST_CTYPE_FUNC(isascii, ASCII)
+TEST_CTYPE_FUNC(islower, LOWER)
+TEST_CTYPE_FUNC(isupper, UPPER)
+TEST_CTYPE_FUNC(iscntrl, CNTRL)
+TEST_CTYPE_FUNC(ispunct, PUNCT)
+TEST_CTYPE_FUNC(isxdigit, DIGIT "abcdefABCDEF")
+TEST_CTYPE_FUNC(isprint, LOWER UPPER DIGIT PUNCT " ")
+
+int cmd_main(int argc, const char **argv) {
+	/* Run all character type tests */
+	TEST(test_ctype_isspace(), "isspace() works as we expect");
+	TEST(test_ctype_isdigit(), "isdigit() works as we expect");
+	TEST(test_ctype_isalpha(), "isalpha() works as we expect");
+	TEST(test_ctype_isalnum(), "isalnum() works as we expect");
+	TEST(test_ctype_is_glob_special(), "is_glob_special() works as we expect");
+	TEST(test_ctype_is_regex_special(), "is_regex_special() works as we expect");
+	TEST(test_ctype_is_pathspec_magic(), "is_pathspec_magic() works as we expect");
+	TEST(test_ctype_isascii(), "isascii() works as we expect");
+	TEST(test_ctype_islower(), "islower() works as we expect");
+	TEST(test_ctype_isupper(), "isupper() works as we expect");
+	TEST(test_ctype_iscntrl(), "iscntrl() works as we expect");
+	TEST(test_ctype_ispunct(), "ispunct() works as we expect");
+	TEST(test_ctype_isxdigit(), "isxdigit() works as we expect");
+	TEST(test_ctype_isprint(), "isprint() works as we expect");
+
+	return test_done();
+}

base-commit: 055bb6e9969085777b7fab83e3fee0017654f134
-- 
2.42.0.windows.2


^ permalink raw reply related

* Re: [PATCH v2 1/9] CodingGuidelines: move period inside parentheses
From: Dragan Simic @ 2023-12-22  1:30 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Josh Soref via GitGitGadget, git, Elijah Newren,
	René Scharfe, Phillip Wood, Josh Soref
In-Reply-To: <xmqqedffl13r.fsf@gitster.g>

On 2023-12-21 22:03, Junio C Hamano wrote:
> "Josh Soref via GitGitGadget" <gitgitgadget@gmail.com> writes:
> 
>> From: Josh Soref <jsoref@gmail.com>
>> 
>> The contents within parenthesis should be omittable without resulting
>> in broken text.
>> 
>> Eliding the parenthesis left a period to end a run without any 
>> content.
> 
> Nobody seems to have commented on this one in the previous round,
> but quite honestly, for this particular instance, I suspect that it
> may become easier to read if we just lost these parentheses, as the
> next sentence "You do not have to include more than ..." is just a
> bit of extra material to support the first sentence like the one we
> see in the parentheses.  Alternatively, moving that "You do not have
> to" also inside the same parentheses might work slightly better.

I agree with simply erasing the parentheses, it makes the paragraph flow 
much better.

> It might be even easier to follow if we moved the list of "approved
> headers" (and the "You do not have to ... more than one" note that
> supports the "currently approved list") totally out of line by
> making it a side note.
> 
>> Signed-off-by: Josh Soref <jsoref@gmail.com>
>> ---
>>  Documentation/CodingGuidelines | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/Documentation/CodingGuidelines 
>> b/Documentation/CodingGuidelines
>> index 8ed517a5ca0..af94ed3a75d 100644
>> --- a/Documentation/CodingGuidelines
>> +++ b/Documentation/CodingGuidelines
>> @@ -450,7 +450,7 @@ For C programs:
>>     one of the approved headers that includes it first for you.  (The
>>     approved headers currently include "builtin.h",
>>     "t/helper/test-tool.h", "xdiff/xinclude.h", or
>> -   "reftable/system.h").  You do not have to include more than one of
>> +   "reftable/system.h".)  You do not have to include more than one of
>>     these.
>> 
>>   - A C file must directly include the header files that declare the

^ permalink raw reply

* Re: rebase invoking pre-commit
From: Phillip Wood @ 2023-12-22 10:05 UTC (permalink / raw)
  To: Sean Allred, Git List
In-Reply-To: <m0sf3vi86g.fsf@epic96565.epic.com>

Hi Sean

On 21/12/2023 20:58, Sean Allred wrote:
> Is there a current reason why pre-commit shouldn't be invoked during
> rebase, or is this just waiting for a reviewable patch?

The reason that we don't run the pre-commit hook is that the commit 
being rebased may have been created with "git commit --no-verify" and so 
running the pre-commit hook would stop it from being rebased - see 
e637122ef2 (rebase -m: do not trigger pre-commit verification, 2008-03-16).

I think that most of the time it would be valuable to run the pre-commit 
hook when committing a conflict resolution but we'd need to add 
something like "git rebase --continue --no-verify" as a way to bypass it 
when resolving conflicts in commits that were created with "git commit 
--no-verify".

Best Wishes

Phillip

> This was brought up before at [1] in 2015, but that thread so old at
> this point that it seemed prudent to double-check before investing time
> in a developing and testing a patch.
> 
> [1]: https://lore.kernel.org/git/1m55i3m.1fum4zo1fpnhncM%25lists@haller-berlin.de/
> 
> --
> Sean Allred
> 

^ permalink raw reply

* Re: [PATCH 01/12] t: introduce DEFAULT_REPO_FORMAT prereq
From: Karthik Nayak @ 2023-12-22 11:41 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <239ca38efdbe3195e6319be4423b17d2b42e0b9f.1703067989.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> A limited number of tests require repositories to have the default
> repository format or otherwise they would fail to run, e.g. because they
> fail to detect the correct hash function. While the hash function is the
> only extension right now that creates problems like this, we are about
> to add a second extensions for the ref format.
>

Nit: s/extensions/extension

> -test_expect_success SHA1 'git branch -m q q2 without config should succeed' '
> +test_expect_success DEFAULT_REPO_FORMAT 'git branch -m q q2 without config should succeed' '
>  	git branch -m q q2 &&
>  	git branch -m q2 q
>  '

My only concern is whether we'll end up blindly adding
DEFAULT_REPO_FORMAT for tests where only SHA1 is a prereq or only where
the new format extension is a prereq.

^ permalink raw reply

* Re: [PATCH 02/12] worktree: skip reading HEAD when repairing worktrees
From: Karthik Nayak @ 2023-12-22 12:23 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <e895091025e6786e0a831e4fb9b092eb74bd4379.1703067989.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> When calling `git init --separate-git-dir=<new-path>` on a preexisting
> repository, then we move the Git directory of that repository to the new
> path specified by the user. If there are worktrees present in the Git
> repository, we need to repair the worktrees so that their gitlinks point
> to the new location of the repository.
>

Nit: s/then we/we

^ permalink raw reply

* Re: [PATCH 03/12] refs: refactor logic to look up storage backends
From: Karthik Nayak @ 2023-12-22 12:38 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <f712d5ef5bc5bed423cf50a0d0489095ba103df7.1703067989.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> +
> +const char *ref_storage_format_to_name(int ref_storage_format)
> +{
> +	const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
> +	if (!be)
> +		return "unknown";
> +	return be->name;
> +}
>

Would it make more sense to return NULL here?

> +
>  /*
>   * How to handle various characters in refnames:
>   * 0: An acceptable character for refs
> @@ -2029,12 +2045,12 @@ static struct ref_store *ref_store_init(struct repository *repo,
>  					const char *gitdir,
>  					unsigned int flags)
>  {
> -	const char *be_name = "files";
> -	struct ref_storage_be *be = find_ref_storage_backend(be_name);
> +	int format = REF_STORAGE_FORMAT_FILES;
> +	const struct ref_storage_be *be = find_ref_storage_backend(format);
>  	struct ref_store *refs;
>
>  	if (!be)
> -		BUG("reference backend %s is unknown", be_name);
> +		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
>

This doesn't really get us more information, since be == NULL, calling
`ref_storage_format_to_name` will again call `find_ref_storage_backend`,
which will be NULL. I wonder if its better to:

@@ -2029,12 +2045,12 @@ static struct ref_store *ref_store_init(struct
repository *repo,
                     const char *gitdir,
                     unsigned int flags)
  {
 -	const char *be_name = "files";
 -	struct ref_storage_be *be = find_ref_storage_backend(be_name);
 +	int format = REF_STORAGE_FORMAT_FILES;
 +	const struct ref_storage_be *be = find_ref_storage_backend(format);
     struct ref_store *refs;

     if (!be)
 -		BUG("reference backend %s is unknown", be_name);
 +		BUG("reference backend is unknown");

^ permalink raw reply

* Re: [PATCH 04/12] setup: start tracking ref storage format when
From: Karthik Nayak @ 2023-12-22 13:09 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <6564659d403de098799ddb8101b74c2803a655d4.1703067989.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> In order to discern which ref storage format a repository is supposed to
> use we need to start setting up and/or discovering the format. This
> needs to happen in two separate code paths.
>
>   - The first path is when we create a repository via `init_db()`. When
>     we are re-initializing a preexisting repository we need to retain
>     the previously used ref storage format -- if the user asked for a
>     different format then this indicates an erorr and we error out.

Nit: s/erorr/error

> diff --git a/refs.c b/refs.c
> index e87c85897d..d289a5e175 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -2045,12 +2045,12 @@ static struct ref_store *ref_store_init(struct repository *repo,
>  					const char *gitdir,
>  					unsigned int flags)
>  {
> -	int format = REF_STORAGE_FORMAT_FILES;
> -	const struct ref_storage_be *be = find_ref_storage_backend(format);
> +	const struct ref_storage_be *be;
>  	struct ref_store *refs;
>
> +	be = find_ref_storage_backend(repo->ref_storage_format);
>  	if (!be)
> -		BUG("reference backend %s is unknown", ref_storage_format_to_name(format));
> +		BUG("reference backend is unknown");
>
>  	refs = be->init(repo, gitdir, flags);
>  	return refs;
> diff --git a/refs.h b/refs.h
> index c6571bcf6c..944a67ac1b 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -11,6 +11,7 @@ struct string_list;
>  struct string_list_item;
>  struct worktree;
>
> +int default_ref_storage_format(void);
>

Is this used/defined somewhere?

^ permalink raw reply

* Re: [PATCH 12/12] t9500: write "extensions.refstorage" into config
From: Karthik Nayak @ 2023-12-22 13:43 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <bbe2fbb15495ad6c8eb0824b4b4aaa7f3e6e2537.1703067989.git.ps@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> In t9500 we're writing a custom configuration that sets up gitweb. This
> requires us manually ensure that the repository format is configured as
>

Nit: s/requires us/requires us to/

^ permalink raw reply

* Re: [PATCH 00/12] Introduce `refStorage` extension
From: Karthik Nayak @ 2023-12-22 13:43 UTC (permalink / raw)
  To: Patrick Steinhardt, git
In-Reply-To: <cover.1703067989.git.ps@pks.im>

Hello,

Patrick Steinhardt <ps@pks.im> writes:
> Hi,
>
> this patch series introduces a new `refStorage` extension and related
> tooling. While there is only a single user-visible ref backend for now,
> this extension will eventually allow us to determine which backend shall
> be used for a repository. Furthermore, the series introduces tooling so
> that the ref storage format becomes more accessible.

Apart from small nits, the series looks good to me, thanks!

^ permalink raw reply

* Re: [PATCH 2/2] ref-filter: support filtering of operational refs
From: Karthik Nayak @ 2023-12-22 14:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, ps, christian.couder
In-Reply-To: <xmqqzfy3l270.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:
> We have pseudoref (those all caps files outside the refs/ hierarchy)
> as an official term defined in the glossary, and Patrick's reftable
> work based on Han-Wen's work revealed the need to treat FETCH_HEAD
> and MERGE_HEAD as "even more pecurilar than pseudorefs" that need
> different term (tentatively called "special refs").  Please avoid
> coming up with yet another random name "operational" without
> discussing.
>

I totally agree with the term usage and will switch to "pseudorefs".

> With a quick look at the table in this patch, "pseudorefs" appears
> to be the closest word that people are already familiar with, I
> think.  A lot more reasonable thing to do may be to scan the
> $GIT_DIR for files whose name satisfy refs.c:is_pseudoref_syntax()
> and list them, instead of having a hardcoded list of these special
> refs.  In addition, when reftable and other backends that can
> natively store things outside refs/ hierarchy is in use, they ought
> to know what they have so enumerating these would not be an issue
> for them without having such a hardcoded table of names.

Thanks for that, I did play around with trying to find files which
could be refs in the $GIT_DIR, but the issue is that there will be
false positives. e.g. `COMMIT_EDITMSG` could be confused for a
pseudoref (passes is_pseudoref_syntax()) and it could potentially also
contain a commit-ish value.

While you're here, I wonder if you have any thoughts on the last block
of my first mail.

> Over this, I'm also curious on what the mailing list thinks about
> exposing this to the client side. Would an `--all` option for
> git-for-each-ref(1) be sufficient?
>

Thanks
- Karthik

^ permalink raw reply

* Re: Git mirror at gitlab
From: Olliver Schinagl @ 2023-12-22 14:06 UTC (permalink / raw)
  To: Patrick Steinhardt
  Cc: git, gitster, Ævar Arnfjörð Bjarmason, psteinhardt
In-Reply-To: <ZYQl_G-S4vQibHWn@framework>

Hey Patrick,

On December 21, 2023 12:48:12 p.m. GMT+01:00, Patrick Steinhardt <ps@pks.im> wrote:
>On Thu, Dec 21, 2023 at 12:30:02PM +0100, Olliver Schinagl wrote:
>> Hey list,
>> 
>> For years, I wanted (tried, but time) to run the mirror for git on gitlab.
>> Actually, the original idea was to run a docker container ([0] 10k+ pulls
>> :p)
>> 
>> I initially set this up via docker build containers, where docker hub would
>> pull my mirror of the git repo. My mirror, because I added a Dockerfile
>> which was enough for docker to do its trick. I was planning (time ..) on
>> submitting this upstream to the list, but never did. Because of me not doing
>> that, I had to manually (I was even too lazy to script it) rebase the
>> branch. Docker then did some changes to their business, where the docker
>> builds where not possible anymore.
>> 
>> So then I figured, I'll do the same on gitlab and push it to the docker hub.
>> Thus I setup a mirror on gitlab [1], with the idea to work there on it.
>> 
>> Again, I never got around to finalize this work, mostly because the docker
>> container 'just worked' for pretty much everything. After all, git is very
>> stable overal.
>> 
>> So very interestingly, last month commit 0e3b67e2aa25edb ("ci: add support
>> for GitLab CI") landed, which started to trigger pipeline jobs!
>> 
>> Sadly, this only worked for 3 builds, as that's when the minutes ran out :)
>> 
>> So one, I would very much like to offer the registered names (cause they are
>> pretty nice in name) to here, so people can use and find it.
>
>Not to throw a wrench into this, but are you aware of the official
>GitLab mirror at https://gitlab.com/git-vcs/git? I myself wasn't aware
>of this mirror for a rather long time.
Not a wrench at all, and no, I didn't know. How old is it though :p could be that git-vcs was created cause I owned gitscm :)


I had chosen gitscm to match the official site, git-scm.org. the hyphen I left out because afaik it wasn't allowed on docker hub.

>
>I also wondered whether we want to have https://gitlab.com/git/git as we
>do on GitHub. I don't think anybody registered it, but it is blocked
>from being registered as far as I can tell. Maybe we block the namespace
>out of caution, I dunno. I can certainly check in with our SREs in case
>it is something the Git project would like to own.
Yeah couldn't figure out who it was either ... hence gitscm.

Sadly gitlab doesn't support aliases :) I'm more then happy to hand over the space. Whatever name is decided to be best.

>
>> Two, hopefully get Patrick Steinhardt to help out to get unlimited minutes
>> and storage on the repo :)
>
>I'm sure we can do something here, but I'd rather aim to do this for the
>official mirror which currently is the one I mentioned above. If the
>project is interested in running builds on GitLab then I'm happy to
>coordinate.
>
>Also Cc Ævar, who is the current owner of the mirror. Would it be
>possible to add myself as a second owner to the project? This might help
>setting up the CI infrastructure. But please, if anybody disagrees with
>me being added as an owner here I encourage you to say so.
>
>> Three, see what the opinion of people here is on this. I'll do the work to
>> get the dockerfile (now containerfile, we're inclusive after all) merged,
>> and the CI file updated to create, store (and push to docker hub) the
>> generated containers.
>
>I don't really have much of an opinion here and will leave it to others
>to discuss.
>
>Thanks!
>
>Patrick
>
>PS: As most other folks I'll be OOO during holidays, so I may only
>    answer sporadically.
>
>> Thanks,
>> Olliver
>> 
>> [0]: https://hub.docker.com/r/gitscm/git
>> [1]: https://gitlab.com/gitscm/git
>> 
>> P.S. I'm not subscribed, so please keep me in the CC :)

^ permalink raw reply

* [PATCH] sideband.c: replace int with size_t for clarity
From: Chandra Pratap via GitGitGadget @ 2023-12-22 17:01 UTC (permalink / raw)
  To: git; +Cc: Chandra Pratap, Chandra Pratap

From: Chandra Pratap <chandrapratap3519@gmail.com>

Replace int with size_t for clarity and remove the
'NEEDSWORK' tag associated with it.

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    sideband.c: replace int with size_t for clarity

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1625

 sideband.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/sideband.c b/sideband.c
index 6cbfd391c47..1599e408d1b 100644
--- a/sideband.c
+++ b/sideband.c
@@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
  * of the line. This should be called for a single line only, which is
  * passed as the first N characters of the SRC array.
  *
- * NEEDSWORK: use "size_t n" instead for clarity.
  */
-static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
 {
 	int i;
 

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH] sideband.c: replace int with size_t for clarity
From: Torsten Bögershausen @ 2023-12-22 17:57 UTC (permalink / raw)
  To: Chandra Pratap via GitGitGadget; +Cc: git, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1625.git.1703264469238.gitgitgadget@gmail.com>

On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> Replace int with size_t for clarity and remove the
> 'NEEDSWORK' tag associated with it.
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
>     sideband.c: replace int with size_t for clarity
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1625
>
>  sideband.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/sideband.c b/sideband.c
> index 6cbfd391c47..1599e408d1b 100644
> --- a/sideband.c
> +++ b/sideband.c
> @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
>   * of the line. This should be called for a single line only, which is
>   * passed as the first N characters of the SRC array.
>   *
> - * NEEDSWORK: use "size_t n" instead for clarity.
>   */
> -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
>  {
>  	int i;
>

Thanks for working on this.
There is, however, more potential for improvements.
First of all: the "int i" from above.
Further down, we read
	for (i = 0; i < ARRAY_SIZE(keywords); i++) {

However, a size of an array can never be negative, so that
an unsigned data type is a better choice than a signed.
And, arrays can have more elements than an int can address,
at least in theory.
For a reader it makes more sense, to replace
int i;
with
size_t i;


And further down, there is another place for improvments:

		int len = strlen(p->keyword);
		if (n < len)
			continue;

Even here, a strlen is never negative. And a size_t is the choice for len,
since all "modern" implementations declare strlen() to return size_t

Do you think that you can have a look at these changes ?

I will be happy to do a review, and possibly other people as well.

^ permalink raw reply

* Re: [PATCH] sideband.c: replace int with size_t for clarity
From: Chandra Pratap @ 2023-12-22 18:27 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Chandra Pratap via GitGitGadget, git, Chandra Pratap
In-Reply-To: <20231222175747.GA2880@tb-raspi4>

Thanks for the feedback, Torsten. I was working on improving the rest of
sideband.c as you suggested when I encountered this snippet on line 82:

while (0 < n && isspace(*src)) {
strbuf_addch(dest, *src);
src++;
n--;
}

Here, we are decreasing the value of an unsigned type to potentially below
0 which may lead to overflow and result in some nasty bug. In this case,
is it wise to continue with replacing 'int n' with 'size_t n' as the
NEEDSWORK tag suggests or should we improve upon the rest of the file
and revert 'size_t n' to 'int n' ?

On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
>
> On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > From: Chandra Pratap <chandrapratap3519@gmail.com>
> >
> > Replace int with size_t for clarity and remove the
> > 'NEEDSWORK' tag associated with it.
> >
> > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > ---
> >     sideband.c: replace int with size_t for clarity
> >
> > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> >
> >  sideband.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/sideband.c b/sideband.c
> > index 6cbfd391c47..1599e408d1b 100644
> > --- a/sideband.c
> > +++ b/sideband.c
> > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> >   * of the line. This should be called for a single line only, which is
> >   * passed as the first N characters of the SRC array.
> >   *
> > - * NEEDSWORK: use "size_t n" instead for clarity.
> >   */
> > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> >  {
> >       int i;
> >
>
> Thanks for working on this.
> There is, however, more potential for improvements.
> First of all: the "int i" from above.
> Further down, we read
>         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;
>
>
> And further down, there is another place for improvments:
>
>                 int len = strlen(p->keyword);
>                 if (n < len)
>                         continue;
>
> Even here, a strlen is never negative. And a size_t is the choice for len,
> since all "modern" implementations declare strlen() to return size_t
>
> Do you think that you can have a look at these changes ?
>
> I will be happy to do a review, and possibly other people as well.

^ permalink raw reply

* Re: [PATCH] sideband.c: replace int with size_t for clarity
From: Torsten Bögershausen @ 2023-12-22 18:39 UTC (permalink / raw)
  To: Chandra Pratap; +Cc: Chandra Pratap via GitGitGadget, git, Chandra Pratap
In-Reply-To: <CA+J6zkQ=+bCsi7URtpTfgawTrmVVALvMi3kdnMi=mUjB_Yxxwg@mail.gmail.com>


(PLease, avoid top-posting on this list)
On Fri, Dec 22, 2023 at 11:57:25PM +0530, Chandra Pratap wrote:
> Thanks for the feedback, Torsten. I was working on improving the rest of
> sideband.c as you suggested when I encountered this snippet on line 82:
>
> while (0 < n && isspace(*src)) {
> strbuf_addch(dest, *src);
> src++;
> n--;
> }
>
> Here, we are decreasing the value of an unsigned type to potentially below
> 0 which may lead to overflow and result in some nasty bug. In this case,
> is it wise to continue with replacing 'int n' with 'size_t n' as the
> NEEDSWORK tag suggests or should we improve upon the rest of the file
> and revert 'size_t n' to 'int n' ?

Yes, that could be a solution.
But.
In general, we are are striving to use size_t for all objects that live in
memory, and that is a long term thing.
Careful review is needed, and that is what you just did.

If we look at this code again:
while (0 < n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}

When n is signed, it makes sense to use "0 < n".
However, if I think about it, it should work for unsigned as well,
wouldn't it ?
We can leave it as is, or replace it by

while (n && isspace(*src)) {
  strbuf_addch(dest, *src);
  src++;
  n--;
}




>
> On Fri, 22 Dec 2023 at 23:27, Torsten Bögershausen <tboegi@web.de> wrote:
> >
> > On Fri, Dec 22, 2023 at 05:01:09PM +0000, Chandra Pratap via GitGitGadget wrote:
> > > From: Chandra Pratap <chandrapratap3519@gmail.com>
> > >
> > > Replace int with size_t for clarity and remove the
> > > 'NEEDSWORK' tag associated with it.
> > >
> > > Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> > > ---
> > >     sideband.c: replace int with size_t for clarity
> > >
> > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v1
> > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v1
> > > Pull-Request: https://github.com/gitgitgadget/git/pull/1625
> > >
> > >  sideband.c | 3 +--
> > >  1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/sideband.c b/sideband.c
> > > index 6cbfd391c47..1599e408d1b 100644
> > > --- a/sideband.c
> > > +++ b/sideband.c
> > > @@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
> > >   * of the line. This should be called for a single line only, which is
> > >   * passed as the first N characters of the SRC array.
> > >   *
> > > - * NEEDSWORK: use "size_t n" instead for clarity.
> > >   */
> > > -static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
> > > +static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
> > >  {
> > >       int i;
> > >
> >
> > Thanks for working on this.
> > There is, however, more potential for improvements.
> > First of all: the "int i" from above.
> > Further down, we read
> >         for (i = 0; i < ARRAY_SIZE(keywords); i++) {
> >
> > However, a size of an array can never be negative, so that
> > an unsigned data type is a better choice than a signed.
> > And, arrays can have more elements than an int can address,
> > at least in theory.
> > For a reader it makes more sense, to replace
> > int i;
> > with
> > size_t i;
> >
> >
> > And further down, there is another place for improvments:
> >
> >                 int len = strlen(p->keyword);
> >                 if (n < len)
> >                         continue;
> >
> > Even here, a strlen is never negative. And a size_t is the choice for len,
> > since all "modern" implementations declare strlen() to return size_t
> >
> > Do you think that you can have a look at these changes ?
> >
> > I will be happy to do a review, and possibly other people as well.
>

^ permalink raw reply

* Re: [PATCH] sideband.c: replace int with size_t for clarity
From: Junio C Hamano @ 2023-12-22 19:01 UTC (permalink / raw)
  To: Torsten Bögershausen
  Cc: Chandra Pratap via GitGitGadget, git, Chandra Pratap,
	Chandra Pratap
In-Reply-To: <20231222175747.GA2880@tb-raspi4>

Torsten Bögershausen <tboegi@web.de> writes:

Just this part.

> Further down, we read
> 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
>
> However, a size of an array can never be negative, so that
> an unsigned data type is a better choice than a signed.
> And, arrays can have more elements than an int can address,
> at least in theory.
> For a reader it makes more sense, to replace
> int i;
> with
> size_t i;

It is a very good discipline to use size_t to index into an array
whose size is externally controled (e.g., we slurp what the end user
or the server on the other end of the connection gave us into a
piece of memory we allocate) to avoid integer overflows as "int" is
often narrower than "size_t".  But this particular one is a Meh; the
keywords[] is a small hardcoded array whose size and contents are
totally under our control.

^ permalink raw reply

* Re: rebase invoking pre-commit
From: Sean Allred @ 2023-12-22 22:05 UTC (permalink / raw)
  To: phillip.wood; +Cc: Sean Allred, Git List
In-Reply-To: <bf1ce173-50d7-405f-88c1-7edb7ec5a55a@gmail.com>


Phillip Wood <phillip.wood123@gmail.com> writes:
> [...], we'd need to add something like "git rebase --continue
> --no-verify" as a way to bypass it when resolving conflicts in commits
> that were created with "git commit --no-verify".

Looks like such a flag already exists to skip the 'pre-rebase' hook.
Presumably it can pull double-duty to skip pre-commit as well.

(It's a little messy to me to not be able to pick/choose which hooks to
skip, but looks like this is already status quo with other hooks.)

--
Sean Allred

^ permalink raw reply

* Re: [PATCH/RFC] sparse-checkout: take care of "--end-of-options" in set/add
From: Jeff King @ 2023-12-23 10:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Josh Steadmon, git
In-Reply-To: <xmqqle9njjp3.fsf@gitster.g>

On Thu, Dec 21, 2023 at 02:04:56PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Right, that is the "gotcha" I mentioned in my other email. Though that
> > is the way it has behaved historically, my argument is that users are
> > unreasonable to expect it to work:
> >
> >   1. It is not consistent with the rest of Git commands.
> >
> >   2. It is inconsistent with respect to existing options (and is an
> >      accident waiting to happen when new options are added).
> >
> > So I'd consider it a bug-fix.
> 
> So the counter-proposal here is just to drop KEEP_UNKNOWN_OPT and
> deliberately break them^W^W^Wrealign their expectations?

Yes. :) But keep in mind we are un-breaking other people, like those who
typo:

  git sparse-checkout --sikp-checks

and don't see an error (instead, we make a garbage entry in the sparse
checkout file).

> I do not have much stake in sparse-checkout, so I am fine with that
> direction.  But I suspect other folks on the list would have users
> of their own who would rather loudly complain to them if we broke
> them ;-) 

Likewise, I have never used sparse-checkout myself, and I don't care
_that_ much. My interest is mostly in having various Git commands behave
consistently. This whole discussion started because the centralized
--end-of-options fix interacted badly with this unusual behavior.

-Peff

^ permalink raw reply

* [PATCH v2] t1006: add tests for %(objectsize:disk)
From: Jeff King @ 2023-12-23 10:09 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Ondrej Pohorelsky, brian m . carlson, Junio C Hamano
In-Reply-To: <120b3194-5eee-47ed-b2d8-bc6731b71a6b@web.de>

On Fri, Dec 22, 2023 at 12:13:10AM +0100, René Scharfe wrote:

> >> 			while read start oid rest
> >> 			do
> >> 				size=$((end - start)) &&
> >> 				end=$start &&
> >> 				echo "$oid $size" ||
> >> 				return 1
> >> 			done <idx.sorted
> >
> > The one thing I do like is that we don't have to escape anything inside
> > an awk program that is forced to use double-quotes. ;)
> 
> For me it's processing the data in the "correct" order (descending, i.e.
> starting at the end, which we have to calculate first anyway based on the
> size).

That was one thing that I thought made it more complicated. The obvious
order to me is start-to-end in the pack. But I do agree that going in
reverse order makes things much simpler, as we compute the size of each
entry as we see it (and so there are fewer special cases).

So I'm convinced that it's worth switching. Here's a v2 with your
suggestion.

-- >8 --
Subject: t1006: add tests for %(objectsize:disk)

Back when we added this placeholder in a4ac106178 (cat-file: add
%(objectsize:disk) format atom, 2013-07-10), there were no tests,
claiming "[...]the exact numbers returned are volatile and subject to
zlib and packing decisions".

But we can use a little shell hackery to get the expected numbers
ourselves. To a certain degree this is just re-implementing what Git is
doing under the hood, but it is still worth doing. It makes sure we
exercise the %(objectsize:disk) code at all, and having the two
implementations agree gives us more confidence.

Note that our shell code assumes that no object appears twice (either in
two packs, or as both loose and packed), as then the results really are
undefined. That's OK for our purposes, and the test will notice if that
assumption is violated (the shell version would produce duplicate lines
that Git's output does not have).

Helped-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Jeff King <peff@peff.net>
---
 t/t1006-cat-file.sh | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 271c5e4fd3..e0c6482797 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1100,6 +1100,42 @@ test_expect_success 'cat-file --batch="batman" with --batch-all-objects will wor
 	cmp expect actual
 '

+test_expect_success 'cat-file %(objectsize:disk) with --batch-all-objects' '
+	# our state has both loose and packed objects,
+	# so find both for our expected output
+	{
+		find .git/objects/?? -type f |
+		awk -F/ "{ print \$0, \$3\$4 }" |
+		while read path oid
+		do
+			size=$(test_file_size "$path") &&
+			echo "$oid $size" ||
+			return 1
+		done &&
+		rawsz=$(test_oid rawsz) &&
+		find .git/objects/pack -name "*.idx" |
+		while read idx
+		do
+			git show-index <"$idx" >idx.raw &&
+			sort -nr <idx.raw >idx.sorted &&
+			packsz=$(test_file_size "${idx%.idx}.pack") &&
+			end=$((packsz - rawsz)) &&
+			while read start oid rest
+			do
+				size=$((end - start)) &&
+				end=$start &&
+				echo "$oid $size" ||
+				return 1
+			done <idx.sorted ||
+			return 1
+		done
+	} >expect.raw &&
+	sort <expect.raw >expect &&
+	git cat-file --batch-all-objects \
+		--batch-check="%(objectname) %(objectsize:disk)" >actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'set up replacement object' '
 	orig=$(git rev-parse HEAD) &&
 	git cat-file commit $orig >orig &&
-- 
2.43.0.448.g93112243fb

^ permalink raw reply related

* Re: [PATCH] t1006: add tests for %(objectsize:disk)
From: Jeff King @ 2023-12-23 10:18 UTC (permalink / raw)
  To: René Scharfe
  Cc: git, Ondrej Pohorelsky, brian m . carlson, Junio C Hamano
In-Reply-To: <120b3194-5eee-47ed-b2d8-bc6731b71a6b@web.de>

On Fri, Dec 22, 2023 at 12:13:10AM +0100, René Scharfe wrote:

> >> Should we deduplicate here, like cat-file does (i.e. use "sort -u")?
> >> Having the same object in multiple places for whatever reason would not
> >> be a cause for reporting an error in this test, I would think.
> >
> > No, for the reasons I said in the commit message: if an object exists in
> > multiple places the test is already potentially invalid, as Git does not
> > promise which version it will use. So it might work racily, or it might
> > work for now but be fragile. By not de-duplicating, we make sure the
> > test's assumption holds.
> 
> Oh, skipped that paragraph.  Still I don't see how a duplicate object
> would necessarily invalidate t1006.  The comment for the test "cat-file
> --batch-all-objects shows all objects" a few lines above indicates that
> it's picky about the provenance of objects, but it uses a separate
> repository.  I can't infer the same requirement for the root repo, but
> we already established that I can't read.

The cat-file documentation explicitly calls this situation out:

  Note also that multiple copies of an object may be present in the
  object database; in this case, it is undefined which copy’s size or
  delta base will be reported.

So if t1006 were to grow such a duplicate object, what will happen? If
we de-dup in the new test, then we might end up mentioning the same copy
(and the test passes), or we might not (and the test fails). But much
worse, the results might be racy (depending on how cat-file happens to
decide which one to use). By no de-duping, then the test will reliably
fail and the author can decide how to handle it then.

IOW it is about failing immediately and predictably rather than letting
a future change to sneak a race or other accident-waiting-to-happen into
t1006.

> Anyway, if someone finds a use for git repack without -d or
> git unpack-objects or whatever else causes duplicates in the root
> repository of t1006 then they can try to reverse your ban with concrete
> arguments.

In the real world, the most common way to get a duplicate is to fetch or
push into a repository, such that:

  1. There are enough objects to retain the pack (100 by default)

  2. There's a thin delta in the on-the-wire pack (i.e., a delta against
     a base that the sender knows the receiver has, but which isn't
     itself sent).

Then "index-pack --fix-thin" will complete the on-disk pack by storing a
copy of the base object in it. And now we have it in two packs (and if
it's a delta or loose in the original, the size will be different).

-Peff

^ permalink raw reply

* RE: [PATCH/RFC] sparse-checkout: take care of "--end-of-options" in set/add
From: rsbecker @ 2023-12-23 15:38 UTC (permalink / raw)
  To: 'Jeff King', 'Junio C Hamano'
  Cc: 'Josh Steadmon', git
In-Reply-To: <20231223100229.GA2016274@coredump.intra.peff.net>

On Saturday, December 23, 2023 5:02 AM, Peff wrote:
>On Thu, Dec 21, 2023 at 02:04:56PM -0800, Junio C Hamano wrote:
>> Jeff King <peff@peff.net> writes:
>> > Right, that is the "gotcha" I mentioned in my other email. Though
>> > that is the way it has behaved historically, my argument is that
>> > users are unreasonable to expect it to work:
>> >
>> >   1. It is not consistent with the rest of Git commands.
>> >
>> >   2. It is inconsistent with respect to existing options (and is an
>> >      accident waiting to happen when new options are added).
>> >
>> > So I'd consider it a bug-fix.
>>
>> So the counter-proposal here is just to drop KEEP_UNKNOWN_OPT and
>> deliberately break them^W^W^Wrealign their expectations?
>
>Yes. :) But keep in mind we are un-breaking other people, like those who
>typo:
>
>  git sparse-checkout --sikp-checks

I don't see why 

git sparse-checkout -- --sikp-checks

would be the only way to get that typo into a garbage entry, to be consistent with other commands. Without the --, --sikp-checks should result in an error.

>and don't see an error (instead, we make a garbage entry in the sparse checkout
>file).
>
>> I do not have much stake in sparse-checkout, so I am fine with that
>> direction.  But I suspect other folks on the list would have users of
>> their own who would rather loudly complain to them if we broke them
>> ;-)
>
>Likewise, I have never used sparse-checkout myself, and I don't care _that_ much.
>My interest is mostly in having various Git commands behave consistently. This
>whole discussion started because the centralized --end-of-options fix interacted
>badly with this unusual behavior.

I use sparse-checkout every day and do depend on it working predictably (although I would take a fix to see it work as above, with the -- path separator), so personally, I care about this a whole lot -- in terms of consistency.


^ permalink raw reply

* [PATCH v2] sideband.c: replace int with size_t for clarity
From: Chandra Pratap via GitGitGadget @ 2023-12-23 17:03 UTC (permalink / raw)
  To: git; +Cc: Torsten Bögershausen, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1625.git.1703264469238.gitgitgadget@gmail.com>

From: Chandra Pratap <chandrapratap3519@gmail.com>

Replace int with size_t for non-negative values to improve
clarity and remove the 'NEEDSWORK' tag associated with it.wq

Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
---
    sideband.c: replace int with size_t for clarity

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1625%2FChand-ra%2Fdusra-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1625/Chand-ra/dusra-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/1625

Range-diff vs v1:

 1:  be67e1bca38 ! 1:  fdacc69ae3b sideband.c: replace int with size_t for clarity
     @@ Metadata
       ## Commit message ##
          sideband.c: replace int with size_t for clarity
      
     -    Replace int with size_t for clarity and remove the
     -    'NEEDSWORK' tag associated with it.
     +    Replace int with size_t for non-negative values to improve
     +    clarity and remove the 'NEEDSWORK' tag associated with it.wq
      
          Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
      
     @@ sideband.c: void list_config_color_sideband_slots(struct string_list *list, cons
       {
       	int i;
       
     +@@ sideband.c: static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
     + 
     + 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
     + 		struct keyword_entry *p = keywords + i;
     +-		int len = strlen(p->keyword);
     ++		size_t len = strlen(p->keyword);
     + 
     + 		if (n < len)
     + 			continue;


 sideband.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/sideband.c b/sideband.c
index 6cbfd391c47..7c3ec0ece29 100644
--- a/sideband.c
+++ b/sideband.c
@@ -69,9 +69,8 @@ void list_config_color_sideband_slots(struct string_list *list, const char *pref
  * of the line. This should be called for a single line only, which is
  * passed as the first N characters of the SRC array.
  *
- * NEEDSWORK: use "size_t n" instead for clarity.
  */
-static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
+static void maybe_colorize_sideband(struct strbuf *dest, const char *src, size_t n)
 {
 	int i;
 
@@ -88,7 +87,7 @@ static void maybe_colorize_sideband(struct strbuf *dest, const char *src, int n)
 
 	for (i = 0; i < ARRAY_SIZE(keywords); i++) {
 		struct keyword_entry *p = keywords + i;
-		int len = strlen(p->keyword);
+		size_t len = strlen(p->keyword);
 
 		if (n < len)
 			continue;

base-commit: 1a87c842ece327d03d08096395969aca5e0a6996
-- 
gitgitgadget

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox