Git development
 help / color / mirror / Atom feed
* Re: [PATCH v2] test-lib-functions.sh: fix test_grep fail message wording
From: Kousik Sanagavarapu @ 2023-12-04 18:43 UTC (permalink / raw)
  To: Shreyansh Paliwal; +Cc: git, Junio C Hamano
In-Reply-To: <20231203171956.771-1-shreyanshpaliwalcmsmn@gmail.com>

On Sun, Dec 03, 2023 at 10:47:59PM +0530, Shreyansh Paliwal wrote:
> From: shreyp135 <shreyanshpaliwalcmsmn@gmail.com>
> 
> In the recent commit
> 2e87fca189 (test framework: further deprecate test_i18ngrep, 2023-10-31),
> the test_i18ngrep() function was deprecated.

s/In the/In a

is gramatically correct, but probably not worth a reroll.

> So if a test employing this function fails,
> the error messages may be confusing due to wording issues.

Isn't the confusion due to test_i18ngrep being displayed in place of
test_grep and not the other way around? Because the formation of the
sentence makes it look like the latter.

> It's important to address these wording changes to ensure smooth transitions
> for developers adapting to the deprecation of test_i18ngrep,
> and to maintain the effectiveness of the testing process.
> 
> Signed-off-by: Shreyansh Paliwal <Shreyanshpaliwalcmsmn@gmail.com>
> ---
>  t/test-lib-functions.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 9c3cf12b26..8737c95e0c 100644
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1277,7 +1277,7 @@ test_grep () {
>  	if test $# -lt 2 ||
>  	   { test "x!" = "x$1" && test $# -lt 3 ; }
>  	then
> -		BUG "too few parameters to test_i18ngrep"
> +		BUG "too few parameters to test_grep"
>  	fi
>  
>  	if test "x!" = "x$1"
> -- 
> 2.43.0.1

Rest looks good.

Have a great time at the vacation Junio (and sorry for pinging in the
first place... although this email will indirectly ping too :P).

Thanks

^ permalink raw reply

* [Outreachy][PATCH v3] t2400: avoid using pipes
From: Achu Luma @ 2023-12-04 15:37 UTC (permalink / raw)
  To: christian.couder; +Cc: ach.lumap, git
In-Reply-To: <CAP8UFD0KDdwoJw6AzLUpqos=bLumcmDax59_MfQ9TUFqmmpcoA@mail.gmail.com>

The exit code of the preceding command in a pipe is disregarded,
so it's advisable to refrain from relying on it. Instead, by
saving the output of a Git command to a file, we gain the
ability to examine the exit codes of both commands separately.

Signed-off-by: Achu Luma <ach.lumap@gmail.com>
---
 Since v2 I don't send a cover  letter anymore, and I changed 
 my "Signed-of-by: ..." line so that it
 contains my full real name and I added "Outreachy" to the subject.

 t/t2400-worktree-add.sh | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/t/t2400-worktree-add.sh b/t/t2400-worktree-add.sh
index df4aff7825..7ead05bb98 100755
--- a/t/t2400-worktree-add.sh
+++ b/t/t2400-worktree-add.sh
@@ -468,7 +468,8 @@ test_expect_success 'put a worktree under rebase' '
 		cd under-rebase &&
 		set_fake_editor &&
 		FAKE_LINES="edit 1" git rebase -i HEAD^ &&
-		git worktree list | grep "under-rebase.*detached HEAD"
+		git worktree list >actual && 
+		grep "under-rebase.*detached HEAD" actual
 	)
 '
 
@@ -509,7 +510,8 @@ test_expect_success 'checkout a branch under bisect' '
 		git bisect start &&
 		git bisect bad &&
 		git bisect good HEAD~2 &&
-		git worktree list | grep "under-bisect.*detached HEAD" &&
+		git worktree list >actual && 
+		grep "under-bisect.*detached HEAD" actual &&
 		test_must_fail git worktree add new-bisect under-bisect &&
 		! test -d new-bisect
 	)
-- 
2.41.0.windows.1


^ permalink raw reply related

* [PATCH v2] This PR enables a successful git build on z/OS.
From: Haritha  via GitGitGadget @ 2023-12-04 14:19 UTC (permalink / raw)
  To: git; +Cc: brian m. carlson, Haritha, Haritha D
In-Reply-To: <pull.1537.git.git.1699871056.gitgitgadget@gmail.com>

From: Haritha D <harithamma.d@ibm.com>

Rename functions like "release" and "fetch"
due to conflict in z/OS standard C libraries.
Also disables autoconversion facility on z/OS
and relies on iconv.
New files created in binary format are also
tagged as binary.

Signed-off-by: Haritha D <harithamma.d@ibm.com>
---
    Enabling z/OS workflow for git
    
    z/OS is an IBM mainframe operating system, also known as OS/390. Our
    team has been actively involved in porting Git to z/OS and we have made
    significant modifications to facilitate this process. The patch below is
    the initial configuration for z/OS. I also have few follow up changes
    and I will send that after these changes are approved. Please let me
    know if there are any concerns.

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1537%2FHarithaIBM%2Fenablezos-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1537/HarithaIBM/enablezos-v2
Pull-Request: https://github.com/git/git/pull/1537

Range-diff vs v1:

  1:  712eb3712f1 <  -:  ----------- Enabling z/OS workflow for git
  2:  098b9ca8ece !  1:  b9882d90c5d Enable builds for z/OS.
     @@ Metadata
      Author: Haritha D <harithamma.d@ibm.com>
      
       ## Commit message ##
     -    Enable builds for z/OS.
     +    This PR enables a successful git build on z/OS.
      
     -    This commit enables git to build on z/OS.
     -    It takes advantage of enahanced ASCII
     -    services on z/OS to auto-convert input
     -    files to ASCII
     -    It also adds support for
     -    [platform]-working-tree-encoding.
     -    Platform is substituted with uname_info.sysname,
     -    so it will only apply to the given platform.
     -    Also adds support for scripts that are not in
     -    standard locations so that /bin/env bash
     -    can be specified.
     +    Rename functions like "release" and "fetch"
     +    due to conflict in z/OS standard C libraries.
     +    Also disables autoconversion facility on z/OS
     +    and relies on iconv.
     +    New files created in binary format are also
     +    tagged as binary.
      
     -    Signed-off-by: Harithamma D <harithamma.d@ibm.com>
     -
     - ## Makefile ##
     -@@ Makefile: include shared.mak
     - #
     - # Define SHELL_PATH to a POSIX shell if your /bin/sh is broken.
     - #
     -+# Define SHELL_PATH_FOR_SCRIPTS to a POSIX shell if your /bin/sh is broken.
     -+#
     - # Define SANE_TOOL_PATH to a colon-separated list of paths to prepend
     - # to PATH if your tools in /usr/bin are broken.
     - #
     -@@ Makefile: include shared.mak
     - #
     - # Define PERL_PATH to the path of your Perl binary (usually /usr/bin/perl).
     - #
     -+# Define PERL_PATH_FOR_SCRIPTS to a Perl binary if your /usr/bin/perl is broken.
     -+#
     - # Define NO_PERL if you do not want Perl scripts or libraries at all.
     - #
     - # Define NO_PERL_CPAN_FALLBACKS if you do not want to install bundled
     -@@ Makefile: BINDIR_PROGRAMS_NO_X += git-cvsserver
     - ifndef SHELL_PATH
     - 	SHELL_PATH = /bin/sh
     - endif
     -+ifndef SHELL_PATH_FOR_SCRIPTS
     -+	SHELL_PATH_FOR_SCRIPTS = /bin/sh
     -+endif
     - ifndef PERL_PATH
     - 	PERL_PATH = /usr/bin/perl
     - endif
     -+ifndef PERL_PATH_FOR_SCRIPTS
     -+	PERL_PATH_FOR_SCRIPTS = /usr/bin/perl
     -+endif
     - ifndef PYTHON_PATH
     - 	PYTHON_PATH = /usr/bin/python
     - endif
     -@@ Makefile: THIRD_PARTY_SOURCES += sha1dc/%
     - 
     - # xdiff and reftable libs may in turn depend on what is in libgit.a
     - GITLIBS = common-main.o $(LIB_FILE) $(XDIFF_LIB) $(REFTABLE_LIB) $(LIB_FILE)
     --EXTLIBS =
     -+EXTLIBS = $(ZOPEN_EXTRA_LIBS)
     - 
     - GIT_USER_AGENT = git/$(GIT_VERSION)
     - 
     -@@ Makefile: perllibdir_relative_SQ = $(subst ','\'',$(perllibdir_relative))
     - gitwebdir_SQ = $(subst ','\'',$(gitwebdir))
     - gitwebstaticdir_SQ = $(subst ','\'',$(gitwebstaticdir))
     - 
     --SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
     -+SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH_FOR_SCRIPTS))
     - TEST_SHELL_PATH_SQ = $(subst ','\'',$(TEST_SHELL_PATH))
     - PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))
     -+PERL_PATH_FOR_SCRIPTS_SQ = $(subst ','\'',$(PERL_PATH_FOR_SCRIPTS))
     - PYTHON_PATH_SQ = $(subst ','\'',$(PYTHON_PATH))
     - TCLTK_PATH_SQ = $(subst ','\'',$(TCLTK_PATH))
     - DIFF_SQ = $(subst ','\'',$(DIFF))
     -@@ Makefile: hook-list.h: generate-hooklist.sh Documentation/githooks.txt
     - 
     - SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):\
     - 	$(localedir_SQ):$(USE_GETTEXT_SCHEME):$(SANE_TOOL_PATH_SQ):\
     --	$(gitwebdir_SQ):$(PERL_PATH_SQ):$(PAGER_ENV):\
     -+	$(gitwebdir_SQ):$(PERL_PATH_FOR_SCRIPTS_SQ):$(PAGER_ENV):\
     - 	$(perllibdir_SQ)
     - GIT-SCRIPT-DEFINES: FORCE
     - 	@FLAGS='$(SCRIPT_DEFINES)'; \
     -@@ Makefile: sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
     -     -e 's/@@USE_GETTEXT_SCHEME@@/$(USE_GETTEXT_SCHEME)/g' \
     -     -e $(BROKEN_PATH_FIX) \
     -     -e 's|@@GITWEBDIR@@|$(gitwebdir_SQ)|g' \
     --    -e 's|@@PERL@@|$(PERL_PATH_SQ)|g' \
     -+    -e 's|@@PERL@@|$(PERL_PATH_FOR_SCRIPTS_SQ)|g' \
     -     -e 's|@@PAGER_ENV@@|$(PAGER_ENV_SQ)|g' \
     -     $@.sh >$@+
     - endef
     -@@ Makefile: PERL_DEFINES += $(gitexecdir) $(perllibdir) $(localedir)
     - $(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES GIT-PERL-HEADER GIT-VERSION-FILE
     - 	$(QUIET_GEN) \
     - 	sed -e '1{' \
     --	    -e '	s|#!.*perl|#!$(PERL_PATH_SQ)|' \
     -+	    -e '	s|#!.*perl|#!$(PERL_PATH_FOR_SCRIPTS_SQ)|' \
     - 	    -e '	r GIT-PERL-HEADER' \
     - 	    -e '	G' \
     - 	    -e '}' \
     -
     - ## builtin.h ##
     -@@ builtin.h: int cmd_verify_pack(int argc, const char **argv, const char *prefix);
     - int cmd_show_ref(int argc, const char **argv, const char *prefix);
     - int cmd_pack_refs(int argc, const char **argv, const char *prefix);
     - int cmd_replace(int argc, const char **argv, const char *prefix);
     -+#ifdef __MVS__
     -+  extern int setbinaryfd(int);
     -+#endif
     - 
     - #endif
     +    Signed-off-by: Haritha D <harithamma.d@ibm.com>
      
       ## builtin/archive.c ##
      @@
     @@ builtin/archive.c
       {
       	int output_fd = xopen(output_file, O_CREAT | O_WRONLY | O_TRUNC, 0666);
      +#ifdef __MVS__
     -+ #if (__CHARSET_LIB == 1)
     -+	if (setbinaryfd(output_fd))
     ++	/*
     ++	 * Since the data is in binary format,
     ++	 * we need to set the z/OS file tag
     ++	 * to binary to disable autoconversion
     ++	 */
     ++	if (__setfdbinary(output_fd))
      +		die_errno(_("could not tag archive file '%s'"), output_file);
     -+ #endif
      +#endif
       	if (output_fd != 1) {
       		if (dup2(output_fd, 1) < 0)
       			die_errno(_("could not redirect output"));
      
       ## builtin/hash-object.c ##
     -@@ builtin/hash-object.c: static void hash_fd(int fd, const char *type, const char *path, unsigned flags,
     - 	maybe_flush_or_die(stdout, "hash to stdout");
     - }
     - 
     -+#ifdef __MVS__
     -+#  if (__CHARSET_LIB == 1)
     -+#  include <stdio.h>
     -+#  include <stdlib.h>
     -+
     -+   int setbinaryfd(int fd)
     -+   {
     -+     attrib_t attr;
     -+     int rc;
     -+
     -+     memset(&attr, 0, sizeof(attr));
     -+     attr.att_filetagchg = 1;
     -+     attr.att_filetag.ft_ccsid = FT_BINARY;
     -+     attr.att_filetag.ft_txtflag = 0;
     -+
     -+     rc = __fchattr(fd, &attr, sizeof(attr));
     -+     return rc;
     -+   }
     -+#  endif
     -+#endif
     -+
     -+
     - static void hash_object(const char *path, const char *type, const char *vpath,
     - 			unsigned flags, int literally)
     +@@ builtin/hash-object.c: static void hash_object(const char *path, const char *type, const char *vpath,
       {
       	int fd;
       	fd = xopen(path, O_RDONLY);
      +#ifdef __MVS__
     -+#  if (__CHARSET_LIB == 1)
     -+  if (setbinaryfd(fd))
     ++	/*
     ++	 * Since the data being read is in binary format,
     ++	 * we need to disable autoconversion for z/OS
     ++	 */
     ++	if (__setfdbinary(fd))
      +		die_errno("Cannot set to binary '%s'", path);
     -+#  endif
      +#endif
       	hash_fd(fd, type, vpath, flags, literally);
       }
     @@ combine-diff.c: static void show_patch_diff(struct combine_diff_path *elem, int
       			ssize_t done;
       			int is_file, i;
       
     -+#ifdef __MVS__
     -+      __disableautocvt(fd);
     -+#endif
     ++		#ifdef __MVS__
     ++			/*
     ++			 * Disable implicit autconversion on z/os,
     ++			 * rely on conversion from iconv
     ++			 */
     ++			__disableautocvt(fd);
     ++		#endif
      +
       			elem->mode = canon_mode(st.st_mode);
       			/* if symlinks don't work, assume symlink if all parents
       			 * are symlinks
      
     - ## config.c ##
     -@@ config.c: static int git_default_core_config(const char *var, const char *value,
     - 		return 0;
     - 	}
     - 
     -+	#ifdef __MVS__
     -+	if (!strcmp(var, "core.ignorefiletags")) {
     -+		ignore_file_tags = git_config_bool(var, value);
     -+		return 0;
     -+	}
     -+	#endif
     -+
     - 	if (!strcmp(var, "core.safecrlf")) {
     - 		int eol_rndtrp_die;
     - 		if (value && !strcasecmp(value, "warn")) {
     -
     - ## configure.ac ##
     -@@ configure.ac: else
     -             CC_LD_DYNPATH=-Wl,+b,
     -           else
     -              CC_LD_DYNPATH=
     -+             if test "$(uname -s)" = "OS/390"; then
     -+                CC_LD_DYNPATH=-L
     -+             fi
     -              AC_MSG_WARN([linker does not support runtime path to dynamic libraries])
     -           fi
     -       fi
     -
     - ## convert.c ##
     -@@ convert.c: static int check_roundtrip(const char *enc_name)
     - static const char *default_encoding = "UTF-8";
     - 
     - static int encode_to_git(const char *path, const char *src, size_t src_len,
     --			 struct strbuf *buf, const char *enc, int conv_flags)
     -+			 struct strbuf *buf, const char *enc, enum convert_crlf_action attr_action, int conv_flags)
     - {
     - 	char *dst;
     - 	size_t dst_len;
     - 	int die_on_error = conv_flags & CONV_WRITE_OBJECT;
     - 
     -+  if (attr_action == CRLF_BINARY) {
     -+    return 0;
     -+  }
     - 	/*
     - 	 * No encoding is specified or there is nothing to encode.
     - 	 * Tell the caller that the content was not modified.
     -@@ convert.c: static int encode_to_git(const char *path, const char *src, size_t src_len,
     - 		return 0;
     - 
     - 	trace_encoding("source", path, enc, src, src_len);
     -+#ifdef __MVS__
     -+  // Don't convert ISO8859-1 on z/OS
     -+  if (strcasecmp("ISO8859-1", enc) == 0)
     -+    return 0;
     -+#endif
     - 	dst = reencode_string_len(src, src_len, default_encoding, enc,
     - 				  &dst_len);
     - 	if (!dst) {
     -@@ convert.c: static int encode_to_git(const char *path, const char *src, size_t src_len,
     - }
     - 
     - static int encode_to_worktree(const char *path, const char *src, size_t src_len,
     --			      struct strbuf *buf, const char *enc)
     -+			      struct strbuf *buf, enum convert_crlf_action attr_action, const char *enc)
     - {
     - 	char *dst;
     - 	size_t dst_len;
     - 
     -+  if (attr_action == CRLF_BINARY) {
     -+    return 0;
     -+  }
     - 	/*
     - 	 * No encoding is specified or there is nothing to encode.
     - 	 * Tell the caller that the content was not modified.
     -@@ convert.c: static int git_path_check_ident(struct attr_check_item *check)
     - 
     - static struct attr_check *check;
     - 
     -+static const char* get_platform() {
     -+	struct utsname uname_info;
     -+
     -+	if (uname(&uname_info))
     -+		die(_("uname() failed with error '%s' (%d)\n"),
     -+			    strerror(errno),
     -+			    errno);
     -+
     -+  if (!strcmp(uname_info.sysname, "OS/390"))
     -+    return "zos";
     -+  return uname_info.sysname;
     -+}
     -+
     -+
     - void convert_attrs(struct index_state *istate,
     - 		   struct conv_attrs *ca, const char *path)
     - {
     - 	struct attr_check_item *ccheck = NULL;
     -+  struct strbuf platform_working_tree_encoding = STRBUF_INIT;
     -+
     -+	strbuf_addf(&platform_working_tree_encoding, "%s-working-tree-encoding", get_platform());
     -+
     - 
     - 	if (!check) {
     - 		check = attr_check_initl("crlf", "ident", "filter",
     --					 "eol", "text", "working-tree-encoding",
     -+					 "eol", "text", "working-tree-encoding", platform_working_tree_encoding.buf,
     - 					 NULL);
     - 		user_convert_tail = &user_convert;
     - 		git_config(read_convert_config, NULL);
     - 	}
     -+	strbuf_release(&platform_working_tree_encoding);
     - 
     - 	git_check_attr(istate, path, check);
     - 	ccheck = check->items;
     -@@ convert.c: void convert_attrs(struct index_state *istate,
     - 			ca->crlf_action = CRLF_TEXT_CRLF;
     - 	}
     - 	ca->working_tree_encoding = git_path_check_encoding(ccheck + 5);
     -+  if (git_path_check_encoding(ccheck + 6))
     -+    ca->working_tree_encoding = git_path_check_encoding(ccheck + 6);
     - 
     - 	/* Save attr and make a decision for action */
     - 	ca->attr_action = ca->crlf_action;
     -@@ convert.c: int convert_to_git(struct index_state *istate,
     - 		len = dst->len;
     - 	}
     - 
     --	ret |= encode_to_git(path, src, len, dst, ca.working_tree_encoding, conv_flags);
     -+	ret |= encode_to_git(path, src, len, dst, ca.working_tree_encoding, ca.attr_action, conv_flags);
     - 	if (ret && dst) {
     - 		src = dst->buf;
     - 		len = dst->len;
     -@@ convert.c: void convert_to_git_filter_fd(struct index_state *istate,
     - 	if (!apply_filter(path, NULL, 0, fd, dst, ca.drv, CAP_CLEAN, NULL, NULL))
     - 		die(_("%s: clean filter '%s' failed"), path, ca.drv->name);
     - 
     --	encode_to_git(path, dst->buf, dst->len, dst, ca.working_tree_encoding, conv_flags);
     -+	encode_to_git(path, dst->buf, dst->len, dst, ca.working_tree_encoding, ca.attr_action, conv_flags);
     - 	crlf_to_git(istate, path, dst->buf, dst->len, dst, ca.crlf_action, conv_flags);
     - 	ident_to_git(dst->buf, dst->len, dst, ca.ident);
     - }
     -@@ convert.c: static int convert_to_working_tree_ca_internal(const struct conv_attrs *ca,
     - 		}
     - 	}
     - 
     --	ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding);
     -+	ret |= encode_to_worktree(path, src, len, dst, ca->attr_action, ca->working_tree_encoding);
     - 	if (ret) {
     - 		src = dst->buf;
     - 		len = dst->len;
     -
       ## copy.c ##
      @@ copy.c: int copy_fd(int ifd, int ofd)
       		if (write_in_full(ofd, buffer, len) < 0)
       			return COPY_WRITE_ERROR;
       	}
     -+#ifdef __MVS__
     -+  __copyfdccsid(ifd, ofd);
     -+#endif
     ++	#ifdef __MVS__
     ++		/*
     ++		 * This is to ensure that file tags are copied
     ++		 * from source to destination
     ++		 */
     ++		__copyfdccsid(ifd, ofd);
     ++	#endif
       	return 0;
       }
       
      
     - ## diff.c ##
     -@@ diff.c: int diff_populate_filespec(struct repository *r,
     - 	int check_binary = options ? options->check_binary : 0;
     - 	int err = 0;
     - 	int conv_flags = global_conv_flags_eol;
     -+#ifdef __MVS__
     -+	int autocvtToASCII;
     -+#endif
     - 	/*
     - 	 * demote FAIL to WARN to allow inspecting the situation
     - 	 * instead of refusing.
     -@@ diff.c: int diff_populate_filespec(struct repository *r,
     - 			s->is_binary = 1;
     - 			return 0;
     - 		}
     -+#ifdef __MVS__
     -+    validate_codeset(r->index, s->path, &autocvtToASCII);
     -+#endif
     - 		fd = open(s->path, O_RDONLY);
     - 		if (fd < 0)
     - 			goto err_empty;
     -+
     -+#ifdef __MVS__
     -+    if (!autocvtToASCII)
     -+      __disableautocvt(fd);
     -+#endif
     - 		s->data = xmmap(NULL, s->size, PROT_READ, MAP_PRIVATE, fd, 0);
     - 		close(fd);
     - 		s->should_munmap = 1;
     -
     - ## entry.c ##
     -@@ entry.c: int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st)
     - 	return 0;
     + ## fetch-negotiator.h ##
     +@@ fetch-negotiator.h: struct repository;
     +  * Then, when "have" lines are required, call next(). Call ack() to report what
     +  * the server tells us.
     +  *
     +- * Once negotiation is done, call release(). The negotiator then cannot be used
     ++ * Once negotiation is done, call release_negotiator(). The negotiator then cannot be used
     +  * (unless reinitialized with fetch_negotiator_init()).
     +  */
     + struct fetch_negotiator {
     +@@ fetch-negotiator.h: struct fetch_negotiator {
     + 	 */
     + 	int (*ack)(struct fetch_negotiator *, struct commit *);
     + 
     +-	void (*release)(struct fetch_negotiator *);
     ++	void (*release_negotiator)(struct fetch_negotiator *);
     + 
     + 	/* internal use */
     + 	void *data;
     +
     + ## fetch-pack.c ##
     +@@ fetch-pack.c: static struct ref *do_fetch_pack(struct fetch_pack_args *args,
     + 
     +  all_done:
     + 	if (negotiator)
     +-		negotiator->release(negotiator);
     ++		negotiator->release_negotiator(negotiator);
     + 	return ref;
       }
       
     -+#ifdef __MVS__
     -+void tag_file_as_working_tree_encoding(struct index_state *istate, char* path, int fd) {
     -+	struct conv_attrs ca;
     -+	convert_attrs(istate, &ca, path);
     -+  if (ca.attr_action != CRLF_BINARY) {
     -+    if (ca.working_tree_encoding)
     -+      __chgfdcodeset(fd, ca.working_tree_encoding);
     -+    else
     -+      __setfdtext(fd);
     -+  }
     -+  else {
     -+    __setfdbinary(fd);
     -+  }
     -+
     -+  __disableautocvt(fd);
     -+}
     -+#endif
     -+
     - static int streaming_write_entry(const struct cache_entry *ce, char *path,
     - 				 struct stream_filter *filter,
     - 				 const struct checkout *state, int to_tempfile,
     -@@ entry.c: static int streaming_write_entry(const struct cache_entry *ce, char *path,
     - 	if (fd < 0)
     - 		return -1;
     +@@ fetch-pack.c: static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
     + 		die("fsck failed");
       
     -+#ifdef __MVS__
     -+  tag_file_as_working_tree_encoding(state->istate, path, fd);
     -+#endif
     -+
     - 	result |= stream_blob_to_fd(fd, &ce->oid, filter, 1);
     - 	*fstat_done = fstat_checkout_output(fd, state, statbuf);
     - 	result |= close(fd);
     -@@ entry.c: static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca
     - 			return error_errno("unable to create file %s", path);
     - 		}
     + 	if (negotiator)
     +-		negotiator->release(negotiator);
     ++		negotiator->release_negotiator(negotiator);
       
     -+#ifdef __MVS__
     -+    tag_file_as_working_tree_encoding(state->istate, path, fd);
     -+#endif
     -+
     - 		wrote = write_in_full(fd, new_blob, size);
     - 		if (!to_tempfile)
     - 			fstat_done = fstat_checkout_output(fd, state, &st);
     -
     - ## environment.c ##
     -@@ environment.c: const char *git_hooks_path;
     - int zlib_compression_level = Z_BEST_SPEED;
     - int pack_compression_level = Z_DEFAULT_COMPRESSION;
     - int fsync_object_files = -1;
     -+#ifdef __MVS__
     -+int ignore_file_tags = 0;
     -+#endif
     - int use_fsync = -1;
     - enum fsync_method fsync_method = FSYNC_METHOD_DEFAULT;
     - enum fsync_component fsync_components = FSYNC_COMPONENTS_DEFAULT;
     + 	oidset_clear(&common);
     + 	return ref;
      
       ## git-compat-util.h ##
      @@ git-compat-util.h: struct strbuf;
     @@ git-compat-util.h: struct strbuf;
       #include <fcntl.h>
       #include <stddef.h>
      +#ifdef __MVS__
     -+#define release stdlib_release
     -+#define fetch stdlib_fetch
     ++	#define release stdlib_release
     ++	#define fetch stdlib_fetch
      +#endif
       #include <stdlib.h>
      +#ifdef __MVS__
     -+#undef fetch
     -+#undef release
     ++	#undef fetch
     ++	#undef release
      +#endif
       #include <stdarg.h>
       #include <string.h>
     @@ negotiator/noop.c: static int ack(struct fetch_negotiator *n UNUSED, struct comm
       }
       
      -static void release(struct fetch_negotiator *n UNUSED)
     -+static void release_negotiator(struct fetch_negotiator *n UNUSED)
     ++static void release_negotiator (struct fetch_negotiator *n UNUSED)
       {
       	/* nothing to release */
       }
     @@ negotiator/skipping.c: void skipping_negotiator_init(struct fetch_negotiator *ne
       	data->rev_list.compare = compare;
       
      
     - ## object-file.c ##
     -@@
     - #include "setup.h"
     - #include "submodule.h"
     - #include "fsck.h"
     --
     -+#ifdef __MVS__
     -+#include <_Ccsid.h>
     -+#endif
     - /* The maximum size for an object header. */
     - #define MAX_HEADER_LEN 32
     - 
     -@@ object-file.c: int index_fd(struct index_state *istate, struct object_id *oid,
     - 	return ret;
     - }
     - 
     -+#ifdef __MVS__
     -+void validate_codeset(struct index_state *istate, const char *path, int* autoconvertToASCII) {
     -+       struct conv_attrs ca;
     -+  struct stat st;
     -+  unsigned short attr_ccsid;
     -+  unsigned short file_ccsid;
     -+
     -+  if (ignore_file_tags)
     -+   return;
     -+
     -+  *autoconvertToASCII = 0;
     -+       convert_attrs(istate, &ca, path);
     -+  if (ca.attr_action == CRLF_BINARY) {
     -+    attr_ccsid = FT_BINARY;
     -+  }
     -+  else if (ca.working_tree_encoding) {
     -+    attr_ccsid = __toCcsid(ca.working_tree_encoding);
     -+  }
     -+  else
     -+    attr_ccsid = 819;
     -+
     -+  if (stat(path, &st) < 0)
     -+    return;
     -+
     -+  file_ccsid = st.st_tag.ft_ccsid;
     -+
     -+  if (file_ccsid == FT_UNTAGGED) {
     -+    die("File %s is untagged, set the correct file tag (using the chtag command).", path);
     -+  }
     -+
     -+  if (attr_ccsid != file_ccsid) {
     -+    if (file_ccsid == 1047 && attr_ccsid == 819) {
     -+      *autoconvertToASCII = 1;
     -+      return;
     -+    }
     -+    // Allow tag mixing of 819 and 1208
     -+    if ((file_ccsid == 819 || file_ccsid == 1208) && (attr_ccsid == 1208 || attr_ccsid == 819)) {
     -+      return;
     -+    }
     -+    // Don't check for binary files, just add them
     -+    if (attr_ccsid == FT_BINARY)
     -+      return;
     -+
     -+    char attr_csname[_XOPEN_PATH_MAX] = {0};
     -+    char file_csname[_XOPEN_PATH_MAX] = {0};
     -+    if (attr_ccsid != FT_BINARY) {
     -+      __toCSName(attr_ccsid, attr_csname);
     -+    } else {
     -+      snprintf(attr_csname, _XOPEN_PATH_MAX, "%s", "binary");
     -+    }
     -+    if (file_ccsid != FT_BINARY) {
     -+      __toCSName(file_ccsid, file_csname);
     -+    } else {
     -+      snprintf(file_csname, _XOPEN_PATH_MAX, "%s", "binary");
     -+    }
     -+    die("%s added file: file tag (%s) does not match working-tree-encoding (%s)", path, file_csname, attr_csname);
     -+  }
     -+}
     -+#endif
     -+
     -+
     -+
     - int index_path(struct index_state *istate, struct object_id *oid,
     - 	       const char *path, struct stat *st, unsigned flags)
     - {
     -@@ object-file.c: int index_path(struct index_state *istate, struct object_id *oid,
     - 	struct strbuf sb = STRBUF_INIT;
     - 	int rc = 0;
     - 
     -+#ifdef __MVS__
     -+	struct conv_attrs ca;
     -+	int autocvtToASCII;
     -+#endif
     -+
     - 	switch (st->st_mode & S_IFMT) {
     - 	case S_IFREG:
     -+#ifdef __MVS__
     -+    validate_codeset(istate, path, &autocvtToASCII);
     -+#endif
     - 		fd = open(path, O_RDONLY);
     - 		if (fd < 0)
     - 			return error_errno("open(\"%s\")", path);
     -+
     -+#ifdef __MVS__
     -+   if (!autocvtToASCII)
     -+     __disableautocvt(fd);
     -+#endif
     -+
     - 		if (index_fd(istate, oid, fd, st, OBJ_BLOB, path, flags) < 0)
     - 			return error(_("%s: failed to insert into database"),
     - 				     path);
     -
       ## read-cache.c ##
      @@ read-cache.c: static int ce_compare_data(struct index_state *istate,
       	int fd = git_open_cloexec(ce->name, O_RDONLY);
       
       	if (fd >= 0) {
     -+#ifdef __MVS__
     -+    __disableautocvt(fd);
     -+#endif
     ++	#ifdef __MVS__
     ++		/*
     ++		 * Since the data is in binary format,
     ++		 * we need to set the z/OS file tag to
     ++		 * binary to disable autoconversion
     ++		 */
     ++		__disableautocvt(fd);
     ++	#endif
       		struct object_id oid;
       		if (!index_fd(istate, &oid, fd, st, OBJ_BLOB, ce->name, 0))
       			match = !oideq(&oid, &ce->oid);
     -
     - ## utf8.c ##
     -@@ utf8.c: char *reencode_string_len(const char *in, size_t insz,
     - #endif
     - 	}
     - 
     -+#ifdef __MVS__
     -+  //HACK: For backwards compat, ISO8859-1 really means utf-8 in the z/OS world
     -+  if (strcasecmp("ISO8859-1", in_encoding) == 0) {
     -+    in_encoding = "UTF-8";
     -+    out_encoding = "UTF-8";
     -+  }
     -+  if (strcasecmp("ISO8859-1", out_encoding) == 0) {
     -+    in_encoding = "UTF-8";
     -+    out_encoding = "UTF-8";
     -+  }
     -+#endif
     - 	conv = iconv_open(out_encoding, in_encoding);
     - 	if (conv == (iconv_t) -1) {
     - 		in_encoding = fallback_encoding(in_encoding);
  3:  e31be0d764f <  -:  ----------- spaces and errors fix Handled git pipeline errors
  4:  7bace397b4a <  -:  ----------- fixes for build errors Handled git pipeline errorse
  5:  3b6d1f80668 <  -:  ----------- fixes for build errors
  6:  3c9b02e18d2 <  -:  ----------- spaces and errors fix Handled git pipeline errors
  7:  8165196f869 <  -:  ----------- spaces and errors fix Handled git pipeline errors
  8:  9fb74d92e3f <  -:  ----------- platform_name fix Handled git pipeline errors
  9:  8fa15ac45f7 <  -:  ----------- strncpy fix Handled git pipeline errors
 10:  63479fe3696 <  -:  ----------- strncpy fix Handled git pipeline errors
 11:  25271363e57 <  -:  ----------- strncpy fix Handled git pipeline errors
 12:  06658ebad10 <  -:  ----------- Handled git pipeline errors - Memory leak
 13:  804624950ae <  -:  ----------- Handled git pipeline errors - z/OS enable


 builtin/archive.c     | 9 +++++++++
 builtin/hash-object.c | 8 ++++++++
 combine-diff.c        | 8 ++++++++
 copy.c                | 7 +++++++
 fetch-negotiator.h    | 4 ++--
 fetch-pack.c          | 4 ++--
 git-compat-util.h     | 8 ++++++++
 negotiator/default.c  | 4 ++--
 negotiator/noop.c     | 4 ++--
 negotiator/skipping.c | 4 ++--
 read-cache.c          | 8 ++++++++
 11 files changed, 58 insertions(+), 10 deletions(-)

diff --git a/builtin/archive.c b/builtin/archive.c
index 90761fdfee0..3b1b258e383 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -14,6 +14,15 @@
 static void create_output_file(const char *output_file)
 {
 	int output_fd = xopen(output_file, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+#ifdef __MVS__
+	/*
+	 * Since the data is in binary format,
+	 * we need to set the z/OS file tag
+	 * to binary to disable autoconversion
+	 */
+	if (__setfdbinary(output_fd))
+		die_errno(_("could not tag archive file '%s'"), output_file);
+#endif
 	if (output_fd != 1) {
 		if (dup2(output_fd, 1) < 0)
 			die_errno(_("could not redirect output"));
diff --git a/builtin/hash-object.c b/builtin/hash-object.c
index 5ffec99dcea..f43450db02d 100644
--- a/builtin/hash-object.c
+++ b/builtin/hash-object.c
@@ -62,6 +62,14 @@ static void hash_object(const char *path, const char *type, const char *vpath,
 {
 	int fd;
 	fd = xopen(path, O_RDONLY);
+#ifdef __MVS__
+	/*
+	 * Since the data being read is in binary format,
+	 * we need to disable autoconversion for z/OS
+	 */
+	if (__setfdbinary(fd))
+		die_errno("Cannot set to binary '%s'", path);
+#endif
 	hash_fd(fd, type, vpath, flags, literally);
 }
 
diff --git a/combine-diff.c b/combine-diff.c
index f90f4424829..3230b660371 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -1082,6 +1082,14 @@ static void show_patch_diff(struct combine_diff_path *elem, int num_parent,
 			ssize_t done;
 			int is_file, i;
 
+		#ifdef __MVS__
+			/*
+			 * Disable implicit autconversion on z/os,
+			 * rely on conversion from iconv
+			 */
+			__disableautocvt(fd);
+		#endif
+
 			elem->mode = canon_mode(st.st_mode);
 			/* if symlinks don't work, assume symlink if all parents
 			 * are symlinks
diff --git a/copy.c b/copy.c
index 23d84c6c1db..f5b9828b1c9 100644
--- a/copy.c
+++ b/copy.c
@@ -14,6 +14,13 @@ int copy_fd(int ifd, int ofd)
 		if (write_in_full(ofd, buffer, len) < 0)
 			return COPY_WRITE_ERROR;
 	}
+	#ifdef __MVS__
+		/*
+		 * This is to ensure that file tags are copied
+		 * from source to destination
+		 */
+		__copyfdccsid(ifd, ofd);
+	#endif
 	return 0;
 }
 
diff --git a/fetch-negotiator.h b/fetch-negotiator.h
index e348905a1f0..71d44102fdc 100644
--- a/fetch-negotiator.h
+++ b/fetch-negotiator.h
@@ -14,7 +14,7 @@ struct repository;
  * Then, when "have" lines are required, call next(). Call ack() to report what
  * the server tells us.
  *
- * Once negotiation is done, call release(). The negotiator then cannot be used
+ * Once negotiation is done, call release_negotiator(). The negotiator then cannot be used
  * (unless reinitialized with fetch_negotiator_init()).
  */
 struct fetch_negotiator {
@@ -47,7 +47,7 @@ struct fetch_negotiator {
 	 */
 	int (*ack)(struct fetch_negotiator *, struct commit *);
 
-	void (*release)(struct fetch_negotiator *);
+	void (*release_negotiator)(struct fetch_negotiator *);
 
 	/* internal use */
 	void *data;
diff --git a/fetch-pack.c b/fetch-pack.c
index 26999e3b659..c1f2e714f8e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -1232,7 +1232,7 @@ static struct ref *do_fetch_pack(struct fetch_pack_args *args,
 
  all_done:
 	if (negotiator)
-		negotiator->release(negotiator);
+		negotiator->release_negotiator(negotiator);
 	return ref;
 }
 
@@ -1853,7 +1853,7 @@ static struct ref *do_fetch_pack_v2(struct fetch_pack_args *args,
 		die("fsck failed");
 
 	if (negotiator)
-		negotiator->release(negotiator);
+		negotiator->release_negotiator(negotiator);
 
 	oidset_clear(&common);
 	return ref;
diff --git a/git-compat-util.h b/git-compat-util.h
index 3e7a59b5ff1..be4516fa64e 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -223,7 +223,15 @@ struct strbuf;
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <stddef.h>
+#ifdef __MVS__
+	#define release stdlib_release
+	#define fetch stdlib_fetch
+#endif
 #include <stdlib.h>
+#ifdef __MVS__
+	#undef fetch
+	#undef release
+#endif
 #include <stdarg.h>
 #include <string.h>
 #ifdef HAVE_STRINGS_H
diff --git a/negotiator/default.c b/negotiator/default.c
index 9a5b6963272..b1f9f153372 100644
--- a/negotiator/default.c
+++ b/negotiator/default.c
@@ -174,7 +174,7 @@ static int ack(struct fetch_negotiator *n, struct commit *c)
 	return known_to_be_common;
 }
 
-static void release(struct fetch_negotiator *n)
+static void release_negotiator(struct fetch_negotiator *n)
 {
 	clear_prio_queue(&((struct negotiation_state *)n->data)->rev_list);
 	FREE_AND_NULL(n->data);
@@ -187,7 +187,7 @@ void default_negotiator_init(struct fetch_negotiator *negotiator)
 	negotiator->add_tip = add_tip;
 	negotiator->next = next;
 	negotiator->ack = ack;
-	negotiator->release = release;
+	negotiator->release_negotiator = release_negotiator;
 	negotiator->data = CALLOC_ARRAY(ns, 1);
 	ns->rev_list.compare = compare_commits_by_commit_date;
 
diff --git a/negotiator/noop.c b/negotiator/noop.c
index de39028ab7f..b2d555e0fec 100644
--- a/negotiator/noop.c
+++ b/negotiator/noop.c
@@ -30,7 +30,7 @@ static int ack(struct fetch_negotiator *n UNUSED, struct commit *c UNUSED)
 	return 0;
 }
 
-static void release(struct fetch_negotiator *n UNUSED)
+static void release_negotiator (struct fetch_negotiator *n UNUSED)
 {
 	/* nothing to release */
 }
@@ -41,6 +41,6 @@ void noop_negotiator_init(struct fetch_negotiator *negotiator)
 	negotiator->add_tip = add_tip;
 	negotiator->next = next;
 	negotiator->ack = ack;
-	negotiator->release = release;
+	negotiator->release_negotiator = release_negotiator;
 	negotiator->data = NULL;
 }
diff --git a/negotiator/skipping.c b/negotiator/skipping.c
index 5b91520430c..783b3f27e63 100644
--- a/negotiator/skipping.c
+++ b/negotiator/skipping.c
@@ -243,7 +243,7 @@ static int ack(struct fetch_negotiator *n, struct commit *c)
 	return known_to_be_common;
 }
 
-static void release(struct fetch_negotiator *n)
+static void release_negotiator(struct fetch_negotiator *n)
 {
 	clear_prio_queue(&((struct data *)n->data)->rev_list);
 	FREE_AND_NULL(n->data);
@@ -256,7 +256,7 @@ void skipping_negotiator_init(struct fetch_negotiator *negotiator)
 	negotiator->add_tip = add_tip;
 	negotiator->next = next;
 	negotiator->ack = ack;
-	negotiator->release = release;
+	negotiator->release_negotiator = release_negotiator;
 	negotiator->data = CALLOC_ARRAY(data, 1);
 	data->rev_list.compare = compare;
 
diff --git a/read-cache.c b/read-cache.c
index 080bd39713b..b7189c58144 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -205,6 +205,14 @@ static int ce_compare_data(struct index_state *istate,
 	int fd = git_open_cloexec(ce->name, O_RDONLY);
 
 	if (fd >= 0) {
+	#ifdef __MVS__
+		/*
+		 * Since the data is in binary format,
+		 * we need to set the z/OS file tag to
+		 * binary to disable autoconversion
+		 */
+		__disableautocvt(fd);
+	#endif
 		struct object_id oid;
 		if (!index_fd(istate, &oid, fd, st, OBJ_BLOB, ce->name, 0))
 			match = !oideq(&oid, &ce->oid);

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH 3/4] refs: complete list of special refs
From: Phillip Wood @ 2023-12-04 14:18 UTC (permalink / raw)
  To: Patrick Steinhardt, phillip.wood; +Cc: git, hanwenn
In-Reply-To: <ZWmAn20UYWBo9i8C@tanuki>

Hi Patrick

On 01/12/2023 06:43, Patrick Steinhardt wrote:
> On Thu, Nov 30, 2023 at 03:42:06PM +0000, Phillip Wood wrote:
>> Hi Patrick
>>
>> Thanks for working on this. I've left a couple of thought below.
>>
>> On 29/11/2023 08:14, Patrick Steinhardt wrote:
>>> +static int is_special_ref(const char *refname)
>>> +{
>>> +	/*
>>> +	 * Special references get written and read directly via the filesystem
>>> +	 * by the subsystems that create them. Thus, they must not go through
>>> +	 * the reference backend but must instead be read directly. It is
>>> +	 * arguable whether this behaviour is sensible, or whether it's simply
>>> +	 * a leaky abstraction enabled by us only having a single reference
>>> +	 * backend implementation. But at least for a subset of references it
>>> +	 * indeed does make sense to treat them specially:
>>> +	 *
>>> +	 * - FETCH_HEAD may contain multiple object IDs, and each one of them
>>> +	 *   carries additional metadata like where it came from.
>>> +	 *
>>> +	 * - MERGE_HEAD may contain multiple object IDs when merging multiple
>>> +	 *   heads.
>>> +	 *
>>> +	 * - "rebase-apply/" and "rebase-merge/" contain all of the state for
>>> +	 *   rebases, where keeping it closely together feels sensible.
>>
>> I'd really like to get away from treating these files as refs. I think their
>> use as refs is purely historic and predates the reflog and possibly
>> ORIG_HEAD. These days I'm not sure there is a good reason to be running
>>
>>      git rev-parse rebase-merge/orig-head
>>
>> One reason for not wanting to treat them as refs is that we do not handle
>> multi-level refs that do not begin with "refs/" consistently.
>>
>>      git update-ref foo/bar HEAD
>>
>> succeeds and creates .git/foo/bar but
>>
>>      git update-ref -d foo/bar
>>
>> fails with
>>
>>      error: refusing to update ref with bad name 'foo/bar'
>>
>> To me it would make sense to refuse to create 'foo/bar' but allow an
>> existing ref named 'foo/bar' to be deleted but the current behavior is the
>> opposite of that.
>>
>> I'd be quite happy to see us refuse to treat anything that fails
>>
>>      if (starts_with(refname, "refs/") || refname_is_safe(refname))
>>
>> as a ref but I don't know how much pain that would cause.
> 
> Well, we already do use these internally as references, but I don't
> disagree with you.

I should have been clearer that I was talking about the refs starting 
"rebase-*" rather than FETCH_HEAD and MERGE_HEAD. As a user find it 
convenient to be able to run "git fetch ... && git log -p FETCH_HEAD" 
even if the implementation is a bit ugly. As far as I can see we do not 
use "rebase-(apply|merge)/(orig-head|amend|autostash)" as a ref in our 
code or tests.

> I think the current state is extremely confusing,
> which is why my first approach was to simply document what falls into
> the category of these "special" references.

That's certainly a good place to start

> In my mind, this patch series here is a first step towards addressing
> the problem more generally. For now it is more or less only documenting
> _what_ is a special ref and why they are special, while also ensuring
> that these refs are compatible with the reftable backend. But once this
> lands, I'd certainly want to see us continue to iterate on this.
> 
> Most importantly, I'd love to see us address two issues:
> 
>    - Start to refuse writing these special refs via the refdb so that
>      the rules I've now layed out are also enforced. This would also
>      address your point about things being inconsistent.
> 
>    - Gradually reduce the list of special refs so that they are reduced
>      to a bare minimum and so that most refs are simply that, a normal
>      ref.

That sounds like a good plan

>>> +	const char * const special_refs[] = {
>>> +		"AUTO_MERGE",
>>
>> Is there any reason to treat this specially in the long term? It points to a
>> tree rather than a commit but unlike MERGE_HEAD and FETCH_HEAD it is
>> effectively a "normal" ref.
> 
> No, I'd love to see this and others converted to become a normal ref
> eventually. The goal of this patch series was mostly to document what we
> already have, and address those cases which are inconsistent with the
> new rules. But I'd be happy to convert more of these special refs to
> become normal refs after it lands.

That's great

>>> +		"BISECT_EXPECTED_REV",
>>> +		"FETCH_HEAD",
>>> +		"MERGE_AUTOSTASH",
>>
>> Should we be treating this as a ref? I thought it was written as an
>> implementation detail of the autostash implementation rather than to provide
>> a ref for users and scripts.
> 
> Yes, we have to in the context of the reftable backend. There's a bunch
> of tests that exercise our ability to parse this as a ref, and they
> would otherwise fail with the reftable backend.

Ah, looking at the the man page for "git merge" it seems we do actually 
document the existence of MERGE_AUTOSTASH so it is not just an 
implementation detail after all.

Best Wishes

Phillip


^ permalink raw reply

* [PATCH] gitk: add setting to hide unknown refs
From: Joachim B Haga via GitGitGadget @ 2023-12-04 13:18 UTC (permalink / raw)
  To: git; +Cc: Joachim B Haga, Joachim B Haga

From: Joachim B Haga <jobh@simula.no>

Tools such as branchless (https://github.com/arxanas/git-branchless)
add a lot of refs under the "refs/branchless" prefix. By default,
these are filtered out from `git log` using the `log.excludeDecoration`
config directive.

However, gitk applies decoration itself from the output of `git show-ref`,
so these refs clutter up the UI.

This patch adds a setting to gitk to exclude all unknown refs - which
is considerably simpler than trying to honour the `excludeDecoration`
pattern. Note that this also hides f.x. the `git bisect` refs, which I
think is fine given that this behaviour is opt-in (it defaults to not
hide anything).

Signed-off-by: Joachim B Haga <jobh@simula.no>
---
    gitk: add setting to hide unknown refs
    
    Tools such as branchless (https://github.com/arxanas/git-branchless) add
    a lot of refs under the "refs/branchless" prefix. By default, these are
    filtered out from git log using the log.excludeDecoration config
    directive.
    
    However, gitk applies decoration itself from the output of git show-ref,
    so these refs clutter up the UI.
    
    This patch adds a setting to gitk to exclude all unknown refs - which is
    considerably simpler than trying to honour the excludeDecoration
    pattern. Note that this also hides f.x. the git bisect refs, which I
    think is fine given that this behaviour is opt-in (defaults to not hide
    anything).

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1619%2Fjobh%2Fmaster-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1619/jobh/master-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1619

 gitk-git/gitk | 26 ++++++++++++++++----------
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/gitk-git/gitk b/gitk-git/gitk
index df3ba2ea99b..e91856b33a0 100755
--- a/gitk-git/gitk
+++ b/gitk-git/gitk
@@ -1798,7 +1798,7 @@ proc readrefs {} {
     global tagids idtags headids idheads tagobjid
     global otherrefids idotherrefs mainhead mainheadid
     global selecthead selectheadid
-    global hideremotes
+    global hideremotes hideunknown
     global tclencoding
 
     foreach v {tagids idtags headids idheads otherrefids idotherrefs} {
@@ -1835,8 +1835,10 @@ proc readrefs {} {
             set tagids($name) $id
             lappend idtags($id) $name
         } else {
-            set otherrefids($name) $id
-            lappend idotherrefs($id) $name
+	    if {[string match "stash" $name] || !$hideunknown} {
+		set otherrefids($name) $id
+		lappend idotherrefs($id) $name
+	    }
         }
     }
     catch {close $refd}
@@ -11577,7 +11579,7 @@ proc create_prefs_page {w} {
 proc prefspage_general {notebook} {
     global NS maxwidth maxgraphpct showneartags showlocalchanges
     global tabstop limitdiffs autoselect autosellen extdifftool perfile_attrs
-    global hideremotes want_ttk have_ttk maxrefs web_browser
+    global hideremotes hideunknown want_ttk have_ttk maxrefs web_browser
 
     set page [create_prefs_page $notebook.general]
 
@@ -11601,6 +11603,9 @@ proc prefspage_general {notebook} {
     ${NS}::checkbutton $page.hideremotes -text [mc "Hide remote refs"] \
         -variable hideremotes
     grid x $page.hideremotes -sticky w
+    ${NS}::checkbutton $page.hideunknown -text [mc "Hide unknown refs"] \
+        -variable hideunknown
+    grid x $page.hideunknown -sticky w
 
     ${NS}::label $page.ddisp -text [mc "Diff display options"]
     grid $page.ddisp - -sticky w -pady 10
@@ -11725,7 +11730,7 @@ proc doprefs {} {
     global oldprefs prefstop showneartags showlocalchanges
     global uicolor bgcolor fgcolor ctext diffcolors selectbgcolor markbgcolor
     global tabstop limitdiffs autoselect autosellen extdifftool perfile_attrs
-    global hideremotes want_ttk have_ttk
+    global hideremotes hideunknown want_ttk have_ttk
 
     set top .gitkprefs
     set prefstop $top
@@ -11734,7 +11739,7 @@ proc doprefs {} {
         return
     }
     foreach v {maxwidth maxgraphpct showneartags showlocalchanges \
-                   limitdiffs tabstop perfile_attrs hideremotes want_ttk} {
+                   limitdiffs tabstop perfile_attrs hideremotes hideunknown want_ttk} {
         set oldprefs($v) [set $v]
     }
     ttk_toplevel $top
@@ -11860,7 +11865,7 @@ proc prefscan {} {
     global oldprefs prefstop
 
     foreach v {maxwidth maxgraphpct showneartags showlocalchanges \
-                   limitdiffs tabstop perfile_attrs hideremotes want_ttk} {
+                   limitdiffs tabstop perfile_attrs hideremotes hideunknown want_ttk} {
         global $v
         set $v $oldprefs($v)
     }
@@ -11874,7 +11879,7 @@ proc prefsok {} {
     global oldprefs prefstop showneartags showlocalchanges
     global fontpref mainfont textfont uifont
     global limitdiffs treediffs perfile_attrs
-    global hideremotes
+    global hideremotes hideunknown
 
     catch {destroy $prefstop}
     unset prefstop
@@ -11920,7 +11925,7 @@ proc prefsok {} {
           $limitdiffs != $oldprefs(limitdiffs)} {
         reselectline
     }
-    if {$hideremotes != $oldprefs(hideremotes)} {
+    if {$hideremotes != $oldprefs(hideremotes) || $hideunknown != $oldprefs(hideunknown)} {
         rereadrefs
     }
 }
@@ -12394,6 +12399,7 @@ set cmitmode "patch"
 set wrapcomment "none"
 set showneartags 1
 set hideremotes 0
+set hideunknown 0
 set maxrefs 20
 set visiblerefs {"master"}
 set maxlinelen 200
@@ -12498,7 +12504,7 @@ config_check_tmp_exists 50
 set config_variables {
     mainfont textfont uifont tabstop findmergefiles maxgraphpct maxwidth
     cmitmode wrapcomment autoselect autosellen showneartags maxrefs visiblerefs
-    hideremotes showlocalchanges datetimeformat limitdiffs uicolor want_ttk
+    hideremotes hideunknown showlocalchanges datetimeformat limitdiffs uicolor want_ttk
     bgcolor fgcolor uifgcolor uifgdisabledcolor colors diffcolors mergecolors
     markbgcolor diffcontext selectbgcolor foundbgcolor currentsearchhitbgcolor
     extdifftool perfile_attrs headbgcolor headfgcolor headoutlinecolor

base-commit: 564d0252ca632e0264ed670534a51d18a689ef5d
-- 
gitgitgadget

^ permalink raw reply related

* Re: [PATCH 2/2] checkout: forbid "-B <branch>" from touching a branch used elsewhere
From: Willem Verstraeten @ 2023-12-04 12:20 UTC (permalink / raw)
  To: phillip.wood; +Cc: Junio C Hamano, git, Jeff King
In-Reply-To: <b3532261-3cf4-4666-9cbd-4ce668cd2e49@gmail.com>

Hi everyone,

It's not clear for me from the email thread what the status is of this
bug report, and whether there is still something expected from me.

Is the current consensus that this is a real issue that needs fixing?
If so, does the current patch-set fix the issue, and how does the fix
get into (one of) the next release(s)?

Do I still need to do something?

Kind regards,
Willem

On Thu, 30 Nov 2023 at 16:22, Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Junio
>
> On 27/11/2023 01:51, Junio C Hamano wrote:
> > Phillip Wood <phillip.wood123@gmail.com> writes:
> >
> >> At the moment this is academic as neither of the test scripts changed
> >> by this patch are leak free and so I don't think we need to worry
> >> about it but it raises an interesting question about how we should
> >> handle memory leaks when dying. Leaving the leak when dying means that
> >> a test script that tests an expected failure will never be leak free
> >> but using UNLEAK() would mean we miss a leak being introduced in the
> >> successful case should the call to "free()" ever be removed.
> >
> > Is there a leak here?  The piece of memory is pointed at by an on-stack
> > variable full_ref when leak sanitizer starts scanning the heap and
> > the stack just before the process exits due to die, so I do not see
> > a reason to worry about this particular variable over all the other
> > on stack variables we accumulated before the control reached this
> > point of the code.
>
> Oh, good point. I was thinking "we exit without calling free() so it is
> leaked" but as you say the leak checker (thankfully) does not consider
> it a leak as there is still a reference to the allocation on the stack.
>
> Sorry for the noise
>
> Phillip
>
> > Are you worried about optimizing compilers that behave more cleverly
> > than their own good to somehow lose the on-stack reference to
> > full_ref while calling die_if_switching_to_a_branch_in_use()?  We
> > might need to squelch them with UNLEAK() but that does not mean we
> > have to remove the free() we see above, and I suspect a more
> > productive use of our time to solve that issue is ensure that our
> > leak-sanitizing build will not triger such an unwanted optimization
> > anyway.
> >
> > Thanks.

^ permalink raw reply

* Re: [PATCH v6] subtree: fix split processing with multiple subtrees present
From: Christian Couder @ 2023-12-04 11:08 UTC (permalink / raw)
  To: Zach FettersMoore via GitGitGadget; +Cc: git, Zach FettersMoore
In-Reply-To: <pull.1587.v6.git.1701442494319.gitgitgadget@gmail.com>

On Fri, Dec 1, 2023 at 3:54 PM Zach FettersMoore via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Zach FettersMoore <zach.fetters@apollographql.com>
>
> When there are multiple subtrees present in a repository and they are
> all using 'git subtree split', the 'split' command can take a
> significant (and constantly growing) amount of time to run even when
> using the '--rejoin' flag. This is due to the fact that when processing
> commits to determine the last known split to start from when looking
> for changes, if there has been a split/merge done from another subtree
> there will be 2 split commits, one mainline and one subtree, for the
> second subtree that are part of the processing. The non-mainline
> subtree split commit will cause the processing to always need to search
> the entire history of the given subtree as part of its processing even
> though those commits are totally irrelevant to the current subtree
> split being run.
>
> To see this in practice you can use the open source GitHub repo
> 'apollo-ios-dev' and do the following in order:
>
> -Make a changes to a file in 'apollo-ios' and 'apollo-ios-codegen'
>  directories
> -Create a commit containing these changes
> -Do a split on apollo-ios-codegen
>    - Do a fetch on the subtree repo
>       - git fetch git@github.com:apollographql/apollo-ios-codegen.git
>    - git subtree split --prefix=apollo-ios-codegen --squash --rejoin

Now I get the following without your patch at this step:

$ git subtree split --prefix=apollo-ios-codegen --squash --rejoin
[...]/libexec/git-core/git-subtree: 318: Maximum function recursion
depth (1000) reached

Line 318 in git-subtree.sh contains the following:

missed=$(cache_miss "$@") || exit $?

With your patch it seems to work:

$ git subtree split --prefix=apollo-ios-codegen --squash --rejoin
Merge made by the 'ort' strategy.
e274aed3ba6d0659fb4cc014587cf31c1e8df7f4

>    - Depending on the current state of the 'apollo-ios-dev' repo
>      you may see the issue at this point if the last split was on
>      apollo-ios

I guess I see it, but it seems a bit different for me than what you describe.

Otherwise your patch looks good to me now.

Thanks,
Christian.

^ permalink raw reply

* Re: [PATCH 00/24] pack-objects: multi-pack verbatim reuse
From: Patrick Steinhardt @ 2023-12-04  8:49 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Junio C Hamano
In-Reply-To: <ZWo7zjs+q8SZ5o1o@nand.local>

[-- Attachment #1: Type: text/plain, Size: 2098 bytes --]

On Fri, Dec 01, 2023 at 03:02:22PM -0500, Taylor Blau wrote:
> On Fri, Dec 01, 2023 at 09:31:14AM +0100, Patrick Steinhardt wrote:
[snip]
> I suppose if we're relatively confident that the last series will be
> merged eventually, then that seems like less of a concern. But I'm not
> sure that we're at that point yet.

That's an additional valid concern indeed.

[snip]
> > >   - The fourth series (which actually implements multi-pack reuse) still
> > >     remains the most complicated, and would likely be the most difficult
> > >     to review. So I think you'd still have one difficult series to
> > >     review, plus four other series which are slightly less difficult to
> > >     review ;-).
> >
> > That's fine in my opinion, there's no surprise here that a complicated
> > topic is demanding for both the author and the reviewer.
> 
> My preference is to avoid splitting the series if we can help it. But if
> you feel strongly, or others feel similarly, I'm happy to take another
> crack at breaking it up. Thanks for all of your review so far!

I don't feel strongly about this at all, I've only tried to spell out my
own thoughts in this context as I thought they were kind of relevant
here. I've thought quite a lot about this topic recently due to my work
on the reftable backend, where I'm trying to get as many pieces as
possible landed individually before landing the actual backend itself.
It's working well for most of the part, but in other contexts it's a bit
weird that we try to cater towards something that doesn't exist yet. But
naturally, the reftable work is of different nature than the topic you
work on here and thus my own takeaways may likely not apply heer.

To summarize, I think there is merit in splitting up patches into chunks
that make it in individually and thus gradually work toward a topic, but
I also totally understand why you (or Junio as the maintainer) might
think that this is not a good idea. The ultimate decision for how to
handle topics should be with the patch series' author and maintainer.

Patrick

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH v2] test-lib-functions.sh: fix test_grep fail message wording
From: Shreyansh Paliwal @ 2023-12-03 17:17 UTC (permalink / raw)
  To: git; +Cc: five231003, gitster, shreyp135, Shreyansh Paliwal
In-Reply-To: <CAPYXD64yCuMta_iGE+ZwgxrJn0U5shcwcB9jaiNkFhvff=R7MQ@mail.gmail.com>

From: shreyp135 <shreyanshpaliwalcmsmn@gmail.com>

In the recent commit
2e87fca189 (test framework: further deprecate test_i18ngrep, 2023-10-31),
the test_i18ngrep() function was deprecated.

So if a test employing this function fails,
the error messages may be confusing due to wording issues.

It's important to address these wording changes to ensure smooth transitions
for developers adapting to the deprecation of test_i18ngrep,
and to maintain the effectiveness of the testing process.

Signed-off-by: Shreyansh Paliwal <Shreyanshpaliwalcmsmn@gmail.com>
---
 t/test-lib-functions.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 9c3cf12b26..8737c95e0c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1277,7 +1277,7 @@ test_grep () {
 	if test $# -lt 2 ||
 	   { test "x!" = "x$1" && test $# -lt 3 ; }
 	then
-		BUG "too few parameters to test_i18ngrep"
+		BUG "too few parameters to test_grep"
 	fi
 
 	if test "x!" = "x$1"
-- 
2.43.0.1


^ permalink raw reply related

* (no subject)
From: Shreyansh Paliwal @ 2023-12-03 16:51 UTC (permalink / raw)
  To: git; +Cc: five231003, gitster, shreyp135, Shreyansh Paliwal

From: shreyp135 <shreyanshpaliwalcmsmn@gmail.com>

Subject: [PATCH v2] test-lib-functions.sh: fix test_grep fail message wording

In the recent commit
2e87fca189 (test framework: further deprecate test_i18ngrep, 2023-10-31),
the test_i18ngrep() function was deprecated.

So if a test employing this function fails,
the error messages may be confusing due to wording issues.

It's important to address these wording changes to ensure smooth transitions
for developers adapting to the deprecation of test_i18ngrep,
and to maintain the effectiveness of the testing process.

Signed-off-by: Shreyansh Paliwal <Shreyanshpaliwalcmsmn@gmail.com>
---
 t/test-lib-functions.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 9c3cf12b26..8737c95e0c 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1277,7 +1277,7 @@ test_grep () {
 	if test $# -lt 2 ||
 	   { test "x!" = "x$1" && test $# -lt 3 ; }
 	then
-		BUG "too few parameters to test_i18ngrep"
+		BUG "too few parameters to test_grep"
 	fi
 
 	if test "x!" = "x$1"
-- 
2.43.0.1


^ permalink raw reply related

* [RFC PATCH 4/4] unpack-trees: introduce parallel_unlink
From: Han Young @ 2023-12-03 13:39 UTC (permalink / raw)
  To: git; +Cc: Han Young
In-Reply-To: <20231203133911.41594-1-hanyoung@protonmail.com>

From: Han Young <hanyang.tony@bytedance.com>

We have parallel_checkout option since 04155bdad, but the unlink is still executed single threaded. On very large repo, checkout across directory rename or restructure commit can lead to large amount of unlinked entries. In some instance, the unlink operation can be slower than the parallel checkout. This commit add parallel unlink support, parallel unlink uses multithreaded removal of entries.
---
Unlink operation by itself is way faster than checkout, the default threshold should be way higher
than parallel_checkout. I hardcoded the threshold to be 100 times higher, probably need to introduce
a new config option with sensible default.
To discover how many entries to remove require us to iterate index->cache, this is fast even for large
number of entries compare to filesystem operation.
I think we can reuse checkout.workers as the main switch for parallel_unlink, since it's also part of
checkout process.

 unpack-trees.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index c2b20b80d5..53589cde8a 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -452,17 +452,8 @@ static int check_updates(struct unpack_trees_options *o,
 	if (should_update_submodules())
 		load_gitmodules_file(index, NULL);
 
-	for (i = 0; i < index->cache_nr; i++) {
-		const struct cache_entry *ce = index->cache[i];
-
-		if (ce->ce_flags & CE_WT_REMOVE) {
-			display_progress(progress, ++cnt);
-			unlink_entry(ce, o->super_prefix);
-		}
-	}
-
-	remove_marked_cache_entries(index, 0);
-	remove_scheduled_dirs();
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+	cnt = run_parallel_unlink(index, progress, o->super_prefix, pc_workers, pc_threshold * 100, cnt);
 
 	if (should_update_submodules())
 		load_gitmodules_file(index, &state);
@@ -474,8 +465,6 @@ static int check_updates(struct unpack_trees_options *o,
 		 */
 		prefetch_cache_entries(index, must_checkout);
 
-	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
-
 	enable_delayed_checkout(&state);
 	if (pc_workers > 1)
 		init_parallel_checkout();
-- 
2.43.0


^ permalink raw reply related

* [RFC PATCH 3/4] parallel-checkout: add parallel_unlink
From: Han Young @ 2023-12-03 13:39 UTC (permalink / raw)
  To: git; +Cc: Han Young
In-Reply-To: <20231203133911.41594-1-hanyoung@protonmail.com>

From: Han Young <hanyang.tony@bytedance.com>

Add parallel_unlink to parallel-checkout, parallel_unlink uses multiple threads to unlink entries. Because the path to be removed is sorted, each thread iterate through the entry list interleaved to distribute the workload as evenly as possible. Due to the multithread nature, it's not possible to remove all the dirs in one pass. The dir one thread is about to remove may have item that are being removed by another thread. Whenever we failed to remove the dir, we save it in a hashset. When every thread has finished its job, we remove all the entries in the hashset.
---
Note that we display progress after thread join, the progress count is updated for every thread instead of every path.
During testing, threads almost finished at around the same time. This caused the abrupt progress update.
We can use a mutex to display the progress, but that nullified the optimization on environment with fast file deletion time.

 parallel-checkout.c | 80 +++++++++++++++++++++++++++++++++++++++++++++
 parallel-checkout.h | 25 ++++++++++++++
 2 files changed, 105 insertions(+)

diff --git a/parallel-checkout.c b/parallel-checkout.c
index b5a714c711..6e62e044d8 100644
--- a/parallel-checkout.c
+++ b/parallel-checkout.c
@@ -328,6 +328,24 @@ static int close_and_clear(int *fd)
 	return ret;
 }
 
+void *parallel_unlink_proc(void *_data)
+{
+	struct parallel_unlink_data *data = _data;
+	struct cache_def cache = CACHE_DEF_INIT;
+	int i = data->start;
+	data->cnt = 0;
+
+	while (i < data->len) {
+		const struct cache_entry *ce = data->cache[i];
+		if (ce->ce_flags & CE_WT_REMOVE) {
+			++data->cnt;
+			threaded_unlink_entry(ce, data->super_prefix, data->removal_cache, &cache);
+		}
+		i += data->step;
+	}
+	return &data->cnt;
+}
+
 void write_pc_item(struct parallel_checkout_item *pc_item,
 		   struct checkout *state)
 {
@@ -678,3 +696,65 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold
 	finish_parallel_checkout();
 	return ret;
 }
+
+unsigned run_parallel_unlink(struct index_state *index,
+			  struct progress *progress,
+			  const char *super_prefix, int num_workers, int threshold,
+			  unsigned cnt)
+{
+	int i, use_parallel = 0, errs = 0;
+	if (num_workers > 1 && index->cache_nr >= threshold) {
+		int unlink_cnt = 0;
+		for (i = 0; i < index->cache_nr; i++) {
+			const struct cache_entry *ce = index->cache[i];
+			if (ce->ce_flags & CE_WT_REMOVE) {
+				unlink_cnt++;
+			}
+		}
+		if (unlink_cnt >= threshold) {
+			use_parallel = 1;
+		}
+	}
+	if (use_parallel) {
+		struct parallel_unlink_data *unlink_data;
+		CALLOC_ARRAY(unlink_data, num_workers);
+		threaded_init_remove_scheduled_dirs();
+		struct strbuf removal_caches[num_workers];
+		for (i = 0; i < num_workers; i++) {
+			struct parallel_unlink_data *data = &unlink_data[i];
+			strbuf_init(&removal_caches[i], 50);
+			data->start = i;
+			data->cache = index->cache;
+			data->len = index->cache_nr;
+			data->step = num_workers;
+			data->super_prefix = super_prefix;
+			data->removal_cache = &removal_caches[i];
+			errs = pthread_create(&data->pthread, NULL, parallel_unlink_proc, data);
+			if (errs)
+				die(_("unable to create parallel_checkout thread: %s"), strerror(errs));
+		}
+		for (i = 0; i < num_workers; i++) {
+			void *t_cnt;
+			if (pthread_join(unlink_data[i].pthread, &t_cnt))
+				die("unable to join parallel_unlink_thread");
+			cnt += *((unsigned *)t_cnt);
+			display_progress(progress, cnt);
+		}
+		threaded_remove_scheduled_dirs_clean_up();
+		for (i = 0; i < num_workers; i++) {
+			threaded_remove_scheduled_dirs(&removal_caches[i]);
+		}
+		remove_marked_cache_entries(index, 0);
+	} else {
+		for (i = 0; i < index->cache_nr; i++) {
+			const struct cache_entry *ce = index->cache[i];
+			if (ce->ce_flags & CE_WT_REMOVE) {
+				display_progress(progress, ++cnt);
+				unlink_entry(ce, super_prefix);
+			}
+		}
+		remove_marked_cache_entries(index, 0);
+	    remove_scheduled_dirs();
+	}
+	return cnt;
+}
diff --git a/parallel-checkout.h b/parallel-checkout.h
index c575284005..e851b773d9 100644
--- a/parallel-checkout.h
+++ b/parallel-checkout.h
@@ -43,6 +43,18 @@ size_t pc_queue_size(void);
 int run_parallel_checkout(struct checkout *state, int num_workers, int threshold,
 			  struct progress *progress, unsigned int *progress_cnt);
 
+/*
+ * Unlink all the unlink entries in the index, returning the number of entries
+ * unlinked plus the origin value of cnt. If the number of entries
+ * to be removed is smaller than the specified threshold, the operation
+ * is performed sequentially.
+ */
+unsigned run_parallel_unlink(struct index_state *index,
+			  struct progress *progress,
+			  const char *super_prefix,
+			  int num_workers, int threshold,
+			  unsigned cnt);
+
 /****************************************************************
  * Interface with checkout--worker
  ****************************************************************/
@@ -76,6 +88,19 @@ struct parallel_checkout_item {
 	struct stat st;
 };
 
+struct parallel_unlink_data {
+	pthread_t pthread;
+	struct cache_entry **cache;
+	struct strbuf *removal_cache;
+	size_t len;
+	int start;
+	size_t step;
+	unsigned cnt;
+	const char *super_prefix;
+};
+
+void *parallel_unlink_proc(void *_data);
+
 /*
  * The fixed-size portion of `struct parallel_checkout_item` that is sent to the
  * workers. Following this will be 2 strings: ca.working_tree_encoding and
-- 
2.43.0


^ permalink raw reply related

* [RFC PATCH 2/4] entry: add threaded_unlink_entry function
From: Han Young @ 2023-12-03 13:39 UTC (permalink / raw)
  To: git; +Cc: Han Young
In-Reply-To: <20231203133911.41594-1-hanyoung@protonmail.com>

From: Han Young <hanyang.tony@bytedance.com>

Add threaded_unlink_entry function, the threaded function uses cache passed by arguments instead of the default cache. It also calls threaded variant of schedule_dir_for_removal to ensure dirs are removed in multithreaded unlink.
---
Another duplicated function. Because default removal cache and default lstat cache live in different source files,
threaded variant of check_leading_path and schedule_dir_for_removal must be called here
instead of choosing to pass explicit or default cache.

 entry.c | 16 ++++++++++++++++
 entry.h |  3 +++
 2 files changed, 19 insertions(+)

diff --git a/entry.c b/entry.c
index 076e97eb89..04440beb2b 100644
--- a/entry.c
+++ b/entry.c
@@ -567,6 +567,22 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca,
 	return write_entry(ce, path.buf, ca, state, 0, nr_checkouts);
 }
 
+void threaded_unlink_entry(const struct cache_entry *ce, const char *super_prefix,
+			   struct strbuf *removal, struct cache_def *cache)
+{
+	const struct submodule *sub = submodule_from_ce(ce);
+	if (sub) {
+		/* state.force is set at the caller. */
+		submodule_move_head(ce->name, super_prefix, "HEAD", NULL,
+				    SUBMODULE_MOVE_HEAD_FORCE);
+	}
+	if (threaded_check_leading_path(cache, ce->name, ce_namelen(ce), 1) >= 0)
+		return;
+	if (remove_or_warn(ce->ce_mode, ce->name))
+		return;
+	threaded_schedule_dir_for_removal(ce->name, ce_namelen(ce), removal);
+}
+
 void unlink_entry(const struct cache_entry *ce, const char *super_prefix)
 {
 	const struct submodule *sub = submodule_from_ce(ce);
diff --git a/entry.h b/entry.h
index ca3ed35bc0..413ca3822d 100644
--- a/entry.h
+++ b/entry.h
@@ -2,6 +2,7 @@
 #define ENTRY_H
 
 #include "convert.h"
+#include "symlinks.h"
 
 struct cache_entry;
 struct index_state;
@@ -56,6 +57,8 @@ int finish_delayed_checkout(struct checkout *state, int show_progress);
  * down from "read-tree" et al.
  */
 void unlink_entry(const struct cache_entry *ce, const char *super_prefix);
+void threaded_unlink_entry(const struct cache_entry *ce, const char *super_prefix,
+			   struct strbuf *removal, struct cache_def *cache);
 
 void *read_blob_entry(const struct cache_entry *ce, size_t *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
-- 
2.43.0


^ permalink raw reply related

* [RFC PATCH 1/4] symlinks: add and export threaded rmdir variants
From: Han Young @ 2023-12-03 13:39 UTC (permalink / raw)
  To: git; +Cc: Han Young
In-Reply-To: <20231203133911.41594-1-hanyoung@protonmail.com>

From: Han Young <hanyang.tony@bytedance.com>

Add and export threaded variants of remove dir related functions, these functions will be used by parallel unlink
---
Most of the code of threaded_schedule_dir_for_removal and threaded_do_remove_scheduled_dirs is duplicated.
We can remove the duplication either via breaking the function into smaller functions, or pass the cache as parameters.
If we choose to pass the cache explicitly, default cache in both entry.c and symlinks.c probably need to be moved to
unpack-trees.c. I'm not satisfied with using mutex guarded hashset to ensure every dir is removed. But I can't come
up with a better way.

 symlinks.c | 120 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 symlinks.h |   6 +++
 2 files changed, 123 insertions(+), 3 deletions(-)

diff --git a/symlinks.c b/symlinks.c
index b29e340c2d..c8cb0a7eb7 100644
--- a/symlinks.c
+++ b/symlinks.c
@@ -2,9 +2,9 @@
 #include "gettext.h"
 #include "setup.h"
 #include "symlinks.h"
+#include "hashmap.h"
+#include "pthread.h"
 
-static int threaded_check_leading_path(struct cache_def *cache, const char *name,
-				       int len, int warn_on_lstat_err);
 static int threaded_has_dirs_only_path(struct cache_def *cache, const char *name, int len, int prefix_len);
 
 /*
@@ -229,7 +229,7 @@ int check_leading_path(const char *name, int len, int warn_on_lstat_err)
  * directory, or if we were unable to lstat() it. If warn_on_lstat_err is true,
  * also emit a warning for this error.
  */
-static int threaded_check_leading_path(struct cache_def *cache, const char *name,
+int threaded_check_leading_path(struct cache_def *cache, const char *name,
 				       int len, int warn_on_lstat_err)
 {
 	int flags;
@@ -277,6 +277,51 @@ static int threaded_has_dirs_only_path(struct cache_def *cache, const char *name
 }
 
 static struct strbuf removal = STRBUF_INIT;
+static struct hashmap dir_set;
+pthread_mutex_t dir_set_mutex = PTHREAD_MUTEX_INITIALIZER;
+struct rmdir_hash_entry {
+      struct hashmap_entry hash;
+      char *dir;
+      size_t dirlen;
+};
+
+/* rmdir_hashmap comparison function */
+static int rmdir_hash_entry_cmp(const void *cmp_data UNUSED,
+			       const struct hashmap_entry *eptr,
+			       const struct hashmap_entry *entry_or_key UNUSED,
+			       const void *keydata)
+{
+	const struct rmdir_hash_entry *a, *b;
+
+	a = container_of(eptr, const struct rmdir_hash_entry, hash);
+	return strcmp(a->dir, (char *)keydata);
+}
+
+void threaded_init_remove_scheduled_dirs(void)
+{
+	unsigned flags = 0;
+	hashmap_init(&dir_set, rmdir_hash_entry_cmp, &flags, 0);
+}
+
+static void add_dir_to_rmdir_hash(char *dir, size_t dirlen)
+{
+	struct rmdir_hash_entry *e;
+	struct hashmap_entry *ent;
+	int hash = strhash(dir);
+	pthread_mutex_lock(&dir_set_mutex);
+	ent = hashmap_get_from_hash(&dir_set, hash, dir);
+
+	if (!ent) {
+		e = xmalloc(sizeof(struct rmdir_hash_entry));
+		hashmap_entry_init(&e->hash, hash);
+		char *_dir= xmallocz(dirlen);
+		memcpy(_dir, dir, dirlen+1);
+		e->dir = _dir;
+		e->dirlen = dirlen;
+		hashmap_put_entry(&dir_set, e, hash);
+	}
+	pthread_mutex_unlock(&dir_set_mutex);
+}
 
 static void do_remove_scheduled_dirs(int new_len)
 {
@@ -294,6 +339,26 @@ static void do_remove_scheduled_dirs(int new_len)
 	removal.len = new_len;
 }
 
+
+static void threaded_do_remove_scheduled_dirs(int new_len, struct strbuf *removal)
+{
+	while (removal->len > new_len) {
+		removal->buf[removal->len] = '\0';
+		if (startup_info->original_cwd &&
+		     !strcmp(removal->buf, startup_info->original_cwd))
+			 break;
+		if (rmdir(removal->buf)) {
+			add_dir_to_rmdir_hash(removal->buf, removal->len);
+			break;
+		}
+		do {
+			removal->len--;
+		} while (removal->len > new_len &&
+			 removal->buf[removal->len] != '/');
+	}
+	removal->len = new_len;
+}
+
 void schedule_dir_for_removal(const char *name, int len)
 {
 	int match_len, last_slash, i, previous_slash;
@@ -327,11 +392,60 @@ void schedule_dir_for_removal(const char *name, int len)
 		strbuf_add(&removal, &name[match_len], last_slash - match_len);
 }
 
+void threaded_schedule_dir_for_removal(const char *name, int len, struct strbuf *removal_cache)
+{
+	int match_len, last_slash, i, previous_slash;
+
+	if (startup_info->original_cwd &&
+	    !strcmp(name, startup_info->original_cwd))
+		return;	/* Do not remove the current working directory */
+
+	match_len = last_slash = i =
+		longest_path_match(name, len, removal_cache->buf, removal_cache->len,
+				   &previous_slash);
+	/* Find last slash inside 'name' */
+	while (i < len) {
+		if (name[i] == '/')
+			last_slash = i;
+		i++;
+	}
+
+	/*
+	 * If we are about to go down the directory tree, we check if
+	 * we must first go upwards the tree, such that we then can
+	 * remove possible empty directories as we go upwards.
+	 */
+	if (match_len < last_slash && match_len < removal_cache->len)
+		threaded_do_remove_scheduled_dirs(match_len, removal_cache);
+	/*
+	 * If we go deeper down the directory tree, we only need to
+	 * save the new path components as we go down.
+	 */
+	if (match_len < last_slash)
+		strbuf_add(removal_cache, &name[match_len], last_slash - match_len);
+}
+
 void remove_scheduled_dirs(void)
 {
 	do_remove_scheduled_dirs(0);
 }
 
+void threaded_remove_scheduled_dirs_clean_up(void)
+{
+	struct hashmap_iter iter;
+	const struct rmdir_hash_entry *entry;
+
+	hashmap_for_each_entry(&dir_set, &iter, entry, hash /* member name */) {
+		schedule_dir_for_removal(entry->dir, entry->dirlen);
+	}
+	remove_scheduled_dirs();
+}
+
+void threaded_remove_scheduled_dirs(struct strbuf *removal_cache)
+{
+	threaded_do_remove_scheduled_dirs(0, removal_cache);
+}
+
 void invalidate_lstat_cache(void)
 {
 	reset_lstat_cache(&default_cache);
diff --git a/symlinks.h b/symlinks.h
index 7ae3d5b856..7898eae941 100644
--- a/symlinks.h
+++ b/symlinks.h
@@ -20,9 +20,15 @@ static inline void cache_def_clear(struct cache_def *cache)
 int has_symlink_leading_path(const char *name, int len);
 int threaded_has_symlink_leading_path(struct cache_def *, const char *, int);
 int check_leading_path(const char *name, int len, int warn_on_lstat_err);
+int threaded_check_leading_path(struct cache_def *cache, const char *name,
+				       int len, int warn_on_lstat_err);
 int has_dirs_only_path(const char *name, int len, int prefix_len);
 void invalidate_lstat_cache(void);
 void schedule_dir_for_removal(const char *name, int len);
+void threaded_schedule_dir_for_removal(const char *name, int len, struct strbuf *removal_cache);
 void remove_scheduled_dirs(void);
+void threaded_remove_scheduled_dirs(struct strbuf *removal_cache);
+void threaded_init_remove_scheduled_dirs(void);
+void threaded_remove_scheduled_dirs_clean_up(void);
 
 #endif /* SYMLINKS_H */
-- 
2.43.0


^ permalink raw reply related

* [RFC PATCH 0/4] add parallel unlink
From: Han Young @ 2023-12-03 13:39 UTC (permalink / raw)
  To: git; +Cc: Han Young

We have had parallel_checkout option since 04155bdad, but the unlink is still performed single threaded.
With a very large repository, directory rename or reorganization can lead to a large amount of unlinked entries.
In some instance, the unlink process can be slower than the parallel checkout.

This series of patches introduces basic support for parallel unlink. The removal of individual files
can be easily multithreaded, but removing empty directories is a little tricky.
If one thread decides to remove the directory, it may still have files that need to be deleted by
another thread. I had to use a mutex-guarded hashset to collect these 'race' directories,
and remove them after all threads have been joined. Maybe there are ways to do this
without mutex and hashmap?

The speed of unlinking files seems to vary from system to system. I did some tests with a private repo.
When I checkout a commit with 15000 moved files on a Linux machine with btrfs, parallel_unlink yields
10% speed up. But on a Intel MacBook Pro with APFS, the speed up is over 100%. I find it difficult to
choose the default threshold of parallel_unlink.

This series is by no means complete. Many functions contains duplicated code, and there are some
memory leaks. I want to know the community opinion before proceed, if it's worth doing or a waste of time.

Han Young (4):
  symlinks: add and export threaded rmdir variants
  entry: add threaded_unlink_entry function
  parallel-checkout: add parallel_unlink
  unpack-trees: introduce parallel_unlink

 entry.c             |  16 ++++++
 entry.h             |   3 ++
 parallel-checkout.c |  80 +++++++++++++++++++++++++++++
 parallel-checkout.h |  25 +++++++++
 symlinks.c          | 120 ++++++++++++++++++++++++++++++++++++++++++--
 symlinks.h          |   6 +++
 unpack-trees.c      |  15 +-----
 7 files changed, 249 insertions(+), 16 deletions(-)

-- 
2.43.0


^ permalink raw reply

* Re: [Patch] test-lib-functions.sh : change test_i18ngrep to test_grep
From: Junio C Hamano @ 2023-12-03 13:19 UTC (permalink / raw)
  To: Kousik Sanagavarapu; +Cc: Shreyansh Paliwal, git
In-Reply-To: <ZWw6r2EDGkpgyYEM@five231003>

Kousik Sanagavarapu <five231003@gmail.com> writes:

> Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> wrote:
>
>> Subject: [Patch] test-lib-functions.sh : change test_i18ngrep to test_grep
>
> For anyone reading the subject, I think reading
>
> 	change test_i18ngrep to test_grep
>
> would be confusing, as from the looks of it, the patch does remove
> test_i18ngrep() and replace it with test_grep (I mean the plan is to
> remove test_i18ngrep only after we are sure that it doesn't exist in the
> code anywhere, anymore) but only making a change in the wording of an
> error message within test_grep().

;-)  

Yes, that was exactly my reaction to the subject (I'm on
vacation so I only scanned the subject lines of incoming patches
without looking at anything else and thought "hmph, it is good
somebody else is cleaning up new uses of test_i18ngrep that have
been introduced by topics simultaneously in flight").

^ permalink raw reply

* Re: [PATCH] doc: make the gitfile syntax easier to discover
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Marcel Krause; +Cc: git mailing list
In-Reply-To: <20231128065558.1061206-1-mk+copyleft@pimpmybyte.de>

Marcel Krause <mk+copyleft@pimpmybyte.de> writes:

> Signed-off-by: Marcel Krause <mk+copyleft@pimpmybyte.de>
> ---
>  Documentation/gitrepository-layout.txt | 4 +++-
>  Documentation/glossary-content.txt     | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)

Looking good.  Will queue.  Thanks.

^ permalink raw reply

* Re: [PATCH 1/2] completion: refactor existence checks for special refs
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Stan Hu; +Cc: git, Patrick Steinhardt, Christian Couder
In-Reply-To: <20231130202404.89791-2-stanhu@gmail.com>

Stan Hu <stanhu@gmail.com> writes:

> In preparation for the reftable backend, this commit introduces a
> '__git_ref_exists' function that continues to use 'test -f' to
> determine whether a given ref exists in the local filesystem.

I do not think git_ref_exists is a good name for what this one does.
The name adds undue cognitive load on readers.  As far as I can
tell, with this helper function, you are interested in handling only
pseudorefs like $GIT_DIR/FOOBAR_HEAD (oh, retitle the patch to call
them "pseudorefs", not "special refs", by the way), and that is why
you can get away with a simple

    [ -f "$__git_repo_path/$ref" ]

without bothering to check the packed-refs file.  The checks this
patch replace to calls to this helper functions in the original make
it clear, simply because they spell out what they are checking, like
"CHERRY_PICK_HEAD", why a mere filesystem check was sufficient, but
once you give an overly generic name like "ref-exists", it becomes
tempting to (ab|mis)use it to check for branches and tags, which is
not your intention at all, and the implementation does not work well
for that purpose.

> Each caller of '__git_ref_exists' must call '__git_find_repo_path`
> first. '_git_restore' already used 'git rev-parse HEAD', but to use
> '__git_ref_exists' insert a '__git_find_repo_path' at the start.

To whom do you think the above piece of information is essential for
them to work?  Whoever updates the completion script, finds existing
use of __git_ref_exists, and thinks the helper would be useful for
their own use.  To them, the above needs be in the in-code comment
to make it discoverable.  It is OK to have it also in the proposed
log message, but it is not as essential, especially if you have it
in-code anyway.

Another thing you would need to make sure that the potential users
of this helpers understand is of course this is meant to be used
only on pseudorefs.  You can of course do so with an additional
in-code comment, but giving a more appropriate name to the helper
would be the easiest and simplest, e.g. __git_pseudoref_exists or
something.

> Reviewed-by: Patrick Steinhardt <ps@pks.im>
> Reviewed-by: Christian Couder <christian.couder@gmail.com>
> Signed-off-by: Stan Hu <stanhu@gmail.com>
> ---
>  contrib/completion/git-completion.bash | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
> index 13a39ebd2e..9fbdc13f9a 100644
> --- a/contrib/completion/git-completion.bash
> +++ b/contrib/completion/git-completion.bash
> @@ -122,6 +122,15 @@ __git ()
>  		${__git_dir:+--git-dir="$__git_dir"} "$@" 2>/dev/null
>  }
>  
> +# Runs git in $__git_repo_path to determine whether a ref exists.
> +# 1: The ref to search
> +__git_ref_exists ()
> +{
> +	local ref=$1
> +
> +	[ -f "$__git_repo_path/$ref" ]
> +}

^ permalink raw reply

* Re: [PATCH 05/24] midx: implement `DISP` chunk
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Patrick Steinhardt
In-Reply-To: <c52d7e7b27a9add4f58b8334db4fe4498af1c90f.1701198172.git.me@ttaylorr.com>

Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
> index 3696506eb3..d130e65b28 100644
> --- a/Documentation/git-multi-pack-index.txt
> +++ b/Documentation/git-multi-pack-index.txt
> @@ -49,6 +49,10 @@ write::
>  	--stdin-packs::
>  		Write a multi-pack index containing only the set of
>  		line-delimited pack index basenames provided over stdin.
> +		Lines beginning with a '+' character (followed by the
> +		pack index basename as before) have their pack marked as
> +		"disjoint". See the "`DISP` chunk and disjoint packs"
> +		section in linkgit:gitformat-pack[5] for more.

Makes one wonder who computes the set of packfiles, decides to
prefix '+' to which ones, and how it does so, none of which appear
in this step (which is understandable).  As the flow of information
is from the producer of individual "disjoint" packs (not in this
step) to this new logic in "--stdin-packs" to the new "DISP" chunk
writer (the primary focus of this step) to the final consumer of
"DISP" chunk (not in this step), we are digging from the middle
(hopefully to both directions in other steps).  It is probably the
easiest way to explain the idea to start from the primary data
structures and "DISP" seems to be a good place to start.

> +	    Two packs are "disjoint" with respect to one another when they have
> +	    disjoint sets of objects.
> + In other words, any object found in a pack
> +	    contained in the set of disjoint packfiles is guaranteed to be
> +	    uniquely located among those packs.

I often advise people to rethink what they wrote _before_ "In other
words", because the use of that phrase is a sign that the author
considers the statement is hard to grok and needs rephrasing, in
which case, the rephrased version may be a better way to explain the
concept being presented without the harder-to-grok version.

But I do not think this one is a good example to apply the advice.
It is because "In other words," is somewhat misused in the sentence.
Two "disjoint" packs do not store any common object (which is how
you defined the adjective "disjoint" in the first sentence).  "As a
consequence"/"Hence", an object found in one pack among many
"disjoint" packs will not appear in others.

By the way, how strict does this disjointness have to be?

Let's examine an extreme case.  When you have two packs that are
"mostly" disjoint, but have one single object in common, how would
that object interfere with the bulk streaming of existing packdata
out of these two packs?  Would we be able to, say, safely pretend
that the problematic single object lives only in one but not in the
other (in other words, can we safely "ignore" the presence of the
copy in the other pack)?  I think it would break down if that
ignored copy is used as a delta base of another object in the same
pack, and the base object for the delta is recorded as OFS_DELTA
(which most likely every delta is these days), because we no longer
can stream out such deltified object without re-pointing its base to
the other copy, which will in turn screw up the relative offset of
other objects in the same stream.

OK, so it seems they really need to be strictly disjoint in order to
participate in the reuse of the existing packdata.  

> +When a chunk of bytes are reused from an existing pack, any objects
> +contained therein do not need to be added to the packing list, saving
> +memory and CPU time. But a chunk from an existing packfile can only be
> +reused when the following conditions are met:
> +
> +  - The chunk contains only objects which were requested by the caller
> +    (i.e. does not contain any objects which the caller didn't ask for
> +    explicitly or implicitly).

OK.

> +  - All objects stored as offset- or reference-deltas also include their
> +    base object in the resulting pack.

Are thin packs obsolete?

> diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
> index c4c6060cee..fd24e0c952 100755
> --- a/t/t5319-multi-pack-index.sh
> +++ b/t/t5319-multi-pack-index.sh
> @@ -1157,4 +1157,62 @@ test_expect_success 'reader notices too-small revindex chunk' '
>  	test_cmp expect.err err
>  '
>  
> +test_expect_success 'disjoint packs are stored via the DISP chunk' '
> +	test_when_finished "rm -fr repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		for i in 1 2 3 4 5
> +		do
> +			test_commit "$i" &&
> +			git repack -d || return 1
> +		done &&
> +
> +		find $objdir/pack -type f -name "*.idx" | xargs -n 1 basename | sort >packs &&

That is an overly-long line.

> +test_expect_success 'non-disjoint packs are detected' '
> +	test_when_finished "rm -fr repo" &&
> +	git init repo &&
> +	(
> +		cd repo &&
> +
> +		test_commit base &&
> +		git repack -d &&
> +		test_commit other &&
> +		git repack -a &&
> +
> +		ls -la .git/objects/pack/ &&

Is this line a leftover debugging aid?

> +		find $objdir/pack -type f -name "*.idx" |
> +			sed -e "s/.*\/\(.*\)$/+\1/g" >in &&

Lose "g"; it adds unnecessary cognitive burden to the readers if the
patterh is expected to match multiple times, and you know that is
not possible (your pattern is right anchored at the end).  This may
apply equally to other uses of "sed" in this patch.

> +		test_must_fail git multi-pack-index write --stdin-packs \
> +			--bitmap <in 2>err &&
> +		grep "duplicate object.* among disjoint packs" err
> +	)
> +'
> +
>  test_done

^ permalink raw reply

* Re: [PATCH v2] hooks--pre-commit: detect non-ASCII when renaming
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Julian Prein via GitGitGadget; +Cc: git, Julian Prein
In-Reply-To: <pull.1291.v2.git.git.1701360836307.gitgitgadget@gmail.com>

"Julian Prein via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Julian Prein <druckdev@protonmail.com>
>
> When diff.renames is turned on, the diff-filter will not return renamed
> files (or copied ones with diff.renames=copy) and potential non-ASCII
> characters would not be caught by this hook.
>
> Use the plumbing command diff-index instead of the porcelain one to not
> be affected by diff.rename.

Makes sense.

An obvious alternative would be to pass "--no-renames" and keep
using the Porcelain "git diff", but this is how the plumbing
"diff-index" and friends are meant to be used.  Looking good.

Will queue.  Thanks.

^ permalink raw reply

* Re: [PATCH v3 4/4] completion: avoid user confusion in non-cone mode
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren, SZEDER Gábor
In-Reply-To: <89501b366ff0476c1b3d36ff9b6b7c80fa6fc98f.1701583024.git.gitgitgadget@gmail.com>

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> @@ -1581,7 +1581,12 @@ test_expect_success 'non-cone mode sparse-checkout uses bash completion' '
>  		# expected to be empty since we have not configured
>  		# custom completion for non-cone mode
>  		test_completion "git sparse-checkout set f" <<-\EOF
> -
> +		/folder1/0/1/t.txt 
> +		/folder1/expected 
> +		/folder1/out 
> +		/folder1/out_sorted 
> +		/folder2/0/t.txt 
> +		/folder3/t.txt 
>  		EOF

The "test_completion" helper strips "Z" at the end of its input
lines so that a hunk like this can be written without having to
worry about mail transport eating trailing whitespaces.

I'll squash the following while queuing.

Thanks.

 t/t9902-completion.sh | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 5634b78fc5..aa9a614de3 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1581,12 +1581,12 @@ test_expect_success 'non-cone mode sparse-checkout gives rooted paths' '
 		# expected to be empty since we have not configured
 		# custom completion for non-cone mode
 		test_completion "git sparse-checkout set f" <<-\EOF
-		/folder1/0/1/t.txt
-		/folder1/expected
-		/folder1/out
-		/folder1/out_sorted
-		/folder2/0/t.txt
-		/folder3/t.txt
+		/folder1/0/1/t.txt Z
+		/folder1/expected Z
+		/folder1/out Z
+		/folder1/out_sorted Z
+		/folder2/0/t.txt Z
+		/folder3/t.txt Z
 		EOF
 	)
 '

^ permalink raw reply related

* Re: [PATCH 02/12] treewide: remove unnecessary includes in source files
From: Junio C Hamano @ 2023-12-03 13:15 UTC (permalink / raw)
  To: Elijah Newren via GitGitGadget; +Cc: git, Elijah Newren
In-Reply-To: <dbfb108214d71ab29c29230eed3c4d40fe4b42b7.1701585682.git.gitgitgadget@gmail.com>

"Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/trace2.c b/trace2.c
> index 6dc74dff4c7..d4220af9ae1 100644
> --- a/trace2.c
> +++ b/trace2.c
> @@ -1,12 +1,8 @@
>  #include "git-compat-util.h"
> -#include "config.h"
> -#include "json-writer.h"
> -#include "quote.h"
>  #include "repository.h"
>  #include "run-command.h"
>  #include "sigchain.h"
>  #include "thread-utils.h"
> -#include "version.h"
>  #include "trace.h"
>  #include "trace2.h"
>  #include "trace2/tr2_cfg.h"

An in-flight topic seem to want to see git_env_bool() that is
declared in parse.h that is pulled in via inclusion of config.h
hence this hunk breaks 'seen'.

> diff --git a/t/helper/test-trace2.c b/t/helper/test-trace2.c
> index d5ca0046c89..a0032ee3964 100644
> --- a/t/helper/test-trace2.c
> +++ b/t/helper/test-trace2.c
> @@ -2,7 +2,6 @@
>  #include "strvec.h"
>  #include "run-command.h"
>  #include "exec-cmd.h"
> -#include "config.h"
>  #include "repository.h"
>  #include "trace2.h"

An in-flight topic starts using "struct key_value_info" that is
available via the inclusion of "config.h", hence this hunk breaks
the build of 'seen'.

> diff --git a/t/helper/test-fast-rebase.c b/t/helper/test-fast-rebase.c
> index cac20a72b3f..f9472c99143 100644
> --- a/t/helper/test-fast-rebase.c
> +++ b/t/helper/test-fast-rebase.c
> @@ -24,7 +24,6 @@
>  #include "read-cache-ll.h"
>  #include "refs.h"
>  #include "revision.h"
> -#include "sequencer.h"
>  #include "setup.h"
>  #include "strvec.h"
>  #include "tree.h"

I'll register the following evil merge as the merge-fix/ for this
topic.

In addition, t/helper/test-fast-rebase.c that is touched by this
step will simply disappear with the cc/git-replay topic, so it may
not be a bad idea to exclude it from the patchset.

Thanks.

diff --git a/t/helper/test-trace2.c b/t/helper/test-trace2.c
index 16089f04e1..55c06e4269 100644
--- a/t/helper/test-trace2.c
+++ b/t/helper/test-trace2.c
@@ -1,4 +1,5 @@
 #include "test-tool.h"
+#include "config-parse.h"
 #include "strvec.h"
 #include "run-command.h"
 #include "exec-cmd.h"
diff --git a/trace2.c b/trace2.c
index 4fa059199c..452428b09b 100644
--- a/trace2.c
+++ b/trace2.c
@@ -1,4 +1,5 @@
 #include "git-compat-util.h"
+#include "parse.h"
 #include "repository.h"
 #include "run-command.h"
 #include "sigchain.h"

^ permalink raw reply related

* Re: [Patch] test-lib-functions.sh : change test_i18ngrep to test_grep
From: Kousik Sanagavarapu @ 2023-12-03  8:22 UTC (permalink / raw)
  To: Shreyansh Paliwal; +Cc: git, Junio C Hamano
In-Reply-To: <CAPYXD64yCuMta_iGE+ZwgxrJn0U5shcwcB9jaiNkFhvff=R7MQ@mail.gmail.com>

Shreyansh Paliwal <shreyanshpaliwalcmsmn@gmail.com> wrote:

> Subject: [Patch] test-lib-functions.sh : change test_i18ngrep to test_grep

For anyone reading the subject, I think reading

	change test_i18ngrep to test_grep

would be confusing, as from the looks of it, the patch does remove
test_i18ngrep() and replace it with test_grep (I mean the plan is to
remove test_i18ngrep only after we are sure that it doesn't exist in the
code anywhere, anymore) but only making a change in the wording of an
error message within test_grep().

Also I think we can drop the SP after "related topic" part of the patch
and the colon (but have the SP after the colon), that is

	"test-lib-functions.sh: ..."

Also, nit, but I think we should have [PATCH] instead of [Patch]. I'm not
really sure if Junio's setup treats [PATCH] and [Patch] to be same :)

> Recently the test_i18ngrep was deprecated from the source code and
> test_grep was implemented but in the test-lib-functions.sh file , in
> the test_grep() function definition,

This recent deprecation was made in the commit,
2e87fca189 (test framework: further deprecate test_i18ngrep, 2023-10-31)
and it makes sense to include it in the commit message as the following
change is essentially something that the previous commit seems to have
forgotten to do.

> it is written BUG "too few parameters to test_i18ngrep".

I think it is not necessary to mention what is the current code
in _this case_ as it can be read in the change itself :)

> So the following patch solves the minor problem.

What exactly is the problem? I think it should be mentioned in the commit
message that the wording of the error message causes confusion ;) as when
test_grep() is used in a test and this test fails. That the change is - it
would be clear to see

	"too few parameters to test_grep"

instead of

	"too few parameters to test_i18ngrep"

> Signed-off-by: Shreyansh Paliwal <Shreyanshpaliwalcmsmn@gmail.com>
> ---
>  t/test-lib-functions.sh | 2 +-
>  1 file changed, 1 insertions(+), 1 deletions(-)
> 
>  t/test-lib-functions.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)diff --git
> a/t/test-lib-functions.sh b/t/test-lib-functions.sh
> index 9c3cf12b26..8737c95e0c 100644
> --- a/t/test-lib-functions.sh
> --- a/t/test-lib-functions.sh
> +++ b/t/test-lib-functions.sh
> @@ -1277,7 +1277,7 @@ test_grep () {
>         if test $# -lt 2 ||
>            { test "x!" = "x$1" && test $# -lt 3 ; }
>         then
> -               BUG "too few parameters to test_i18ngrep"
> +               BUG "too few parameters to test_grep"
>         fi
> 
>         if test "x!" = "x$1"
> --
> 2.43

The diff format doesn't seem proper (some repeated lines and no newlines
at the required places).

If you have no go-to tool to send patches through email then git-send-email
is a really good tool to do it. It handles most of the work for you.
"MyFirstContribution" has a guide to do so

	https://git-send-email.io/ (also has setup with GMail)
	https://git-scm.com/docs/MyFirstContribution#howto-git-send-email

Another good resource which is not linked often is

	https://flusp.ime.usp.br/git/sending-patches-by-email-with-git/

by Matheus Tavares, also a Git Contributor. It also has other useful links
which are worth a read.

Thanks

^ permalink raw reply

* [PATCH 12/12] treewide: remove unnecessary includes in source files
From: Elijah Newren via GitGitGadget @ 2023-12-03  6:41 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren
In-Reply-To: <pull.1617.git.1701585682.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/archive.c                   | 1 -
 builtin/commit-graph.c              | 1 -
 builtin/fsck.c                      | 1 -
 builtin/fsmonitor--daemon.c         | 2 --
 builtin/grep.c                      | 1 -
 builtin/mktag.c                     | 1 -
 builtin/rev-list.c                  | 1 -
 builtin/send-pack.c                 | 1 -
 commit-graph.c                      | 1 -
 compat/simple-ipc/ipc-shared.c      | 3 ---
 compat/simple-ipc/ipc-unix-socket.c | 1 -
 fsmonitor-ipc.c                     | 1 -
 http.c                              | 1 -
 line-log.c                          | 1 -
 merge-ort.c                         | 1 -
 notes-utils.c                       | 1 -
 ref-filter.c                        | 1 -
 remote-curl.c                       | 1 -
 repo-settings.c                     | 1 -
 t/helper/test-repository.c          | 1 -
 trace2/tr2_ctr.c                    | 1 -
 trace2/tr2_tmr.c                    | 1 -
 22 files changed, 25 deletions(-)

diff --git a/builtin/archive.c b/builtin/archive.c
index 90761fdfee0..15ee1ec7bb7 100644
--- a/builtin/archive.c
+++ b/builtin/archive.c
@@ -9,7 +9,6 @@
 #include "parse-options.h"
 #include "pkt-line.h"
 #include "repository.h"
-#include "sideband.h"
 
 static void create_output_file(const char *output_file)
 {
diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index 81a28c6fcdd..666ad574a46 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -4,7 +4,6 @@
 #include "environment.h"
 #include "gettext.h"
 #include "hex.h"
-#include "lockfile.h"
 #include "parse-options.h"
 #include "repository.h"
 #include "commit-graph.h"
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 9317b7b841d..a7cf94f67ed 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -10,7 +10,6 @@
 #include "refs.h"
 #include "pack.h"
 #include "cache-tree.h"
-#include "tree-walk.h"
 #include "fsck.h"
 #include "parse-options.h"
 #include "progress.h"
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 9f80b9eaff5..1593713f4cb 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -7,7 +7,6 @@
 #include "parse-options.h"
 #include "fsmonitor-ll.h"
 #include "fsmonitor-ipc.h"
-#include "fsmonitor-path-utils.h"
 #include "fsmonitor-settings.h"
 #include "compat/fsmonitor/fsm-health.h"
 #include "compat/fsmonitor/fsm-listen.h"
@@ -15,7 +14,6 @@
 #include "repository.h"
 #include "simple-ipc.h"
 #include "khash.h"
-#include "pkt-line.h"
 #include "run-command.h"
 #include "trace.h"
 #include "trace2.h"
diff --git a/builtin/grep.c b/builtin/grep.c
index f076cc705b4..c8e33f97755 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -14,7 +14,6 @@
 #include "parse-options.h"
 #include "string-list.h"
 #include "run-command.h"
-#include "userdiff.h"
 #include "grep.h"
 #include "quote.h"
 #include "dir.h"
diff --git a/builtin/mktag.c b/builtin/mktag.c
index d8e0b5afc07..4767f1a97e6 100644
--- a/builtin/mktag.c
+++ b/builtin/mktag.c
@@ -3,7 +3,6 @@
 #include "hex.h"
 #include "parse-options.h"
 #include "strbuf.h"
-#include "tag.h"
 #include "replace-object.h"
 #include "object-file.h"
 #include "object-store-ll.h"
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 460ba7cbaa7..b3f47838580 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -12,7 +12,6 @@
 #include "object-name.h"
 #include "object-file.h"
 #include "object-store-ll.h"
-#include "pack.h"
 #include "pack-bitmap.h"
 #include "log-tree.h"
 #include "graph.h"
diff --git a/builtin/send-pack.c b/builtin/send-pack.c
index 395f2e490d4..0b839f583a0 100644
--- a/builtin/send-pack.c
+++ b/builtin/send-pack.c
@@ -2,7 +2,6 @@
 #include "config.h"
 #include "hex.h"
 #include "pkt-line.h"
-#include "sideband.h"
 #include "run-command.h"
 #include "remote.h"
 #include "connect.h"
diff --git a/commit-graph.c b/commit-graph.c
index e7212400da3..15980cf9492 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -4,7 +4,6 @@
 #include "gettext.h"
 #include "hex.h"
 #include "lockfile.h"
-#include "pack.h"
 #include "packfile.h"
 #include "commit.h"
 #include "object.h"
diff --git a/compat/simple-ipc/ipc-shared.c b/compat/simple-ipc/ipc-shared.c
index e5e1dda8ccd..cb176d966f2 100644
--- a/compat/simple-ipc/ipc-shared.c
+++ b/compat/simple-ipc/ipc-shared.c
@@ -1,8 +1,5 @@
 #include "git-compat-util.h"
 #include "simple-ipc.h"
-#include "strbuf.h"
-#include "pkt-line.h"
-#include "thread-utils.h"
 
 #ifndef SUPPORTS_SIMPLE_IPC
 /*
diff --git a/compat/simple-ipc/ipc-unix-socket.c b/compat/simple-ipc/ipc-unix-socket.c
index b2f4f22ce44..9b3f2cdf8c9 100644
--- a/compat/simple-ipc/ipc-unix-socket.c
+++ b/compat/simple-ipc/ipc-unix-socket.c
@@ -2,7 +2,6 @@
 #include "gettext.h"
 #include "simple-ipc.h"
 #include "strbuf.h"
-#include "pkt-line.h"
 #include "thread-utils.h"
 #include "trace2.h"
 #include "unix-socket.h"
diff --git a/fsmonitor-ipc.c b/fsmonitor-ipc.c
index 153918cf768..45471b5b741 100644
--- a/fsmonitor-ipc.c
+++ b/fsmonitor-ipc.c
@@ -1,5 +1,4 @@
 #include "git-compat-util.h"
-#include "fsmonitor-ll.h"
 #include "gettext.h"
 #include "simple-ipc.h"
 #include "fsmonitor-ipc.h"
diff --git a/http.c b/http.c
index a64005ceb80..3565c4ec611 100644
--- a/http.c
+++ b/http.c
@@ -4,7 +4,6 @@
 #include "http.h"
 #include "config.h"
 #include "pack.h"
-#include "sideband.h"
 #include "run-command.h"
 #include "url.h"
 #include "urlmatch.h"
diff --git a/line-log.c b/line-log.c
index c276ccec549..8ff6ccb7724 100644
--- a/line-log.c
+++ b/line-log.c
@@ -12,7 +12,6 @@
 #include "xdiff-interface.h"
 #include "strbuf.h"
 #include "log-tree.h"
-#include "userdiff.h"
 #include "line-log.h"
 #include "setup.h"
 #include "strvec.h"
diff --git a/merge-ort.c b/merge-ort.c
index 2a0be468505..77ba7f3020c 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -41,7 +41,6 @@
 #include "revision.h"
 #include "sparse-index.h"
 #include "strmap.h"
-#include "submodule.h"
 #include "trace2.h"
 #include "tree.h"
 #include "unpack-trees.h"
diff --git a/notes-utils.c b/notes-utils.c
index 97c031c26ec..08e5dbc6073 100644
--- a/notes-utils.c
+++ b/notes-utils.c
@@ -5,7 +5,6 @@
 #include "gettext.h"
 #include "refs.h"
 #include "notes-utils.h"
-#include "repository.h"
 #include "strbuf.h"
 
 void create_notes_commit(struct repository *r,
diff --git a/ref-filter.c b/ref-filter.c
index 96959a3762c..01b90e325c2 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -29,7 +29,6 @@
 #include "commit-reach.h"
 #include "worktree.h"
 #include "hashmap.h"
-#include "strvec.h"
 
 static struct ref_msg {
 	const char *gone;
diff --git a/remote-curl.c b/remote-curl.c
index 55eefa70f97..7f81bf3fafc 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -11,7 +11,6 @@
 #include "run-command.h"
 #include "pkt-line.h"
 #include "string-list.h"
-#include "sideband.h"
 #include "strvec.h"
 #include "credential.h"
 #include "oid-array.h"
diff --git a/repo-settings.c b/repo-settings.c
index 525f69c0c77..30cd4787627 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -2,7 +2,6 @@
 #include "config.h"
 #include "repository.h"
 #include "midx.h"
-#include "compat/fsmonitor/fsm-listen.h"
 
 static void repo_cfg_bool(struct repository *r, const char *key, int *dest,
 			  int def)
diff --git a/t/helper/test-repository.c b/t/helper/test-repository.c
index c925655c648..0c7c5aa4dd7 100644
--- a/t/helper/test-repository.c
+++ b/t/helper/test-repository.c
@@ -3,7 +3,6 @@
 #include "commit.h"
 #include "environment.h"
 #include "hex.h"
-#include "object-store-ll.h"
 #include "object.h"
 #include "repository.h"
 #include "setup.h"
diff --git a/trace2/tr2_ctr.c b/trace2/tr2_ctr.c
index 87cf9034fba..d3a33715c14 100644
--- a/trace2/tr2_ctr.c
+++ b/trace2/tr2_ctr.c
@@ -1,5 +1,4 @@
 #include "git-compat-util.h"
-#include "thread-utils.h"
 #include "trace2/tr2_tgt.h"
 #include "trace2/tr2_tls.h"
 #include "trace2/tr2_ctr.h"
diff --git a/trace2/tr2_tmr.c b/trace2/tr2_tmr.c
index 31d0e4d1bd1..51f564b07a4 100644
--- a/trace2/tr2_tmr.c
+++ b/trace2/tr2_tmr.c
@@ -1,5 +1,4 @@
 #include "git-compat-util.h"
-#include "thread-utils.h"
 #include "trace2/tr2_tgt.h"
 #include "trace2/tr2_tls.h"
 #include "trace2/tr2_tmr.h"
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH 11/12] treewide: add direct includes currently only pulled in transitively
From: Elijah Newren via GitGitGadget @ 2023-12-03  6:41 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Elijah Newren
In-Reply-To: <pull.1617.git.1701585682.gitgitgadget@gmail.com>

From: Elijah Newren <newren@gmail.com>

The next commit will remove a bunch of unnecessary includes, but to do
so, we need some of the lower level direct includes that files rely on
to be explicitly specified.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 builtin/commit-graph.c      | 1 +
 builtin/for-each-ref.c      | 1 +
 builtin/fsmonitor--daemon.c | 1 +
 commit-graph.c              | 1 +
 4 files changed, 4 insertions(+)

diff --git a/builtin/commit-graph.c b/builtin/commit-graph.c
index c5684342ecf..81a28c6fcdd 100644
--- a/builtin/commit-graph.c
+++ b/builtin/commit-graph.c
@@ -11,6 +11,7 @@
 #include "object-store-ll.h"
 #include "progress.h"
 #include "replace-object.h"
+#include "strbuf.h"
 #include "tag.h"
 #include "trace2.h"
 
diff --git a/builtin/for-each-ref.c b/builtin/for-each-ref.c
index 6235d72f9d3..b5bc700d13c 100644
--- a/builtin/for-each-ref.c
+++ b/builtin/for-each-ref.c
@@ -1,4 +1,5 @@
 #include "builtin.h"
+#include "commit.h"
 #include "config.h"
 #include "gettext.h"
 #include "object.h"
diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
index 7260604534f..9f80b9eaff5 100644
--- a/builtin/fsmonitor--daemon.c
+++ b/builtin/fsmonitor--daemon.c
@@ -12,6 +12,7 @@
 #include "compat/fsmonitor/fsm-health.h"
 #include "compat/fsmonitor/fsm-listen.h"
 #include "fsmonitor--daemon.h"
+#include "repository.h"
 #include "simple-ipc.h"
 #include "khash.h"
 #include "pkt-line.h"
diff --git a/commit-graph.c b/commit-graph.c
index 5bfee53e87b..e7212400da3 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -1,5 +1,6 @@
 #include "git-compat-util.h"
 #include "config.h"
+#include "csum-file.h"
 #include "gettext.h"
 #include "hex.h"
 #include "lockfile.h"
-- 
gitgitgadget


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox