Git development
 help / color / mirror / Atom feed
* Re: [PATCH v7 08/12] test-http-server: add simple authentication
From: Jeff King @ 2023-01-26 10:02 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Victoria Dye,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <b8d3e81b5534148359c7e92807cf1e2795480ddf.1674252531.git.gitgitgadget@gmail.com>

On Fri, Jan 20, 2023 at 10:08:46PM +0000, Matthew John Cheetham via GitGitGadget wrote:

> +struct auth_module {
> +	char *scheme;
> +	char *challenge_params;
> +	struct string_list *tokens;
> +};

This is a really minor nit, but: why is "tokens" a pointer? It's always
initialized, so you never need or want to test it for NULL.

That would make this:

> +	if (create) {
> +		struct auth_module *mod = xmalloc(sizeof(struct auth_module));
> +		mod->scheme = xstrdup(scheme);
> +		mod->challenge_params = NULL;
> +		ALLOC_ARRAY(mod->tokens, 1);
> +		string_list_init_dup(mod->tokens);

simplify to:

  string_list_init_dup(&mod->tokens);

and one does not have to wonder why we use ALLOC_ARRAY() there, but not
when allocating the module itself. :)

Likewise you could skip freeing it, but since the memory is held until
program end anyway, that doesn't happen either way.

Certainly what you have won't behave wrong; I'd consider this more like
a coding style thing.

> +	cat >auth.config <<-EOF &&
> +	[auth]
> +		challenge = no-params
> +		challenge = with-params:foo=\"bar\" p=1
> +		challenge = with-params:foo=\"replaced\" q=1
> +
> +		token = no-explicit-challenge:valid-token
> +		token = no-explicit-challenge:also-valid
> +		token = reset-tokens:these-tokens
> +		token = reset-tokens:will-be-reset
> +		token = reset-tokens:
> +		token = reset-tokens:the-only-valid-one
> +
> +		allowAnonymous = false
> +	EOF
> +
> +	cat >OUT.expected <<-EOF &&
> +	WWW-Authenticate: no-params
> +	WWW-Authenticate: with-params foo="replaced" q=1
> +	WWW-Authenticate: no-explicit-challenge
> +	WWW-Authenticate: reset-tokens
> +
> +	Error: 401 Unauthorized
> +	EOF

OK, so I think now we are getting to the interesting part of what your
custom http-server does compared to something like apache. And the
answer so far is: custom WWW-Authenticate lines.

I think we could do that with mod_headers pretty easily. But presumably
we also want to check that we are getting the correct tokens, generate a
401, etc.

I suspect this could all be done as a CGI wrapping git-http-backend. You
can influence the HTTP response code by sending:

   Status: 401 Authorization Required
   WWW-Authenticate: whatever you want

And likewise you can see what the client sends by putting something like
this in apache.conf:

   SetEnvIf Authorization "(.*)" HTTP_AUTHORIZATION=$1

and then reading $HTTP_AUTHORIZATION as you like. At that point, it
feels like a simple shell or perl script could then decide whether to
return a 401 or not (and if not, then just exec git-http-backend to do
the rest). And the scripts would be simple enough that you could have
individual scripts to implement various schemes, rather than
implementing this configuration scheme. You can control which script is
run based on the URL; see the way we match /broken_smart/, etc, in
t/lib-httpd/apache.conf.

-Peff

^ permalink raw reply

* Re: [PATCH v7 07/12] test-http-server: pass Git requests to http-backend
From: Jeff King @ 2023-01-26  9:37 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Victoria Dye,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <ca9c2787248688cd7d8e20043a6ed75d93654e35.1674252531.git.gitgitgadget@gmail.com>

On Fri, Jan 20, 2023 at 10:08:45PM +0000, Matthew John Cheetham via GitGitGadget wrote:

> +static int is_git_request(struct req *req)
> +{
> +	static regex_t *smart_http_regex;
> +	static int initialized;
> +
> +	if (!initialized) {
> +		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
> +		/*
> +		 * This regular expression matches all dumb and smart HTTP
> +		 * requests that are currently in use, and defined in
> +		 * Documentation/gitprotocol-http.txt.
> +		 *
> +		 */
> +		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
> +			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
> +			    REG_EXTENDED)) {
> +			warning("could not compile smart HTTP regex");
> +			smart_http_regex = NULL;
> +		}
> +		initialized = 1;
> +	}
> +
> +	return smart_http_regex &&
> +		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
> +}

Assigning NULL to smart_http_regex leaks the earlier allocation. You
could free it, but I have to wonder why it is on the heap in the first
place. Yes, you check for NULL and return 0 if it failed to compile,
but...why would it? It's hard-coded. And if it does fail, wouldn't you
want to fail immediately and loudly, because it means all of the tests
are broken?

I.e., something like this is a bit simpler:

diff --git a/t/helper/test-http-server.c b/t/helper/test-http-server.c
index 14d170e640..8048ba1636 100644
--- a/t/helper/test-http-server.c
+++ b/t/helper/test-http-server.c
@@ -327,28 +327,25 @@ static enum worker_result req__read(struct req *req, int fd)
 
 static int is_git_request(struct req *req)
 {
-	static regex_t *smart_http_regex;
+	static regex_t smart_http_regex;
 	static int initialized;
 
 	if (!initialized) {
-		smart_http_regex = xmalloc(sizeof(*smart_http_regex));
 		/*
 		 * This regular expression matches all dumb and smart HTTP
 		 * requests that are currently in use, and defined in
 		 * Documentation/gitprotocol-http.txt.
 		 *
 		 */
-		if (regcomp(smart_http_regex, "^/(HEAD|info/refs|"
+		if (regcomp(&smart_http_regex, "^/(HEAD|info/refs|"
 			    "objects/info/[^/]+|git-(upload|receive)-pack)$",
 			    REG_EXTENDED)) {
-			warning("could not compile smart HTTP regex");
-			smart_http_regex = NULL;
+			die("could not compile smart HTTP regex");
 		}
 		initialized = 1;
 	}
 
-	return smart_http_regex &&
-		!regexec(smart_http_regex, req->uri_path.buf, 0, NULL, 0);
+	return !regexec(&smart_http_regex, req->uri_path.buf, 0, NULL, 0);
 }
 
 static enum worker_result do__git(struct req *req, const char *user)

> +start_http_server () {
> +	#
> +	# Launch our server into the background in repo_dir.
> +	#
> +	(
> +		cd "$REPO_DIR"
> +		test-http-server --verbose \
> +			--listen=127.0.0.1 \
> +			--port=$GIT_TEST_HTTP_PROTOCOL_PORT \
> +			--reuseaddr \
> +			--pid-file="$PID_FILE" \
> +			"$@" \
> +			2>"$SERVER_LOG" &
> +	)
> +	#
> +	# Give it a few seconds to get started.
> +	#
> +	for k in 0 1 2 3 4
> +	do
> +		if test -f "$PID_FILE"
> +		then
> +			return 0
> +		fi
> +		sleep 1
> +	done

Yuck. This makes the test take a long time to run, since it will almost
always "sleep 1" each time (and it looks like you bring the server up
and down in several tests). Worse, it's at risk of failing racily if it
ever takes more than 5 seconds to start up. That should be uncommon, I'd
think, but could happen on a heavily loaded system.

There's a race-less solution using fifos in lib-git-daemon.sh, where we
wait for the "ready to rumble" line. It's kind of horrific, but it does
work and is battle-tested.

-Peff

^ permalink raw reply related

* Feature Request - GIT config - Reference value of init.defaultBranch in alias
From: Michal Aron @ 2023-01-26  9:33 UTC (permalink / raw)
  To: git

Hello,

I am curious why it is not possible to reference other variables in git config.

Useful scenario:

Global git config containing all useful aliases which often reference
the default main branch. I discovered the config variable
init.defaultBranch and I would like to reference it in the aliases.

1) global config having init.defaultBranch = master
2) global config having alias com = checkout [ init.defaultBranch ]
3) local repos replacing this value e.g. to init.defaultBranch = main
4) using this alias in local repo git com, which will checkout me to
the init.defaultBranch of this repository..

This would allow us to use global aliases independently on the
repositories.. If this has not yet been discussed I believe it is the
time since many repositories are being customized to the company
team/company culture.

Appreciate and thanks, Michal

^ permalink raw reply

* Re: [PATCH v7 06/12] test-http-server: add HTTP request parsing
From: Jeff King @ 2023-01-26  9:30 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Victoria Dye,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <43f1cdcbb82022521558dc649213eb4538364870.1674252531.git.gitgitgadget@gmail.com>

On Fri, Jan 20, 2023 at 10:08:44PM +0000, Matthew John Cheetham via GitGitGadget wrote:

> +#define REQ__INIT { \
> +     .start_line = STRBUF_INIT, \
> +     .uri_path = STRBUF_INIT, \
> +     .query_args = STRBUF_INIT, \
> +     .header_list = STRING_LIST_INIT_NODUP, \
> +     .content_type = NULL, \
> +     .content_length = 0, \
> +     .has_content_length = 0, \
> +}

We declare header_list as nodup, but later we put actual duplicated
strings in it:

> +             hp = strbuf_detach(&h, NULL);
> +             string_list_append(&req->header_list, hp);

So later when we free it:

> +static void req__release(struct req *req)
> +{
> +     strbuf_release(&req->start_line);
> +
> +     strbuf_release(&req->uri_path);
> +     strbuf_release(&req->query_args);
> +
> +     string_list_clear(&req->header_list, 0);
> +}

the strings will be leaked. There are a lot of solutions here, including
setting strdup_strings right before freeing. But it's probably
reasonable to just use INIT_DUP, and then when storing, just do:

  string_list_append(&req->header_list, h.buf);

Since "h" is filled by strbuf_getwholeline(), there's no need to erase
the contents. It should reset the buffer itself (and so you end up
re-using the same buffer, rather than freeing it for each loop).  You
will have to remember to strbuf_release() after the loop, though.

The leak isn't very big, and we hold onto it until the process ends
anyway, but it will probably cause the leak-detector to complain.

(Yet another solution would just be to dump the trace as we parse the
headers, rather than holding them, since that appears to be the only use
of header_list).

> +	/*
> +	 * Read the set of HTTP headers into a string-list.
> +	 */
> +	while (1) {
> +		if (strbuf_getwholeline_fd(&h, fd, '\n') == EOF)
> +			goto done;
> +		strbuf_trim_trailing_newline(&h);
> +
> +		if (!h.len)
> +			goto done; /* a blank line ends the header */
> +
> +		hp = strbuf_detach(&h, NULL);
> +		string_list_append(&req->header_list, hp);
> +
> +		/* also store common request headers as struct req members */
> +		if (skip_iprefix(hp, "Content-Type: ", &hv)) {
> +			req->content_type = hv;

I think this is stricter than necessary. The whitespace after the colon
is optional, but can also be longer than just one space (or could be a
tab). It's probably OK to be picky here since this is just for tests,
but we'd want to make sure we're not this picky on the client side.

> +		} else if (skip_iprefix(hp, "Content-Length: ", &hv)) {
> +			/*
> +			 * Content-Length is always non-negative, but has no
> +			 * upper bound according to RFC 7230 (§3.3.2).
> +			 */
> +			intmax_t len = 0;
> +			if (sscanf(hv, "%"PRIdMAX, &len) != 1 || len < 0 ||
> +			    len == INTMAX_MAX) {
> +				logerror("invalid content-length: '%s'", hv);
> +				result = WR_CLIENT_ERROR;
> +				goto done;
> +			}

We usually avoid sscanf because it's error-checking sucks. For example,
this will accept "123.garbage", but you can't tell because you have no
clue how far it got.  Something like strtoimax() is better. It probably
doesn't matter much since this is test code, though I do think in the
long run it would be nice to add scanf(), etc, to our list of banned
functions (there are one or two other uses currently, though, so that
isn't imminent).

-Peff

^ permalink raw reply

* Re: [PATCH v6 5/6] diff-lib: parallelize run_diff_files for submodules
From: Glen Choo @ 2023-01-26  9:16 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, emilyshaffer, avarab, phillip.wood123, newren,
	jonathantanmy
In-Reply-To: <20230117193041.708692-6-calvinwan@google.com>


Calvin Wan <calvinwan@google.com> writes:

> @@ -226,6 +242,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
>  			newmode = ce->ce_mode;
>  		} else {
>  			struct stat st;
> +			unsigned ignore_untracked = 0;
> +			int defer_submodule_status = !!revs->repo;

What is the reasoning behind this condition? I would expect revs->repo
to always be set, and we would always end up deferring.

>  			newmode = ce_mode_from_stat(ce, st.st_mode);
> +			if (defer_submodule_status) {
> +				struct submodule_status_util tmp = {
> +					.changed = changed,
> +					.dirty_submodule = 0,
> +					.ignore_untracked = ignore_untracked,
> +					.newmode = newmode,
> +					.ce = ce,
> +					.path = ce->name,
> +				};
> +				struct string_list_item *item;
> +
> +				item = string_list_append(&submodules, ce->name);
> +				item->util = xmalloc(sizeof(tmp));
> +				memcpy(item->util, &tmp, sizeof(tmp));

(Not a C expert) Since we don't return the string list, I wonder if we
can avoid the memcpy() by using &tmp like so:

  struct string_list_item *item;
  item = string_list_append(&submodules, ce->name);
  item->util = &tmp;

And then when we call string_list_clear(), we wouldn't need to free the
util since we exit the stack frame.

> +test_expect_success 'diff in superproject with submodules respects parallel settings' '
> +	test_when_finished "rm -f trace.out" &&
> +	(
> +		GIT_TRACE=$(pwd)/trace.out git diff &&
> +		grep "1 tasks" trace.out &&
> +		>trace.out &&
> +
> +		git config submodule.diffJobs 8 &&
> +		GIT_TRACE=$(pwd)/trace.out git diff &&
> +		grep "8 tasks" trace.out &&
> +		>trace.out &&
> +
> +		GIT_TRACE=$(pwd)/trace.out git -c submodule.diffJobs=0 diff &&
> +		grep "preparing to run up to [0-9]* tasks" trace.out &&
> +		! grep "up to 0 tasks" trace.out &&
> +		>trace.out
> +	)
> +'
> +

Could we get tests to check that the output of git diff isn't changed by
setting parallelism? This might not be feasible for submodule.diffJobs >
1 due to raciness, but it would be good to see for submodule.diffJobs =
1 at least.

>  test_expect_success 'git diff --raw HEAD' '
>  	hexsz=$(test_oid hexsz) &&
>  	git diff --raw --abbrev=$hexsz HEAD >actual &&
> diff --git a/t/t7506-status-submodule.sh b/t/t7506-status-submodule.sh
> index d050091345..52a82b703f 100755
> --- a/t/t7506-status-submodule.sh
> +++ b/t/t7506-status-submodule.sh
> @@ -412,4 +412,23 @@ test_expect_success 'status with added file in nested submodule (short)' '
>  	EOF
>  '
>  
> +test_expect_success 'status in superproject with submodules respects parallel settings' '
> +	test_when_finished "rm -f trace.out" &&
> +	(
> +		GIT_TRACE=$(pwd)/trace.out git status &&
> +		grep "1 tasks" trace.out &&
> +		>trace.out &&
> +
> +		git config submodule.diffJobs 8 &&
> +		GIT_TRACE=$(pwd)/trace.out git status &&
> +		grep "8 tasks" trace.out &&
> +		>trace.out &&
> +
> +		GIT_TRACE=$(pwd)/trace.out git -c submodule.diffJobs=0 status &&
> +		grep "preparing to run up to [0-9]* tasks" trace.out &&
> +		! grep "up to 0 tasks" trace.out &&
> +		>trace.out
> +	)
> +'
> +

Ditto for "status".

^ permalink raw reply

* Re: [PATCH v6 5/6] diff-lib: parallelize run_diff_files for submodules
From: Glen Choo @ 2023-01-26  9:09 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, emilyshaffer, avarab, phillip.wood123, newren,
	jonathantanmy
In-Reply-To: <20230117193041.708692-6-calvinwan@google.com>

As Jonathan mentioned in [1], I think we should refactor functions out
from the serial implementation in a preparatory patch, then use those
functions to implement the parallel version in this patch. In its
current form, there is a fair amount of duplicated code, which makes it
tricky to review because of the additional overhead of checking what the
duplicated code does and whether we've copied it correcly.

For cleanliness, I'll only point out the duplicated code in this email;
I'll comment on other things I spotted in another one.

[1] https://lore.kernel.org/git/20221128210125.2751300-1-jonathantanmy@google.com/

Calvin Wan <calvinwan@google.com> writes:

> +		for (size_t i = 0; i < submodules.nr; i++) {
> +			struct submodule_status_util *util = submodules.items[i].util;
> +			struct cache_entry *ce = util->ce;
> +			unsigned int oldmode;
> +			const struct object_id *old_oid, *new_oid;
> +
> +			if (!util->changed && !util->dirty_submodule) {
> +				ce_mark_uptodate(ce);
> +				mark_fsmonitor_valid(istate, ce);
> +				if (!revs->diffopt.flags.find_copies_harder)
> +					continue;
> +			}
> +			oldmode = ce->ce_mode;
> +			old_oid = &ce->oid;
> +			new_oid = util->changed ? null_oid() : &ce->oid;
> +			diff_change(&revs->diffopt, oldmode, util->newmode,
> +				    old_oid, new_oid,
> +				    !is_null_oid(old_oid),
> +				    !is_null_oid(new_oid),
> +				    ce->name, 0, util->dirty_submodule);
> +		}
> +	}

The lines from "if (!util->changed && !util->dirty_submodule)" onwards
are copied from earlier in run_diff_files(). This might be refactored
into something like diff_submodule_change().

> +static struct status_task *
> +get_status_task_from_index(struct submodule_parallel_status *sps,
> +			   struct strbuf *err)
> +{
> +	for (; sps->index_count < sps->submodule_names->nr; sps->index_count++) {
> +		struct submodule_status_util *util = sps->submodule_names->items[sps->index_count].util;
> +		struct status_task *task;
> +		struct strbuf buf = STRBUF_INIT;
> +		const char *git_dir;
> +
> +		strbuf_addf(&buf, "%s/.git", util->path);
> +		git_dir = read_gitfile(buf.buf);

This...

> +static int get_next_submodule_status(struct child_process *cp,
> +				     struct strbuf *err, void *data,
> +				     void **task_cb)
> +{
> +	struct submodule_parallel_status *sps = data;
> +	struct status_task *task = get_status_task_from_index(sps, err);
> +
> +	if (!task)
> +		return 0;
> +
> +	child_process_init(cp);
> +	prepare_submodule_repo_env_in_gitdir(&cp->env);
> +
> +	strvec_init(&cp->args);
> +	strvec_pushl(&cp->args, "status", "--porcelain=2", NULL);
> +	if (task->ignore_untracked)
> +		strvec_push(&cp->args, "-uno");
> +
> +	prepare_submodule_repo_env(&cp->env);
> +	cp->git_cmd = 1;

this...

> +static int status_start_failure(struct strbuf *err,
> +				void *cb, void *task_cb)
> +{
> +	struct submodule_parallel_status *sps = cb;
> +	struct status_task *task = task_cb;
> +
> +	sps->result = 1;
> +	strbuf_addf(err,
> +	    _("could not run 'git status --porcelain=2' in submodule %s"),
> +	    task->path);
> +	return 0;
> +}

this...

> +static int status_finish(int retvalue, struct strbuf *err,
> +			 void *cb, void *task_cb)
> +{
> +	struct submodule_parallel_status *sps = cb;
> +	struct status_task *task = task_cb;
> +	struct string_list_item *it =
> +		string_list_lookup(sps->submodule_names, task->path);
> +	struct submodule_status_util *util = it->util;
> +
> +	if (retvalue) {
> +		sps->result = 1;
> +		strbuf_addf(err,
> +		    _("'git status --porcelain=2' failed in submodule %s"),
> +		    task->path);
> +	}

and this are all copied from different parts of is_submodule_modified().
To refactor them out, I think we could combine the first two into
"setup_submodule_status()". The last one could be moved into
"process_submodule_status_result()" or perhaps we could find a way to
combine it into parse_status_porcelain().

^ permalink raw reply

* Re: [PATCH v7 04/12] test-http-server: add stub HTTP server test helper
From: Jeff King @ 2023-01-26  8:58 UTC (permalink / raw)
  To: Matthew John Cheetham via GitGitGadget
  Cc: git, Derrick Stolee, Lessley Dennington, Matthew John Cheetham,
	M Hickford, Jeff Hostetler, Glen Choo, Victoria Dye,
	Ævar Arnfjörð Bjarmason
In-Reply-To: <17c890ee1080abc81267e44a1eaff4609ee41690.1674252531.git.gitgitgadget@gmail.com>

On Fri, Jan 20, 2023 at 10:08:42PM +0000, Matthew John Cheetham via GitGitGadget wrote:

> Introduce a mini HTTP server helper that in the future will be enhanced
> to provide a frontend for the git-http-backend, with support for
> arbitrary authentication schemes.
> 
> Right now, test-http-server is a pared-down copy of the git-daemon that
> always returns a 501 Not Implemented response to all callers.

This may be a dumb question, but I didn't see it raised or answered in
the cover letter or earlier in the thread: what does this custom server
give us that our current use of apache in the tests does not?

I'd imagine the answer is along the lines of: configuring apache to
respond to auth in the way we'd like is hard and/or impossible. And if
so, and if it's just "hard", I'd ask "how hard?".

Because I see a few downsides to introducing a custom server here:

  1. It may or may not behave like real-world servers, which makes the
     test slightly less good. Not that apache can claim to cover all
     real-world behavior, but it's probably closer to reality (and that
     has flushed out interesting bugs and behaviors before).

  2. It's a non-trivial amount of code, doing tricky things like
     daemonizing, socket setup and I/O, pidfiles, and so on.  For
     example, it handles multiple listen addresses, and ipv6. Do we
     really need that? And we take shortcuts around things like CGI
     output buffering. I do see that you tried to reuse some existing
     code, but...

  3. You're reusing parts of git-daemon, which I personally consider to
     be one of the absolute low-points of code quality inside git.git. I
     know that's a subjective statement, but my experience running it
     within GitHub was that there were a lot of rough edges, and we
     ended up rewriting several parts of it. In particular, the
     child-handling is inefficient (I seem to recall that it's quadratic
     in several places) and has odd behaviors (its kill_some_child() is
     basically nonsense, and can starve requests). Probably none of that
     matters for your use case in tests, which is likely doing one
     request at a time.

     But then I'd wonder: do we really need those bits at all, then? In
     fact, would it be sufficient to write the server to handle one
     request at a time, without spawning a worker child at all?

I dunno. I know I am showing up to review quite late in the life of this
patch series, and that probably makes me a bad person to start the
review with "and could you re-do the whole test infrastructure". So if
you want to tell me to get lost, I'd understand. But I had hoped that
one day we could just delete all of daemon.c, and this moves in the
opposite direction.

I think my order of preference (if you care ;) ) is:

  1. Can we do it with apache?

  2. If not, could we do it with a trivial application of some existing
     http server framework? I know that may mean extra dependencies, but
     there's a lot of perl in the test suite already, and it doesn't
     seem too terrible to me to require it for these tests.

  3. If not, can we make the http-server code even more minimal?

>  Makefile                            |   1 +
>  contrib/buildsystems/CMakeLists.txt |  11 +-
>  t/helper/.gitignore                 |   1 +
>  t/helper/test-http-server.c         | 381 ++++++++++++++++++++++++++++
>  4 files changed, 392 insertions(+), 2 deletions(-)
>  create mode 100644 t/helper/test-http-server.c

If we do use this code, here are a few small bits I noticed:

> +static void child_handler(int signo)
> +{
> +	/*
> +	 * Otherwise empty handler because systemcalls will get interrupted
> +	 * upon signal receipt
> +	 * SysV needs the handler to be rearmed
> +	 */
> +	signal(SIGCHLD, child_handler);
> +}

daemon.c has this, too. If we're going to share its child-handling code,
should it maybe just handle this part, too?

> +static int service_loop(struct socketlist *socklist)
> +{
> +	struct pollfd *pfd;
> +	int i;
> +
> +	CALLOC_ARRAY(pfd, socklist->nr);

(Actually, Coverity noticed this, not me).

This pfd is never freed. I know this is copied from daemon.c, but in
that file we never return from the function. Here you do break out of
the loop and try to clean up; you'd want to free(pfd) there.

> +	for (;;) {
> +		int i;
> +		int nr_ready;
> +		int timeout = (pid_file ? 100 : -1);
> +
> +		check_dead_children(first_child, &live_children, loginfo);
> +
> +		nr_ready = poll(pfd, socklist->nr, timeout);
> +		if (nr_ready < 0) {
> +			if (errno != EINTR) {
> +				logerror("Poll failed, resuming: %s",
> +				      strerror(errno));
> +				sleep(1);
> +			}
> +			continue;
> +		}
> +		else if (nr_ready == 0) {
> +			/*
> +			 * If we have a pid_file, then we watch it.
> +			 * If someone deletes it, we shutdown the service.
> +			 * The shell scripts in the test suite will use this.
> +			 */
> +			if (!pid_file || file_exists(pid_file))
> +				continue;
> +			goto shutdown;
> +		}

I wondered how this would work, since removal of the pid file won't
trigger poll(). But it looks like you set the timeout unconditionally in
that case, so we're effectively polling for its removal every 100ms.
It's not beautiful, but it should work reliably.

That also made me wonder about this timeout:

> +		if (skip_prefix(arg, "--timeout=", &v)) {
> +			timeout = atoi(v);
> +			continue;
> +		}

but it is not used. The "timeout" in service_loop shadows the global,
and nobody ever looks at the global (however, it looks like a later
patch adds an alarm() which uses it).

> +		if (skip_prefix(arg, "--max-connections=", &v)) {
> +			max_connections = atoi(v);
> +			if (max_connections < 0)
> +				max_connections = 0; /* unlimited */
> +			continue;
> +		}

I don't think any caller ever uses --max-connections, though. This could
be dropped, and that would simplify service_loop a bit.

> +	if (listen_port == 0)
> +		listen_port = DEFAULT_GIT_PORT;

That's a funny default. Surely "80" or even "8080" would make more
sense. But really, since our purpose is tests, isn't it a
misconfiguration if the test does not tell us which port (which is
generally dynamic based on the test number), and we should bail?

> +	/*
> +	 * If no --listen=<addr> args are given, the setup_named_sock()
> +	 * code will use receive a NULL address and set INADDR_ANY.
> +	 * This exposes both internal and external interfaces on the
> +	 * port.
> +	 *
> +	 * Disallow that and default to the internal-use-only loopback
> +	 * address.
> +	 */
> +	if (!listen_addr.nr)
> +		string_list_append(&listen_addr, "127.0.0.1");

Likewise, it seems like you could probably ditch --listen entirely, and
just always listen on 127.0.0.1, for the purposes of the tests.

-Peff

^ permalink raw reply

* Re: [PATCH v6 6/6] submodule: call parallel code from serial status
From: Glen Choo @ 2023-01-26  8:45 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, emilyshaffer, avarab, phillip.wood123, newren,
	jonathantanmy
In-Reply-To: <kl6lv8ktvhrp.fsf@chooglen-macbookpro.roam.corp.google.com>

Glen Choo <chooglen@google.com> writes:

> Calvin Wan <calvinwan@google.com> writes:
>
>> Remove the serial implementation of status inside of
>> is_submodule_modified since the parallel implementation of status with
>> one job accomplishes the same task.
>
> I see that this is in direct response to Jonathan's earlier comment [1]
> that we should have only one implementation. Thanks, this is helpful.
> Definitely a step in the right direction.
>
> That said, I don't think this patch's position in the series makes
> sense. I would have expected a patch like this to come before 5/6. I.e.
> this series duplicates code in 5/6 and deletes it in 6/6 so that we only
> have one implementation for both serial and parallel submodule status.
>
> Instead, I would have expected we would refactor out the serial
> implementation, then use the refactored code for the parallel
> implementation. Not having duplicated code in 5/6 would shrink the line
> count a lot and make it easier to review.
>
> [1] https://lore.kernel.org/git/20221128210125.2751300-1-jonathantanmy@google.com/

Ah, I realize I completely misunderstood this patch. I thought that this
was deleting code that was duplicated between the serial and parallel
implementations in 5/6 such that both ended up sharing just one copy of
the code.

Instead, this patch deletes the serial implementation altogether and
replaces it with the parallel one. As such, this patch can't come
earlier than 5/6, because we need the parallel implementation to exist
before we can use it.

For reviewability of 5/6, I'd still strongly prefer that we refactor out
functions (I'll leave more specific comments on that patch). We could
still consider replacing the serial implementation with "parallel with a
single job", though I suspect that it will be unnecessary if we do the
refactoring well. I'm also not sure how idiomatic it is to call
run_processes_parallel() with a hardcoded value of 1, but I don't feel
too strongly about that.

^ permalink raw reply

* Re: [PATCH v6 6/6] submodule: call parallel code from serial status
From: Glen Choo @ 2023-01-26  8:09 UTC (permalink / raw)
  To: Calvin Wan, git
  Cc: Calvin Wan, emilyshaffer, avarab, phillip.wood123, newren,
	jonathantanmy
In-Reply-To: <20230117193041.708692-7-calvinwan@google.com>

Calvin Wan <calvinwan@google.com> writes:

> Remove the serial implementation of status inside of
> is_submodule_modified since the parallel implementation of status with
> one job accomplishes the same task.
>
> Combine parse_status_porcelain and parse_status_porcelain_strbuf since
> the only other caller of parse_status_porcelain was in
> is_submodule_modified

I see that this is in direct response to Jonathan's earlier comment [1]
that we should have only one implementation. Thanks, this is helpful.
Definitely a step in the right direction.

That said, I don't think this patch's position in the series makes
sense. I would have expected a patch like this to come before 5/6. I.e.
this series duplicates code in 5/6 and deletes it in 6/6 so that we only
have one implementation for both serial and parallel submodule status.

Instead, I would have expected we would refactor out the serial
implementation, then use the refactored code for the parallel
implementation. Not having duplicated code in 5/6 would shrink the line
count a lot and make it easier to review.

[1] https://lore.kernel.org/git/20221128210125.2751300-1-jonathantanmy@google.com/

^ permalink raw reply

* Re: [PATCH v2] ls-tree: add --sparse-filter-oid argument
From: Elijah Newren @ 2023-01-26  3:25 UTC (permalink / raw)
  To: William Sprent
  Cc: William Sprent via GitGitGadget, git, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Victoria Dye
In-Reply-To: <18c94f70-4adf-1b4a-8777-206804c419e6@unity3d.com>

On Wed, Jan 25, 2023 at 8:16 AM William Sprent <williams@unity3d.com> wrote:
>
> On 25/01/2023 06.11, Elijah Newren wrote:
> > It looks like Ævar and Victoria have both given really good reviews
> > already, but I think I spotted some additional things to comment on.
> >
> > On Mon, Jan 23, 2023 at 3:46 AM William Sprent via GitGitGadget
> > <gitgitgadget@gmail.com> wrote:
> >>
> >> From: William Sprent <williams@unity3d.com>
> >>
> >> There is currently no way to ask git the question "which files would be
> >> part of a sparse checkout of commit X with sparse checkout patterns Y".
> >> One use-case would be that tooling may want know whether sparse checkouts
> >> of two commits contain the same content even if the full trees differ.
> >
> > Could you say more about this usecase?  Why does tooling need or want
> > to know this; won't a checkout of the new commit end up being quick
> > and simple?  (I'm not saying your usecase is bad, just curious that it
> > had never occurred to me, and I'm afraid I'm still not sure what your
> > purpose might be.)
> >
>
> I'm thinking mainly about a monorepo context where there are a number of
> distinct 'units' that can be described with sparse checkout patterns.
> And perhaps there's some tooling that only wants to perform an action if
> the content of a 'unit' changes.

So, you're basically wanting to do something like
   git ls-tree --paths-matching-sparsity-file=<pattern-file> $COMMIT1
>sparse-files
   git ls-tree --paths-matching-sparsity-file=<pattern-file> $COMMIT2
>>sparse-files
   sort sparse-files | uniq >relevant-files
   git diff --name-only $COMMIT1 $COMMIT2 >changed-files
and then checking whether relevant-files and changed-files have a
non-empty intersection?

Would that potentially be better handled by
   git diff --name-only $COMMIT1 $COMMIT2 | git check-ignore
--ignore-file=<pattern-file> --stdin
and seeing whether the output is non-empty?  We'd have to add an
"--ignore-file" option to check-ignore to override reading of
.gitignore files and such, and it'd be slightly confusing because the
documentation talks about "ignored" files rather than "selected"
files, but that's a confusion point that has been with us ever since
the gitignore mechanism was repurposed for sparse checkouts.  Or maybe
we could just add a check-sparsity helper, and then allow it to take
directories in-lieu of patterns.

This seems nicer than opening a can of worms about letting every git
command specify a different set of sparsity rules.

> Depending on the repo, it won't necessarily be quick to check out the
> commit with the given patterns. However, it is more about it being
> inconvenient to have to have a working directory, especially so if you
> want use the tooling in some kind of service or query rapidly about
> different revisions/patterns.
>
> >> Another interesting use-case would be for tooling to use in conjunction
> >> with 'git update-index --index-info'.
> >
> > Sorry, I'm not following.  Could you expound on this a bit?
> >
>
> I was imagining something along the lines of being able to generate new
> tree objects based on what matches the given sparse checkout patterns.
> Not that I have a specific use case for it right now.
>
> I think what I'm trying to evoke with that paragraph is that this
> enables integrations with git that seem interesting and weren't possible
> before.

I'm not sure if it's interesting, frightening, or something else.
Hard to say without better descriptions of usecases, which we can't
have if we don't even have a usecase.  I think I'd just strike this
paragraph.

[...]
> >> +       (*d)->pl.use_cone_patterns = core_sparse_checkout_cone;
> >
> > Hmm, so the behavior still depends upon the current sparse-checkout
> > (or lack thereof), despite the documentation and rationale of your
> > feature as being there to check how a different sparse checkout would
> > behave?
> >
> > I would hate to unconditionally turn cone_patterns off, since that
> > would come with a huge performance penalty for the biggest repos.  But
> > turning it unconditionally on wouldn't be good for the non-cone users.
> > This probably suggests we need something like another flag, or perhaps
> > separate flags for each mode.  Separate flags might provide the
> > benefit of allowing cone mode users to specify directories rather than
> > patterns, which would make it much easier for them to use.
> >
> I used 'core_sparse_checkout_cone' because I wanted to allow for the
> cone mode optimisations, but I also figured that I should respect the
> configuration. It doesn't change how the patterns are parsed in this case.
>
> I agree that it is a bit awkward to have to "translate" the directories
> into patterns when wanting to use cone mode. I can try adding
> '--[no]-cone' flags and see how that feels. Together with Victoria's
> suggestions that would result in having the following flags:
>
> * --scope=(sparse|all)
> * --sparse-patterns-file=<path>
> * --[no]-cone: used together with --sparse-patterns-file to tell git
>    whether to interpret the patterns given as directories (cone) or
>    patterns (no-cone).
>
> Which seems like a lot at first glance. But it allows for passing
> directories instead of patterns for cone mode, and is similar to the
> behaviour of 'sparse-checkout set'.
>
> Does that seem like something that would make sense?

--sparse-patterns-file still implies patterns; I think that would need
some rewording.

More importantly, though, based on your usecase description, I wonder
if you might be better served by either extending the check-ignore
subcommand or adding a similar helper ("check-sparsity"?), rather than
tweaking ls-tree.

^ permalink raw reply

* Re: [PATCH v2 1/3] branch: fix die_if_checked_out() when ignore_current_worktree
From: Rubén Justo @ 2023-01-26  3:07 UTC (permalink / raw)
  To: phillip.wood, Junio C Hamano; +Cc: Git List
In-Reply-To: <1a7fa327-3833-8da3-46d7-60bfe8dae82c@dunelm.org.uk>

On 24-ene-2023 10:35:26, Phillip Wood wrote:

> On 22/01/2023 23:21, Rubén Justo wrote:
> > I tried to maintain the relationship and the role, too.  Just introduce
> > the helper, as Phillip suggested and I think it is a good idea.
> 
> When I suggested adding a helper I was thinking of something like
> 
> static const struct worktree *do_find_shared_symref(struct worktree
> **worktrees,
>  					  const char *symref,
>  					  const char *target,
> 					  int ignore_current)
> {
> 	/*
> 	 * Body moved from find_share_symref() with a couple
> 	 * of lines added to support ignore_current
> 	 /*
> }
> 
> const struct worktree *find_shared_symref(struct worktree **worktrees,
>  					  const char *symref,
>  					  const char *target)
> {
> 	return do_find_shared_symref(worktrees, symref, target, 0)
> }
> 
> void die_if_checked_out(const char *branch, int ignore_current_worktree)
> {
> 	struct worktree **worktrees = get_worktrees();
> 	const struct worktree *wt;
> 
> 	wt = do_find_shared_symref(worktrees, "HEAD", branch,
> 				   ignore_current_worktree);
> 	/* rest unchanged */
> }
> 
> The aim was to avoid changing the public api

I thought about a solution like the one you suggest.  Also another one based on
iterations, something like wt_head_refs()....  I ended up with
is_shared_symref(), it adds some value, I think.

The public API remains unchanged, and I like that the comment for
find_shared_ref(), which is an important note, is moved to is_shared_symref(),
which contains the essential work related to the comment.

die_if_checked_out() needs to iterate (here was the wt_heads_refs()), but my
intuition makes me think it's a good step since we might need another level,
I'm not sure yet but, like "die_if_if_checked_out(allow_if_current)".

I'm going to send a v3 addressing the issues Junio commented, still with the
is_shared_symref().  If the above reasoning doesn't work for you or if the
change as-is introduces any problem, I have no objection to
"do_find_shared_symref()".

Thank you.

^ permalink raw reply

* Re: [PATCH v5 19/19] push: free_refs() the "local_refs" in set_refspecs()
From: Elijah Newren @ 2023-01-26  2:44 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-19.19-f29500a4abc-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 5:08 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Fix a memory leak that's been with us since this code was added in
> ca02465b413 (push: use remote.$name.push as a refmap, 2013-12-03).
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/push.c                          | 1 +
>  t/t1416-ref-transaction-hooks.sh        | 1 +
>  t/t2402-worktree-list.sh                | 1 +
>  t/t5504-fetch-receive-strict.sh         | 1 +
>  t/t5523-push-upstream.sh                | 1 +
>  t/t5529-push-errors.sh                  | 2 ++
>  t/t5546-receive-limits.sh               | 2 ++
>  t/t5547-push-quarantine.sh              | 2 ++
>  t/t5606-clone-options.sh                | 1 +
>  t/t5810-proto-disable-local.sh          | 2 ++
>  t/t5813-proto-disable-ssh.sh            | 2 ++
>  t/t7409-submodule-detached-work-tree.sh | 1 +
>  t/t7416-submodule-dash-url.sh           | 2 ++
>  t/t7450-bad-git-dotfiles.sh             | 2 ++
>  14 files changed, 21 insertions(+)
>
> diff --git a/builtin/push.c b/builtin/push.c
> index 60ac8017e52..f48e4c6a856 100644
> --- a/builtin/push.c
> +++ b/builtin/push.c
> @@ -129,6 +129,7 @@ static void set_refspecs(const char **refs, int nr, const char *repo)
>                 } else
>                         refspec_append(&rs, ref);
>         }
> +       free_refs(local_refs);

In the cover letter, you said you took Rene's feedback as a possible
future improvement, but perhaps it's at least calling out in the code
with a TODO comment?


>  }
>
>  static int push_url_of_remote(struct remote *remote, const char ***url_p)
> diff --git a/t/t1416-ref-transaction-hooks.sh b/t/t1416-ref-transaction-hooks.sh
> index 27731722a5b..b32ca798f9f 100755
> --- a/t/t1416-ref-transaction-hooks.sh
> +++ b/t/t1416-ref-transaction-hooks.sh
> @@ -5,6 +5,7 @@ test_description='reference transaction hooks'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success setup '
> diff --git a/t/t2402-worktree-list.sh b/t/t2402-worktree-list.sh
> index 79e0fce2d90..9ad9be0c208 100755
> --- a/t/t2402-worktree-list.sh
> +++ b/t/t2402-worktree-list.sh
> @@ -5,6 +5,7 @@ test_description='test git worktree list'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup' '
> diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
> index ac4099ca893..14e8af1f3b7 100755
> --- a/t/t5504-fetch-receive-strict.sh
> +++ b/t/t5504-fetch-receive-strict.sh
> @@ -4,6 +4,7 @@ test_description='fetch/receive strict mode'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup and inject "corrupt or missing" object' '
> diff --git a/t/t5523-push-upstream.sh b/t/t5523-push-upstream.sh
> index fdb42920564..c9acc076353 100755
> --- a/t/t5523-push-upstream.sh
> +++ b/t/t5523-push-upstream.sh
> @@ -4,6 +4,7 @@ test_description='push with --set-upstream'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  . "$TEST_DIRECTORY"/lib-terminal.sh
>
> diff --git a/t/t5529-push-errors.sh b/t/t5529-push-errors.sh
> index ce85fd30ad1..0247137cb36 100755
> --- a/t/t5529-push-errors.sh
> +++ b/t/t5529-push-errors.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='detect some push errors early (before contacting remote)'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup commits' '
> diff --git a/t/t5546-receive-limits.sh b/t/t5546-receive-limits.sh
> index 0b0e987fdb7..eed3c9d81ab 100755
> --- a/t/t5546-receive-limits.sh
> +++ b/t/t5546-receive-limits.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='check receive input limits'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  # Let's run tests with different unpack limits: 1 and 10000
> diff --git a/t/t5547-push-quarantine.sh b/t/t5547-push-quarantine.sh
> index 1876fb34e51..9f899b8c7d7 100755
> --- a/t/t5547-push-quarantine.sh
> +++ b/t/t5547-push-quarantine.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='check quarantine of objects during push'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'create picky dest repo' '
> diff --git a/t/t5606-clone-options.sh b/t/t5606-clone-options.sh
> index cf221e92c4d..27f9f776389 100755
> --- a/t/t5606-clone-options.sh
> +++ b/t/t5606-clone-options.sh
> @@ -4,6 +4,7 @@ test_description='basic clone options'
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup' '
> diff --git a/t/t5810-proto-disable-local.sh b/t/t5810-proto-disable-local.sh
> index c1ef99b85c2..862610256fb 100755
> --- a/t/t5810-proto-disable-local.sh
> +++ b/t/t5810-proto-disable-local.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='test disabling of local paths in clone/fetch'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  . "$TEST_DIRECTORY/lib-proto-disable.sh"
>
> diff --git a/t/t5813-proto-disable-ssh.sh b/t/t5813-proto-disable-ssh.sh
> index 3f084ee3065..2e975dc70ec 100755
> --- a/t/t5813-proto-disable-ssh.sh
> +++ b/t/t5813-proto-disable-ssh.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='test disabling of git-over-ssh in clone/fetch'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  . "$TEST_DIRECTORY/lib-proto-disable.sh"
>
> diff --git a/t/t7409-submodule-detached-work-tree.sh b/t/t7409-submodule-detached-work-tree.sh
> index 374ed481e9c..574a6fc526e 100755
> --- a/t/t7409-submodule-detached-work-tree.sh
> +++ b/t/t7409-submodule-detached-work-tree.sh
> @@ -13,6 +13,7 @@ TEST_NO_CREATE_REPO=1
>  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
>  export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup' '
> diff --git a/t/t7416-submodule-dash-url.sh b/t/t7416-submodule-dash-url.sh
> index 3ebd9859814..7cf72b9a076 100755
> --- a/t/t7416-submodule-dash-url.sh
> +++ b/t/t7416-submodule-dash-url.sh
> @@ -1,6 +1,8 @@
>  #!/bin/sh
>
>  test_description='check handling of disallowed .gitmodule urls'
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  test_expect_success 'setup' '
> diff --git a/t/t7450-bad-git-dotfiles.sh b/t/t7450-bad-git-dotfiles.sh
> index ba1f569bcbb..0d0c3f2c683 100755
> --- a/t/t7450-bad-git-dotfiles.sh
> +++ b/t/t7450-bad-git-dotfiles.sh
> @@ -12,6 +12,8 @@ Such as:
>
>    - symlinked .gitmodules, etc
>  '
> +
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>  . "$TEST_DIRECTORY"/lib-pack.sh
>
> --
> 2.39.0.1225.g30a3d88132d
>

^ permalink raw reply

* Re: [PATCH v5 18/19] receive-pack: free() the "ref_name" in "struct command"
From: Elijah Newren @ 2023-01-26  2:37 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-18.19-3c3d48df04b-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 4:54 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Fix a memory leak that's been with us since this code was introduced
> in 575f497456e (Add first cut at "git-receive-pack", 2005-06-29), see
> eb1af2df0b1 (git-receive-pack: start parsing ref update commands,
> 2005-06-29) for the later change that refactored the code to add the
> "ref_name" member.

This should be two sentences, not one.  s/), see/). See/ should do it.

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/receive-pack.c                 | 10 ++++++++++
>  t/t5405-send-pack-rewind.sh            |  1 +
>  t/t5406-remote-rejects.sh              |  1 +
>  t/t5507-remote-environment.sh          |  2 ++
>  t/t5522-pull-symlink.sh                |  1 +
>  t/t5527-fetch-odd-refs.sh              |  1 +
>  t/t5560-http-backend-noserver.sh       |  1 +
>  t/t5561-http-backend.sh                |  1 +
>  t/t5562-http-backend-content-length.sh |  2 ++
>  t/t5705-session-id-in-capabilities.sh  |  1 +
>  10 files changed, 21 insertions(+)
>
[...]

Code changes look good.

^ permalink raw reply

* Re: [PATCH v5 17/19] grep API: plug memory leaks by freeing "header_list"
From: Elijah Newren @ 2023-01-26  2:33 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-17.19-6a8f4a567aa-20230118T120334Z-avarab@gmail.com>

I had the same reaction to this patch that Rene did on v1.  The commit
message of the previous patch should probably be reworked so as to not
mislead.

On Wed, Jan 18, 2023 at 5:15 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> When the "header_list" struct member was added in [1] it wasn't made
> to free the list using loop added for the adjacent "pattern_list"
> member, see [2] for when we started freeing it.
>
> This makes e.g. this command leak-free when run on git.git:
>
>         ./git -P log -1 --color=always --author=A origin/master
>
> 1. 80235ba79ef ("log --author=me --grep=it" should find intersection,
>    not union, 2010-01-17)
> 2. b48fb5b6a95 (grep: free expressions and patterns when done.,
>    2006-09-27)

That paragraph is really hard to parse for me.  Maybe change the above
to something like

"""
When the "header_list" struct member was added in [1], freeing this
field was neglected.  Fix that now, so that commands like

    ./git -P log -1 --color=always --author=A origin/master

will run leak-free.

1. 80235ba79ef ("log --author=me --grep=it" should find intersection,
    not union, 2010-01-17)
"""

> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/grep.c b/grep.c
> index a4450df4559..c908535e0d8 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -795,6 +795,7 @@ static void free_grep_pat(struct grep_pat *pattern)
>  void free_grep_patterns(struct grep_opt *opt)
>  {
>         free_grep_pat(opt->pattern_list);
> +       free_grep_pat(opt->header_list);
>
>         if (opt->pattern_expression)
>                 free_pattern_expr(opt->pattern_expression);
> --
> 2.39.0.1225.g30a3d88132d

Actual code change looks good.

^ permalink raw reply

* Re: [PATCH v5 16/19] grep.c: refactor free_grep_patterns()
From: Elijah Newren @ 2023-01-26  2:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-16.19-10959760dfc-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 5:35 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Refactor the free_grep_patterns() function to split out the freeing of
> the "struct grep_pat" it contains, right now we're only freeing the
> "pattern_list", but we should be freeing another member of the same
> type, which we'll do in the subsequent commit.

s/contains, right/contains. Right/

> Let's also replace the "return" if we don't have an
> "opt->pattern_expression" with a conditional call of
> free_pattern_expr().
>
> Before db84376f981 (grep.c: remove "extended" in favor of
> "pattern_expression", fix segfault, 2022-10-11) the pattern here was:
>
>         if (!x)
>                 return;
>         free(y);
>
> But after the cleanup in db84376f981 (which was a narrow segfault fix,
> and thus avoided refactoring this) we ended up with:

For most of your commits, I like the extended history, but in this
case I think it's just a distraction.  I think I'd replace the above
block with just five words:

    While at it, instead of:

>         if (!x)
>                 return;
>         free(x);
>
> Let's instead do:
>
>         if (x)
>                 free(x);

This is slightly confusing, because "if(x) free(x)" can be compressed
to "free(x)".  I think you meant "free_pattern_expr" rather than
"free", which cannot (currently) be similarly compressed.

> This doesn't matter for the subsequent change, but as we're
> refactoring this function let's make it easier to reason about, and to
> extend in the future, i.e. if we start to free free() members that
> come after "pattern_expression" in the "struct grep_opt".

Perhaps just simplify this paragraph to:

This will make it easier to free additional members from
free_grep_patterns() in the future.


>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  grep.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/grep.c b/grep.c
> index 06eed694936..a4450df4559 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -769,11 +769,11 @@ static void free_pattern_expr(struct grep_expr *x)
>         free(x);
>  }
>
> -void free_grep_patterns(struct grep_opt *opt)
> +static void free_grep_pat(struct grep_pat *pattern)
>  {
>         struct grep_pat *p, *n;
>
> -       for (p = opt->pattern_list; p; p = n) {
> +       for (p = pattern; p; p = n) {
>                 n = p->next;
>                 switch (p->token) {
>                 case GREP_PATTERN: /* atom */
> @@ -790,10 +790,14 @@ void free_grep_patterns(struct grep_opt *opt)
>                 }
>                 free(p);
>         }
> +}
>
> -       if (!opt->pattern_expression)
> -               return;
> -       free_pattern_expr(opt->pattern_expression);
> +void free_grep_patterns(struct grep_opt *opt)
> +{
> +       free_grep_pat(opt->pattern_list);
> +
> +       if (opt->pattern_expression)
> +               free_pattern_expr(opt->pattern_expression);
>  }

I think this last function would read even better as:

+void free_grep_patterns(struct grep_opt *opt)
+{
+       free_grep_pat(opt->pattern_list);
+       free_pattern_expr(opt->pattern_expression);
 }

after modifying free_pattern_expr() to return early if passed a NULL argument.

^ permalink raw reply

* Re: [PATCH] bisect: fix "reset" when branch is checked out elsewhere
From: Rubén Justo @ 2023-01-26  2:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Phillip Wood
In-Reply-To: <xmqqo7qqovp1.fsf@gitster.g>

On 22-ene-2023 18:01:46, Junio C Hamano wrote:

> Rubén Justo <rjusto@gmail.com> writes:
> 
> > Since 1d0fa89 (checkout: add --ignore-other-wortrees, 2015-01-03) we
> > have a safety valve in checkout/switch to prevent the same branch from
> > being checked out simultaneously in multiple worktrees.
> >
> > If a branch is bisected in a worktree while also being checked out in
> > another worktree; when the bisection is finished, checking out the
> > branch back in the current worktree may fail.
> 
> True.  But we should explain why failing is a bad thing here.  After
> all, "git checkout foo" to check out a branch 'foo' that is being
> used in a different worktree linked to the same repository fails,
> and that is a GOOD thing.  Is the logic behind your "may fail and
> that is a bad thing" something like this?
> 
>     When "git bisect reset" goes back to the branch, it used to error
>     out if the same branch is checked out in a different worktree.
>     Since this can happen only after the end-user deliberately checked
>     out the branch twice, erroring out does not contribute to any
>     safety.
> 
> Having said that...
> 
> > @@ -245,7 +245,8 @@ static int bisect_reset(const char *commit)
> >  		struct child_process cmd = CHILD_PROCESS_INIT;
> >  
> >  		cmd.git_cmd = 1;
> > -		strvec_pushl(&cmd.args, "checkout", branch.buf, "--", NULL);
> > +		strvec_pushl(&cmd.args, "checkout", "--ignore-other-worktrees",
> > +				branch.buf, "--", NULL);
> 
> "git bisect reset" does take an arbitrary commit or branch name,
> which may not be the original branch the user was on.  If the
> user did not have any branch checked out twice, can they do
> something like
> 
>     $ git checkout foo
>     $ git bisect start HEAD HEAD~20
>     ... bisect session finds the first bad commit ...
>     $ git bisect reset bar
> 
> where 'foo' is checked out only in this worktree?  What happens if
> 'bar' has been checked out in a different worktree linked to the
> same repository while this bisect was going on?  The current code
> may fail due to the safety "checkout" has, but isn't that exactly
> what we want?  I.e. prevent 'bar' from being checked out twice by
> mistake?  Giving 'bar' on the command line of "bisect reset" is
> likely because the user wants to start working on that branch,
> without necessarily knowing that they already have a worktree that
> checked out the branch elsewhere---in other words, isn't that a lazy
> folks' shorthand for "git bisect reset && git checkout bar"?
> 
> If we loosen the safety only when bisect_reset() receives NULL to
> its commit parameter, i.e. we are going back to the original branch,
> the damage might be limited to narrower use cases, but I still am
> not sure if the loosening is worth it.
> 
> IIUC, the scenario that may be helped would go like this:
> 
>     ... another worktree has 'foo' checked out ...
>     $ git checkout --ignore-other-worktrees foo
>     $ git bisect start HEAD HEAD~20
>     ... bisect session finds the first bad commit ...
>     $ git bisect reset
> 
> The last step wants to go back to 'foo', and it may be more
> convenient if it did not fail to go back to the risky state the user
> originally created.  But doesn't the error message already tell us
> how to go back after this step refuses to recreate such a risky
> state?  It sugests "git bisect reset <commit>" to switch to a
> detached HEAD, so presumably, after seeing the above fail and
> reading the error message, the user could do
> 
>     $ git bisect reset foo^{commit}
> 
> to finish the bisect session and detach the head at 'foo', and then
> the "usual" mantra to recreate the risky state that 'foo' is checked
> out twice can be done, i.e.
> 
>     $ git checkout --ignore-other-worktrees foo
> 
> So, I am not sure if this is a good idea in general.
> 
> Or do I misunderstand why you think "checking out may fail" is a bad
> thing?

Maybe we make it fail for no reason.  Thank you for thinking about this with
depth.

I know you are aware, but for other readers;  In another thread, a bug-fix is
being reviewed that, if accepted, will affect "bisect".  With that fix, the
phrase "checking out may fail" will become "checking out will fail".  But does
it need to fail?

As you said, we want to reject normal check outs of a branch when that
branch is already checked out in other worktrees.  Because of this, sounds
reasonable to leave the implicit 'checkout' in 'bisect reset' to fail in those
cases, and just add some tests to notice if this changes in the future.

But, and this is what makes me think that "checking out will fail" is the wrong
choice for "bisect", while bisecting, with the worktree in a detached HEAD
state, the branch to which "bisect reset" will check out back (BISECT_START),
is still considered checked out in the working tree:

	$ git checkout -b work
	$ git bisect start HEAD HEAD~3
	... bisect detaches the current worktree ...
	$ git worktree add work
	Preparing worktree (checkout out 'work')
	fatal: 'work' is already checked out at ...

So, checking out back to the implicitly checked out branch sounds like it
should not fail.  And should be pretty secure (under the risk accepted by the
user) because we do not allow to normally check out the branch in another
worktree.  We even reject the checkout in the current worktree while bisecting,
but let's leave that for another time.

Note that due to the bug, what we are changing here is the reset on "second
worktrees".  That means, "bisect reset" on the main worktree has been allowed
for some time, we are just making it official.

And this is the less disrupting change in the usage.  Which does not justify
the change but does support it.

This is the reasoning I had and what makes me think that "checking out may
fail" is an inconvenience for the user, without any gain.

Now, if these arguments are reasonable, the next issue is what and how to
allow.  I chose at least strict, but maybe we can do something more
elaborate... just NULL sounds good.

^ permalink raw reply

* Re: [PATCH v5 14/19] builtin/merge.c: free "&buf" on "Your local changes..." error
From: Elijah Newren @ 2023-01-26  1:54 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-14.19-15e4b8db805-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 5:14 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Plug a memory leak introduced in [1], since that change didn't follow
> the "goto done" pattern introduced in [2] we'd leak the "&buf" memory.
>
> 1. e4cdfe84a0d (merge: abort if index does not match HEAD for trivial
>    merges, 2022-07-23)
> 2. d5a35c114ab (Copy resolve_ref() return value for longer use,
>    2011-11-13)
>
> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/merge.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 91dd5435c59..2b13124c497 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -1618,7 +1618,8 @@ int cmd_merge(int argc, const char **argv, const char *prefix)
>                                 error(_("Your local changes to the following files would be overwritten by merge:\n  %s"),
>                                       sb.buf);
>                                 strbuf_release(&sb);
> -                               return 2;
> +                               ret = 2;
> +                               goto done;
>                         }
>
>                         /* See if it is really trivial. */
> --
> 2.39.0.1225.g30a3d88132d

Thanks for fixing my bug!

^ permalink raw reply

* Re: [PATCH v5 02/19] bundle.c: don't leak the "args" in the "struct child_process"
From: Elijah Newren @ 2023-01-26  1:38 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-02.19-9eb758117dc-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 5:16 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Fix a leak that's been here since 7366096de9d (bundle API: change
> "flags" to be "extra_index_pack_args", 2021-09-05), if can't verify
> the bundle we didn't call child_process_clear() to clear the "args".

That's really hard to parse.  Perhaps:

Fix a leak that's been here since 7366096de9d (bundle API: change
"flags" to be "extra_index_pack_args", 2021-09-05).  If we can't verify
the bundle, we didn't call child_process_clear() to clear the "args".

> But rather than doing that let's verify the bundle before we start
> preparing the process we're going to spawn, if we get an error we
> don't need to push anything to the "args".

Also hard to parse.  Perhaps:

But rather than adding an additional child_process_clear() call, let's
verify the bundle before we start
preparing the process we're going to spawn.  If we fail to verify, we
don't need to push anything to the child_process "args".


> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  bundle.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/bundle.c b/bundle.c
> index 4ef7256aa11..9ebb10a8f72 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -627,6 +627,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
>              enum verify_bundle_flags flags)
>  {
>         struct child_process ip = CHILD_PROCESS_INIT;
> +
> +       if (verify_bundle(r, header, flags))
> +               return -1;
> +
>         strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
>
>         /* If there is a filter, then we need to create the promisor pack. */
> @@ -638,8 +642,6 @@ int unbundle(struct repository *r, struct bundle_header *header,
>                 strvec_clear(extra_index_pack_args);
>         }
>
> -       if (verify_bundle(r, header, flags))
> -               return -1;
>         ip.in = bundle_fd;
>         ip.no_stdout = 1;
>         ip.git_cmd = 1;
> --
> 2.39.0.1225.g30a3d88132d

^ permalink raw reply

* Re: [PATCH v5 07/19] repack: fix leaks on error with "goto cleanup"
From: Elijah Newren @ 2023-01-26  1:36 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, René Scharfe, Junio C Hamano, Eric Sunshine
In-Reply-To: <patch-v5-07.19-1fac90c306a-20230118T120334Z-avarab@gmail.com>

On Wed, Jan 18, 2023 at 5:10 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> Change cmd_repack() to "goto cleanup" rather than "return ret" on
> error, when we returned we'd potentially skip cleaning up the
> string_lists and other data we'd allocated in this function.

This is hard to parse; the comma followed by "when" suggests you are
only changing things under a certain set of conditions rather than
explaining why you are making an unconditional change.  Perhaps:

In cmd_repack() when we hit an error, replace "return ret" with "goto
cleanup" to ensure we free the necessary data structures.


> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
> ---
>  builtin/repack.c                    | 13 +++++++------
>  t/t6011-rev-list-with-bad-commit.sh |  1 +
>  2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/builtin/repack.c b/builtin/repack.c
> index c1402ad038f..f6493795318 100644
> --- a/builtin/repack.c
> +++ b/builtin/repack.c
> @@ -948,7 +948,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>
>         ret = start_command(&cmd);
>         if (ret)
> -               return ret;
> +               goto cleanup;
>
>         if (geometry) {
>                 FILE *in = xfdopen(cmd.in, "w");
> @@ -977,7 +977,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>         fclose(out);
>         ret = finish_command(&cmd);
>         if (ret)
> -               return ret;
> +               goto cleanup;
>
>         if (!names.nr && !po_args.quiet)
>                 printf_ln(_("Nothing new to pack."));
> @@ -1007,7 +1007,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>                                        &existing_nonkept_packs,
>                                        &existing_kept_packs);
>                 if (ret)
> -                       return ret;
> +                       goto cleanup;
>
>                 if (delete_redundant && expire_to) {
>                         /*
> @@ -1039,7 +1039,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>                                                &existing_nonkept_packs,
>                                                &existing_kept_packs);
>                         if (ret)
> -                               return ret;
> +                               goto cleanup;
>                 }
>         }
>
> @@ -1115,7 +1115,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>                 string_list_clear(&include, 0);
>
>                 if (ret)
> -                       return ret;
> +                       goto cleanup;
>         }
>
>         reprepare_packed_git(the_repository);
> @@ -1172,10 +1172,11 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
>                 write_midx_file(get_object_directory(), NULL, NULL, flags);
>         }
>
> +cleanup:
>         string_list_clear(&names, 1);
>         string_list_clear(&existing_nonkept_packs, 0);
>         string_list_clear(&existing_kept_packs, 0);
>         clear_pack_geometry(geometry);
>
> -       return 0;
> +       return ret;
>  }
> diff --git a/t/t6011-rev-list-with-bad-commit.sh b/t/t6011-rev-list-with-bad-commit.sh
> index bad02cf5b83..b2e422cf0f7 100755
> --- a/t/t6011-rev-list-with-bad-commit.sh
> +++ b/t/t6011-rev-list-with-bad-commit.sh
> @@ -2,6 +2,7 @@
>
>  test_description='git rev-list should notice bad commits'
>
> +TEST_PASSES_SANITIZE_LEAK=true
>  . ./test-lib.sh
>
>  # Note:
> --
> 2.39.0.1225.g30a3d88132d

Code changes look fine.

^ permalink raw reply

* Re: [PATCH v2] request-pull: filter out SSH/X.509 tag signatures
From: Junio C Hamano @ 2023-01-26  0:18 UTC (permalink / raw)
  To: Gwyneth Morgan; +Cc: git
In-Reply-To: <20230125234725.3918563-1-gwymor@tilde.club>

Gwyneth Morgan <gwymor@tilde.club> writes:

> git request-pull filters PGP signatures out of the tag message, but not
> SSH or X.509 signatures.
>
> Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
> ---
>  git-request-pull.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-request-pull.sh b/git-request-pull.sh
> index 2d0e44656c..01640a044b 100755
> --- a/git-request-pull.sh
> +++ b/git-request-pull.sh
> @@ -153,7 +153,7 @@ for you to fetch changes up to %H:
>  if test $(git cat-file -t "$head") = tag
>  then
>  	git cat-file tag "$head" |
> -	sed -n -e '1,/^$/d' -e '/^-----BEGIN PGP /q' -e p
> +	sed -n -e '1,/^$/d' -e '/^-----BEGIN \(PGP\|SSH\|SIGNED\) /q' -e p
>  	echo
>  	echo "----------------------------------------------------------------"
>  fi &&

Thanks, queued.

^ permalink raw reply

* [PATCH v2] request-pull: filter out SSH/X.509 tag signatures
From: Gwyneth Morgan @ 2023-01-25 23:47 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Gwyneth Morgan
In-Reply-To: <20230125230117.3915827-1-gwymor@tilde.club>

git request-pull filters PGP signatures out of the tag message, but not
SSH or X.509 signatures.

Signed-off-by: Gwyneth Morgan <gwymor@tilde.club>
---
 git-request-pull.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-request-pull.sh b/git-request-pull.sh
index 2d0e44656c..01640a044b 100755
--- a/git-request-pull.sh
+++ b/git-request-pull.sh
@@ -153,7 +153,7 @@ for you to fetch changes up to %H:
 if test $(git cat-file -t "$head") = tag
 then
 	git cat-file tag "$head" |
-	sed -n -e '1,/^$/d' -e '/^-----BEGIN PGP /q' -e p
+	sed -n -e '1,/^$/d' -e '/^-----BEGIN \(PGP\|SSH\|SIGNED\) /q' -e p
 	echo
 	echo "----------------------------------------------------------------"
 fi &&

^ permalink raw reply related

* Re: [PATCH] request-pull: filter out SSH/X.509 tag signatures
From: Gwyneth Morgan @ 2023-01-25 23:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq8rhqdwxl.fsf@gitster.g>

On 2023-01-25 15:19:34-0800, Junio C Hamano wrote:
> Please sign-off your contribution. 
> cf.  Documentation/SubmittingPatches[[sign-off]]

Oops! I will resend with a sign-off.

> >  git-request-pull.sh | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/git-request-pull.sh b/git-request-pull.sh
> > index 2d0e44656c..01640a044b 100755
> > --- a/git-request-pull.sh
> > +++ b/git-request-pull.sh
> > @@ -153,7 +153,7 @@ for you to fetch changes up to %H:
> >  if test $(git cat-file -t "$head") = tag
> >  then
> >  	git cat-file tag "$head" |
> > -	sed -n -e '1,/^$/d' -e '/^-----BEGIN PGP /q' -e p
> > +	sed -n -e '1,/^$/d' -e '/^-----BEGIN \(PGP\|SSH\|SIGNED\) /q' -e p
> 
> This makes readers debate themselves if being more specific and
> narrow like the posted patch is safer and better, or making it
> looser by just requiring "^-----BEGIN " and making it forward
> looking is sufficient and maintainable.

I could imagine someone having a tag with a line starting that way (not
realizing it's a common pattern for signatures to take) and being
confused at why it's being removed. The likelihood of someone doing
that, and using request-pull with that tag, is pretty low though, so I
don't have a strong preference.

^ permalink raw reply

* Re: [PATCH] request-pull: filter out SSH/X.509 tag signatures
From: Junio C Hamano @ 2023-01-25 23:19 UTC (permalink / raw)
  To: Gwyneth Morgan; +Cc: git
In-Reply-To: <20230125230117.3915827-1-gwymor@tilde.club>

Gwyneth Morgan <gwymor@tilde.club> writes:

> git request-pull filters PGP signatures out of the tag message, but not
> SSH or X.509 signatures.
> ---

Please sign-off your contribution. 
cf.  Documentation/SubmittingPatches[[sign-off]]

>  git-request-pull.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/git-request-pull.sh b/git-request-pull.sh
> index 2d0e44656c..01640a044b 100755
> --- a/git-request-pull.sh
> +++ b/git-request-pull.sh
> @@ -153,7 +153,7 @@ for you to fetch changes up to %H:
>  if test $(git cat-file -t "$head") = tag
>  then
>  	git cat-file tag "$head" |
> -	sed -n -e '1,/^$/d' -e '/^-----BEGIN PGP /q' -e p
> +	sed -n -e '1,/^$/d' -e '/^-----BEGIN \(PGP\|SSH\|SIGNED\) /q' -e p

This makes readers debate themselves if being more specific and
narrow like the posted patch is safer and better, or making it
looser by just requiring "^-----BEGIN " and making it forward
looking is sufficient and maintainable.

If this were signed-off already, I would have said "let's queue it
as-is, while waiting for input from others", but without a sign-off
I am not queuing (yet).

Thanks.

^ permalink raw reply

* [PATCH] request-pull: filter out SSH/X.509 tag signatures
From: Gwyneth Morgan @ 2023-01-25 23:01 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Gwyneth Morgan

git request-pull filters PGP signatures out of the tag message, but not
SSH or X.509 signatures.
---
 git-request-pull.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/git-request-pull.sh b/git-request-pull.sh
index 2d0e44656c..01640a044b 100755
--- a/git-request-pull.sh
+++ b/git-request-pull.sh
@@ -153,7 +153,7 @@ for you to fetch changes up to %H:
 if test $(git cat-file -t "$head") = tag
 then
 	git cat-file tag "$head" |
-	sed -n -e '1,/^$/d' -e '/^-----BEGIN PGP /q' -e p
+	sed -n -e '1,/^$/d' -e '/^-----BEGIN \(PGP\|SSH\|SIGNED\) /q' -e p
 	echo
 	echo "----------------------------------------------------------------"
 fi &&

^ permalink raw reply related

* Re: [PATCH v2] ssh signing: better error message when key not in agent
From: Junio C Hamano @ 2023-01-25 22:22 UTC (permalink / raw)
  To: Eric Sunshine
  Cc: Adam Szkoda, Adam Szkoda via GitGitGadget, git, Phillip Wood,
	Fabian Stelzer
In-Reply-To: <CAPig+cTNk1RvfdFembKcxTOUs0UsiXmz8rsEnab-0fQp-QE3Lg@mail.gmail.com>

Eric Sunshine <sunshine@sunshineco.com> writes:

> On Wed, Jan 25, 2023 at 8:05 AM Adam Szkoda <adaszko@gmail.com> wrote:
>> On Tue, Jan 24, 2023 at 6:52 PM Junio C Hamano <gitster@pobox.com> wrote:
>> > > Range-diff vs v1:
>> > >
>> > >  1:  0ce06076242 < -:  ----------- ssh signing: better error message when key not in agent
>> > >  -:  ----------- > 1:  03dfca79387 ssh signing: better error message when key not in agent
>> >
>> > This is a fairly useless range-diff.
>> >
>> > Even when a range-diff shows the differences in the patches,
>> > mechanically generated range-diff can only show _what_ changed.  It
>> > is helpful to explain the changes in your own words to highlight
>> > _why_ such changes are done, and this place after the "---" line
>> > and the diffstat we see below is the place to do so.
>> >
>> > Does GitGitGadget allow its users to describe the differences since
>> > the previous iteration yourself?
>>
>> No, I don't think it does.   It got generated automatically without
>> giving me an opportunity to edit.
>
> Yes, the user can describe the differences since the previous
> iteration by editing the pull-request's description. Specifically,
> when ready to send a new iteration:
>
> (1) force push the new iteration to the same branch on GitHub
>
> (2) edit the pull-request description; this is the very first
> "comment" on the pull-request page; press the "..." button on that
> comment and choose the "Edit" menu item; revise the text to describe
> the changes since the previous revision, then press the "Update
> comment" button to save
>
> (3) post a "/submit" comment to the pull-request to tell GitGitGadget
> to send the new revision to the Git mailing list

Thanks.  I thought the above would make a good addition to our
documentation set.  Documentation/MyFirstContribution.txt does have
this to say:

    Once you have your branch again in the shape you want following all review
    comments, you can submit again:

    ----
    $ git push -f remotename psuh
    ----

    Next, go look at your pull request against GitGitGadget; you should see the CI
    has been kicked off again. Now while the CI is running is a good time for you
    to modify your description at the top of the pull request thread; it will be
    used again as the cover letter. You should use this space to describe what
    has changed since your previous version, so that your reviewers have some idea
    of what they're looking at. When the CI is done running, you can comment once
    more with `/submit` - GitGitGadget will automatically add a v2 mark to your
    changes.

before it talks about doing the "/submit" again.  Expanding the
above into a bulletted list form like you did might make it easier
to follow through, perhaps?  I dunno.



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox