From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: Git List <git@vger.kernel.org>,
	Junio C Hamano <gitster@pobox.com>,
	Wink Saville <wink@saville.com>
Subject: Re: [PATCH 02/14] combine-diff: add combine_diff_path_new()
Date: Mon, 13 Jan 2025 16:40:19 +0100
Message-ID: <Z4Uz43eByZHqW8UK@pks.im>
In-Reply-To: <20250109083236.GB2748836@coredump.intra.peff.net>

On Thu, Jan 09, 2025 at 03:32:36AM -0500, Jeff King wrote:
> The combine_diff_path struct has variable size, since it embeds both the
> memory allocation for the path field as well as a variable-sized parent
> array. This makes allocating one a bit tricky.
> 
> We have a helper to compute the required size, but it's up to individual
> sites to actually initialize all of the fields. Let's provide a
> constructor function to make that a little nicer. Besides being shorter,
> it also hides away tricky bits like the computation of the "path"
> pointer (which is right after the "parent" flex array).
> 
> As a bonus, using the same constructor everywhere means that we'll
> consistently initialize all parts of the struct. A few code paths left
> the parent array uninitialized. This didn't cause any bugs, but we'll be
> able to simplify some code in the next few patches knowing that the
> parent fields have all been zero'd.
> 
> This also gets rid of some questionable uses of "int" to store buffer
> lengths. Though we do use them to allocate, I don't think there are any
> integer overflow vulnerabilities here (the allocation helper promotes
> them to size_t and checks arithmetic for overflow, and the actual memcpy
> of the bytes is done using the possibly-truncated "int" value).
> 
> Sadly we can't use the FLEX_* macros to simplify the allocation here,
> because there are two variable-sized parts to the struct (and those
> macros only handle one).
> 
> Nor can we get stop publicly declaring combine_diff_path_size(). This

s/we get stop/we stop/

> diff --git a/combine-diff.c b/combine-diff.c
> index 641bc92dbd..45548fd438 100644
> --- a/combine-diff.c
> +++ b/combine-diff.c
> @@ -47,22 +47,13 @@ static struct combine_diff_path *intersect_paths(
>  
>  	if (!n) {
>  		for (i = 0; i < q->nr; i++) {
> -			int len;
> -			const char *path;
>  			if (diff_unmodified_pair(q->queue[i]))
>  				continue;
> -			path = q->queue[i]->two->path;
> -			len = strlen(path);
> -			p = xmalloc(combine_diff_path_size(num_parent, len));
> -			p->path = (char *) &(p->parent[num_parent]);
> -			memcpy(p->path, path, len);
> -			p->path[len] = 0;
> -			p->next = NULL;
> -			memset(p->parent, 0,
> -			       sizeof(p->parent[0]) * num_parent);
> -
> -			oidcpy(&p->oid, &q->queue[i]->two->oid);
> -			p->mode = q->queue[i]->two->mode;
> +			p = combine_diff_path_new(q->queue[i]->two->path,
> +						  strlen(q->queue[i]->two->path),
> +						  q->queue[i]->two->mode,
> +						  &q->queue[i]->two->oid,
> +						  num_parent);
>  			oidcpy(&p->parent[n].oid, &q->queue[i]->one->oid);
>  			p->parent[n].mode = q->queue[i]->one->mode;
>  			p->parent[n].status = q->queue[i]->status;
> @@ -1667,3 +1658,24 @@ void diff_tree_combined_merge(const struct commit *commit,
>  	diff_tree_combined(&commit->object.oid, &parents, rev);
>  	oid_array_clear(&parents);
>  }
> +
> +struct combine_diff_path *combine_diff_path_new(const char *path,
> +						size_t path_len,
> +						unsigned int mode,
> +						const struct object_id *oid,
> +						size_t num_parents)
> +{
> +	struct combine_diff_path *p;
> +
> +	p = xmalloc(combine_diff_path_size(num_parents, path_len));
> +	p->path = (char *)&(p->parent[num_parents]);
> +	memcpy(p->path, path, path_len);
> +	p->path[path_len] = 0;
> +	p->next = NULL;
> +	p->mode = mode;
> +	oidcpy(&p->oid, oid);
> +
> +	memset(p->parent, 0, sizeof(p->parent[0]) * num_parents);
> +
> +	return p;
> +}

If I were to write this anew I'd probably use `xcalloc()` instead of
manually `memset()`ing parts of it to zero. But it's a faithful
transplant of the code from `intersect_paths()`, so that's probably
okay.
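
To illustrate the `xcalloc()` idea, here is a minimal sketch, not the
actual patch. It uses plain `calloc()` and hypothetical, simplified
stand-ins (`demo_parent`, `demo_path`, `demo_path_new()`) rather than
git's real structs and helpers, and it assumes that an all-zero-bits
pointer is NULL, as git does elsewhere. Since the whole block starts out
zeroed, the explicit `memset()`, the `next = NULL` assignment and the
NUL terminator all go away:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for git's structs. */
struct demo_parent {
	unsigned int mode;
	char status;
};

struct demo_path {
	struct demo_path *next;
	char *path;			/* points just past parent[] */
	unsigned int mode;
	struct demo_parent parent[];	/* flex array, num_parents entries */
};

/*
 * Allocate header + parent array + path bytes + NUL in one zeroed block.
 * Returns NULL on size overflow or allocation failure (git's xcalloc()
 * would die instead of returning NULL).
 */
static struct demo_path *demo_path_new(const char *path, size_t path_len,
				       unsigned int mode, size_t num_parents)
{
	struct demo_path *p;
	size_t size = sizeof(*p);

	/* Check each step of the size arithmetic for overflow. */
	if (num_parents > (SIZE_MAX - size) / sizeof(p->parent[0]))
		return NULL;
	size += num_parents * sizeof(p->parent[0]);
	if (path_len >= SIZE_MAX - size)
		return NULL;
	size += path_len + 1;

	p = calloc(1, size);
	if (!p)
		return NULL;

	/* The path storage starts right after the last parent entry. */
	p->path = (char *)&p->parent[num_parents];
	memcpy(p->path, path, path_len);
	p->mode = mode;
	/* next, parent[] and the trailing NUL are already zero. */
	return p;
}
```

A caller would do something like `demo_path_new(name, strlen(name),
mode, num_parents)` and release the result with a single `free()`. The
trade-off is that the path bytes get zeroed and then immediately
overwritten, which is a small cost for not having to remember which
fields need clearing.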

Patrick

Thread overview: 38+ messages
2025-01-03 19:28 [BUGREPORT] git diff-tree --cc SEGFAUTs Wink Saville
2025-01-03 20:46 ` Jeff King
2025-01-03 23:34   ` Wink Saville
2025-01-04  0:31     ` Jeff King
2025-01-04  2:55       ` Junio C Hamano
2025-01-04  3:32         ` Jeff King
2025-01-04 18:09           ` Wink Saville
2025-01-05 22:13             ` Wink Saville
2025-01-09  8:27           ` [PATCH 0/14] combine-diff cleanups Jeff King
2025-01-09  8:28             ` [PATCH 01/14] run_diff_files(): delay allocation of combine_diff_path Jeff King
2025-01-09 17:57               ` Junio C Hamano
2025-01-09  8:32             ` [PATCH 02/14] combine-diff: add combine_diff_path_new() Jeff King
2025-01-09 18:05               ` Junio C Hamano
2025-01-13 15:40               ` Patrick Steinhardt [this message]
2025-01-14  9:29                 ` Jeff King
2025-01-09  8:33             ` [PATCH 03/14] tree-diff: clear parent array in path_appendnew() Jeff King
2025-01-09 18:28               ` Junio C Hamano
2025-01-10 10:54                 ` Jeff King
2025-01-09  8:42             ` [PATCH 04/14] combine-diff: use pointer for parent paths Jeff King
2025-01-09 18:49               ` Junio C Hamano
2025-01-09  8:42             ` [PATCH 05/14] diff: add a comment about combine_diff_path.parent.path Jeff King
2025-01-13 15:40               ` Patrick Steinhardt
2025-01-09  8:44             ` [PATCH 06/14] run_diff_files(): de-mystify the size of combine_diff_path struct Jeff King
2025-01-10 16:40               ` Junio C Hamano
2025-01-09  8:46             ` [PATCH 07/14] tree-diff: drop path_appendnew() alloc optimization Jeff King
2025-01-13 15:40               ` Patrick Steinhardt
2025-01-14 10:30                 ` Jeff King
2025-01-09  8:49             ` [PATCH 08/14] tree-diff: pass whole path string to path_appendnew() Jeff King
2025-01-13 15:40               ` Patrick Steinhardt
2025-01-14  9:26                 ` Jeff King
2025-01-09  8:49             ` [PATCH 09/14] tree-diff: inline path_appendnew() Jeff King
2025-01-11  0:41               ` Junio C Hamano
2025-01-09  8:50             ` [PATCH 10/14] combine-diff: drop public declaration of combine_diff_path_size() Jeff King
2025-01-09  8:51             ` [PATCH 11/14] tree-diff: drop list-tail argument to diff_tree_paths() Jeff King
2025-01-18  0:33               ` Junio C Hamano
2025-01-09  8:53             ` [PATCH 12/14] tree-diff: use the name "tail" to refer to list tail Jeff King
2025-01-09  8:54             ` [PATCH 13/14] tree-diff: simplify emit_path() list management Jeff King
2025-01-09  8:57             ` [PATCH 14/14] tree-diff: make list tail-passing more explicit Jeff King
