git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Junio C Hamano <gitster@pobox.com>
To: Han Xin <hanxin.hx@bytedance.com>
Cc: git@vger.kernel.org, xingxin.xx@bytedance.com,
	jonathantanmy@google.com, derrickstolee@github.com
Subject: Re: [PATCH v3 1/2] negotiator/default: avoid stack overflow
Date: Wed, 26 Apr 2023 10:14:20 -0700	[thread overview]
Message-ID: <xmqqedo6tvkj.fsf@gitster.g> (raw)
In-Reply-To: <0e69d70805e6da684e6e17642a1cf0d59a03dfc0.1682513384.git.hanxin.hx@bytedance.com> (Han Xin's message of "Wed, 26 Apr 2023 21:15:03 +0800")

Han Xin <hanxin.hx@bytedance.com> writes:

> mark_common() in negotiator/default.c may overflow the stack due to
> recursive function calls. Avoid this by instead recursing using a
> heap-allocated data structure.
>
> This is the same case as 4654134976f (negotiator/skipping: avoid
> stack overflow, 2022-10-25)
>
> Reported-by: Xin Xing <xingxin.xx@bytedance.com>
> Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
> ---
>  negotiator/default.c | 39 +++++++++++++++++++++++++++++----------
>  1 file changed, 29 insertions(+), 10 deletions(-)
>
> diff --git a/negotiator/default.c b/negotiator/default.c
> index f4b78eb47d..635cdd6483 100644
> --- a/negotiator/default.c
> +++ b/negotiator/default.c
> @@ -55,30 +55,49 @@ static int clear_marks(const char *refname, const struct object_id *oid,
>  static void mark_common(struct negotiation_state *ns, struct commit *commit,
>  		int ancestors_only, int dont_parse)
>  {
> -	if (commit != NULL && !(commit->object.flags & COMMON)) {
> -		struct object *o = (struct object *)commit;
> +	struct prio_queue queue = { NULL };
> +
> +	if (!commit || (commit->object.flags & COMMON))
> +		return;

The original naive recursive marker had a large if block guarded by
the opposite condition around the whole thing, which amounts to the
same as this early return.  Good.

> +	prio_queue_put(&queue, commit);

And the code now uses on-stack priority queue here, and bootstraps
the machinery by placing the first element here.  OK.

> +	if (!ancestors_only) {
> +		commit->object.flags |= COMMON;
>  
> -		if (!ancestors_only)
> -			o->flags |= COMMON;

These two are equivalent, which is good.

> +		if ((commit->object.flags & SEEN) && !(commit->object.flags & POPPED))
> +			ns->non_common_revs--;

Hmph, this is a bit unexpected to duplicate the non_common_revs
counting logic here.  In the original, this piece of code was there
just after we decided to continue digging into the parents, and even
if this patch changes the mechanism with which "digging into the
parents" from recursion to priority queue, it is not obvious why we
can keep doing the decrementing for the current commit we are
looking at, instead of doing that for parents of the commit like
this patch does.  In other words, it is not clear why it needs to be
changed while going from recursive to iterative.

Is it because ancestors_only is not usable inside the loop in the
iterative version?  That is, if ancestors_only is not set, we do
paint the initial commit as COMMON just as the parents we discover
in the loop, but when ancestors_only is set, we need to skip painting
the initial commit as COMMON, so the patch moves that logic?

It may solve the issue of special casing the initial commit, but it
feels backwards in that the resulting loop becomes harder to
understand by making it necessary to process the initial commit
outside the loop only halfway.

It may make it easier to understand if we had another local
variable, "struct commit *skip_commit", that is NULL by default but
is set to the initial commit when ancestors_only is set, and do the
painting with COMMON and counting of non_common_revs all inside the
loop for the current commit that is being processed (instead of the
parents the loop discovered).  I dunno.  It would avoid duplicating
the logic and implements the "ancestors_only, do not paint or count
the initial commit" in a more readable and straight-forward way, no?

Thanks.

> +	}
> +	while ((commit = prio_queue_get(&queue))) {
> +		struct object *o = (struct object *)commit;
>  
>  		if (!(o->flags & SEEN))
>  			rev_list_push(ns, commit, SEEN);
>  		else {
>  			struct commit_list *parents;
>  
> -			if (!ancestors_only && !(o->flags & POPPED))
> -				ns->non_common_revs--;
>  			if (!o->parsed && !dont_parse)
>  				if (repo_parse_commit(the_repository, commit))
> -					return;
> +					continue;
>  
>  			for (parents = commit->parents;
>  					parents;
> -					parents = parents->next)
> -				mark_common(ns, parents->item, 0,
> -					    dont_parse);
> +					parents = parents->next) {
> +				struct commit *p = parents->item;
> +
> +				if (p->object.flags & COMMON)
> +					continue;
> +
> +				p->object.flags |= COMMON;
> +
> +				if ((p->object.flags & SEEN) && !(p->object.flags & POPPED))
> +					ns->non_common_revs--;
> +
> +				prio_queue_put(&queue, parents->item);
> +			}
>  		}
>  	}
> +
> +	clear_prio_queue(&queue);
>  }
>  
>  /*

  reply	other threads:[~2023-04-26 17:14 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-24  2:23 [PATCH v1] negotiator/default.c: avoid stack overflow Han Xin
2023-04-24 14:44 ` Derrick Stolee
2023-04-25  3:02   ` [External] " Han Xin
2023-04-25 13:34     ` Derrick Stolee
2023-04-26  4:05 ` [PATCH v2 0/2] negotiator/default: " Han Xin
2023-04-26  4:05   ` [PATCH v2 1/2] " Han Xin
2023-04-26 11:13     ` Derrick Stolee
2023-04-26 11:40       ` [External] " Han Xin
2023-04-26  4:05   ` [PATCH v2 2/2] negotiator/skipping: fix some problems in mark_common() Han Xin
2023-04-26 11:08     ` Derrick Stolee
2023-04-26 11:55       ` [External] " Han Xin
2023-04-26 13:15   ` [PATCH v2 0/2] negotiator/default: avoid stack overflow Han Xin
2023-04-26 13:15     ` [PATCH v3 1/2] " Han Xin
2023-04-26 17:14       ` Junio C Hamano [this message]
2023-04-26 17:30         ` Derrick Stolee
2023-04-26 17:38           ` Junio C Hamano
2023-04-26 13:15     ` [PATCH v3 2/2] negotiator/skipping: fix some problems in mark_common() Han Xin
2023-05-01 22:11     ` [PATCH v2 0/2] negotiator/default: avoid stack overflow Junio C Hamano
2023-05-02  1:49       ` Derrick Stolee
2023-05-02 15:51         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqedo6tvkj.fsf@gitster.g \
    --to=gitster@pobox.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=hanxin.hx@bytedance.com \
    --cc=jonathantanmy@google.com \
    --cc=xingxin.xx@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).