From: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v7 1/3] push: add reflog check for "--force-if-includes"
Date: Sun, 27 Sep 2020 17:57:45 +0530 [thread overview]
Message-ID: <20200927122745.GA6641@mail.clickyotomy.dev> (raw)
In-Reply-To: <xmqqwo0ggl65.fsf@gitster.c.googlers.com>
Hi Junio,
On 09/26/2020 16:42, Junio C Hamano wrote:
> Srinidhi Kaushik <shrinidhi.kaushik@gmail.com> writes:
>
> > @@ -2252,11 +2263,11 @@ int is_empty_cas(const struct push_cas_option *cas)
> > /*
> > * Look at remote.fetch refspec and see if we have a remote
> > * tracking branch for the refname there. Fill its current
> > - * value in sha1[].
> > + * value in sha1[], and as a string.
>
> I think the array being referred to was renamed to oid[] sometime
> ago. "and as a string" makes it sound as if sha1[] gets the value
> as 40-hex object name text, but that is not what is being done.
>
> Fill the name of the remote-tracking branch in *dst_refname,
> and the name of the commit object at tis tip in oid[].
>
> perhaps?
Of course, that sounds better; will update.
> > + * The struct "reflog_commit_list" and related helper functions
> > + * for list manipulation are used for collecting commits into a
> > + * list during reflog traversals in "if_exists_or_grab_until()".
>
> Has the name of that function changed since this comment was
> written?
Heh, it sure has. It should have been "check_and_collect_until()".
> > + */
> > +struct reflog_commit_list {
> > + struct commit **items;
>
> Name an array in singular when its primary use is to work on an
> element at a time---that will let you say item[4] to call the 4-th
> item, instead of items[4] that smells awkward.
>
> An array that is used mostly to pass around a collection as a whole
> is easier to think about when given a plural name, though.
Yup.
> > +
> > +/* Get the timestamp of the latest entry. */
> > +static int peek_reflog(struct object_id *o_oid, struct object_id *n_oid,
> > + const char *ident, timestamp_t timestamp,
> > + int tz, const char *message, void *cb_data)
> > +{
> > + timestamp_t *ts = cb_data;
> > + *ts = timestamp;
> > + return 1;
> > +}
>
> The idea is to use a callback that immediately says "no more" to
> grab the data from the first item in the iteration. It feels
> somewhat awkward but because there is no "give us the Nth entry" API
> function, it is the cleanest way we can do this.
I considered using "grab_1st_entry_timestamp()" briefy, but
"peek_reflog" is shorter compared to that.
> > + /* Look-up the commit and append it to the list. */
> > + if ((commit = lookup_commit_reference(the_repository, n_oid)))
> > + add_commit(cb->local_commits, commit);
>
> This is merely a minor naming thing, but if you rename add_commit()
> to append_commit(), you probably do not even need the comment before
> this statement.
Will do.
> > return 0;
> > }
> >
> > +#define MERGE_BASES_BATCH_SIZE 8
>
> Hmph. Do we still need batching?
>
> > +/*
> > + * Iterate through the reflog of the local ref to check if there is an entry
> > + * for the given remote-tracking ref; runs until the timestamp of an entry is
> > + * older than latest timestamp of remote-tracking ref's reflog. Any commits
> > + * are that seen along the way are collected into a list to check if the
> > + * remote-tracking ref is reachable from any of them.
> > + */
> > +static int is_reachable_in_reflog(const char *local, const struct ref *remote)
> > +{
> > + timestamp_t date;
> > + struct commit *commit;
> > + struct commit **chunk;
> > + struct check_and_collect_until_cb_data cb;
> > + struct reflog_commit_list list = { NULL, 0, 0 };
> > + size_t count = 0, batch_size = 0;
> > + int ret = 0;
> > +
> > + commit = lookup_commit_reference(the_repository, &remote->old_oid);
> > + if (!commit)
> > + goto cleanup_return;
> > +
> > + /*
> > + * Get the timestamp from the latest entry
> > + * of the remote-tracking ref's reflog.
> > + */
> > + for_each_reflog_ent_reverse(remote->tracking_ref, peek_reflog, &date);
> > +
> > + cb.remote_commit = commit;
> > + cb.local_commits = &list;
> > + cb.remote_reflog_timestamp = date;
> > + ret = for_each_reflog_ent_reverse(local, check_and_collect_until, &cb);
> > +
> > + /* We found an entry in the reflog. */
> > + if (ret > 0)
> > + goto cleanup_return;
>
> Good. So '1' from the callback is "we found one, no need to look
> further and no need to do merge-base", and '-1' from the callback is
> "we looked at all entries that are young enough to matter and we
> didn't find exact match". Makes sense.
>
> > + /*
> > + * Check if the remote commit is reachable from any
> > + * of the commits in the collected list, in batches.
> > + */
>
> I do not know if batching would help (have you measured it?), but if
> we were to batch, it is more common to arrange the loop like this:
>
> for (chunk = list.items;
> chunk < list.items + list.nr;
> chunk += size) {
> size = list.items + list.nr - chunk;
> if (MERGE_BASES_BATCH_SIZE < size)
> size = MERGE_BASES_BATCH_SIZE;
> ... use chunk[0..size] ...
> chunk += size;
> }
>
> That is, assume that we can grab everything during this round, and
> if that bites off too many, clamp it to the maximum value. If you
> are not comfortable with pointer arithmetic, it is also fine to use
> an auxiliary variable 'count', but ...
Actually, the "for" version looks much cleaner and avoids the use
of "count". However, I think ...
> chunk += size;
... should be skipped because "for ( ... ; chunk += size)" is already
doing it for us; otherwise we would offset 16 entries instead of 8
per iteration, no?
> > + chunk = list.items;
> > + while (count < list.nr) {
> > + batch_size = MERGE_BASES_BATCH_SIZE;
> > +
> > + /* For any leftover entries. */
> > + if ((count + MERGE_BASES_BATCH_SIZE) > list.nr)
> > + batch_size = list.nr - count;
> > +
> > + if ((ret = in_merge_bases_many(commit, batch_size, chunk)))
> > + break;
> > +
> > + chunk += batch_size;
> > + count += MERGE_BASES_BATCH_SIZE;
>
> ... you are risking chunk and count to go out of sync here.
>
> It does not matter within this loop (count will point beyond the end
> of list.item[] while chunk will never go past the array), but future
> developers can be confused into thinking that they can use chunk and
> count interchangeably after this loop exits, and at that point the
> discrepancy may start to matter.
I agree, it should have been "count += batch_size;". But, I think the
"for" version looks cleaner; I will change it to that the next set.
> But all of the above matters if it is a good idea to batch. Does it
> make a difference?
>
> ... goes and looks at in_merge_bases_many() ...
>
> Ah, it probably would.
>
> I thought in_merge_bases_many() would stop early as soon as any of
> the traversal from chunk[] reaches commit, but it uses a rather more
> generic paint_down_to_common() so extra items in chunk[] that are
> topologically older than commit would result in additional traversal
> from commit down to them, which would not contribute much to the end
> result. It may be a good #leftovebit idea for future improvement to
> teach in_merge_bases_many() to use a custom replacement for
> paint_down_to_common() that stops early as soon as we find the
> answer is true.
If we consider the amount of time it takes when "in_merge_bases_many()"
has to be run for all the entries, there isn't much of a difference in
performance between batching and non-batching -- they took about the
same. But, as you said if the remote is reachable in the first few
entries, batching would help with returning early if a descendant is
found.
Making the function stop early when a descendent is found
does sound like a good #leftoverbits idea. :)
Thanks again, for a detailed review.
--
Srinidhi Kaushik
next prev parent reply other threads:[~2020-09-27 12:27 UTC|newest]
Thread overview: 120+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-04 18:51 [PATCH] push: make `--force-with-lease[=<ref>]` safer Srinidhi Kaushik
2020-09-07 15:23 ` Phillip Wood
2020-09-08 15:48 ` Srinidhi Kaushik
2020-09-07 16:14 ` Junio C Hamano
2020-09-08 16:00 ` Srinidhi Kaushik
2020-09-08 21:00 ` Junio C Hamano
2020-09-07 19:45 ` Johannes Schindelin
2020-09-08 15:58 ` Junio C Hamano
2020-09-09 3:40 ` Johannes Schindelin
2020-09-08 16:59 ` Srinidhi Kaushik
2020-09-16 11:55 ` Johannes Schindelin
2020-09-08 19:34 ` Junio C Hamano
2020-09-09 3:44 ` Johannes Schindelin
2020-09-10 10:22 ` Johannes Schindelin
2020-09-10 14:44 ` Srinidhi Kaushik
2020-09-11 22:16 ` Johannes Schindelin
2020-09-14 11:06 ` Srinidhi Kaushik
2020-09-14 20:08 ` Junio C Hamano
2020-09-16 5:31 ` Srinidhi Kaushik
2020-09-16 10:20 ` Johannes Schindelin
2020-09-19 17:48 ` Junio C Hamano
2020-09-10 14:46 ` Junio C Hamano
2020-09-11 22:17 ` Johannes Schindelin
2020-09-14 20:07 ` Junio C Hamano
2020-09-12 15:04 ` [PATCH v2 0/2] push: make "--force-with-lease" safer Srinidhi Kaushik
2020-09-12 15:04 ` [PATCH v2 1/2] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-12 18:20 ` Junio C Hamano
2020-09-12 21:25 ` Srinidhi Kaushik
2020-09-12 15:04 ` [PATCH v2 2/2] push: enable "forceIfIncludesWithLease" by default Srinidhi Kaushik
2020-09-12 18:22 ` Junio C Hamano
2020-09-12 18:15 ` [PATCH v2 0/2] push: make "--force-with-lease" safer Junio C Hamano
2020-09-12 21:03 ` Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 0/7] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 1/7] remote: add reflog check for "force-if-includes" Srinidhi Kaushik
2020-09-14 20:17 ` Junio C Hamano
2020-09-16 10:51 ` Srinidhi Kaushik
2020-09-14 20:31 ` Junio C Hamano
2020-09-14 21:13 ` Junio C Hamano
2020-09-16 12:35 ` Johannes Schindelin
2020-09-19 17:01 ` Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 2/7] transport: add flag for "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 3/7] send-pack: check ref status for "force-if-includes" Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 4/7] transport-helper: update " Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 5/7] builtin/push: add option "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-16 12:36 ` Johannes Schindelin
2020-09-13 14:54 ` [PATCH v3 6/7] doc: add reference for "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-14 21:01 ` Junio C Hamano
2020-09-16 5:35 ` Srinidhi Kaushik
2020-09-13 14:54 ` [PATCH v3 7/7] t: add tests for "force-if-includes" Srinidhi Kaushik
2020-09-16 12:47 ` [PATCH v3 0/7] push: add "--[no-]force-if-includes" Johannes Schindelin
2020-09-19 17:03 ` [PATCH v4 0/3] " Srinidhi Kaushik
2020-09-19 17:03 ` [PATCH v4 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-09-19 20:03 ` Junio C Hamano
2020-09-21 8:42 ` Srinidhi Kaushik
2020-09-21 18:48 ` Junio C Hamano
2020-09-23 10:22 ` Srinidhi Kaushik
2020-09-23 16:47 ` Junio C Hamano
2020-09-21 13:19 ` Phillip Wood
2020-09-21 16:12 ` Junio C Hamano
2020-09-21 18:11 ` Junio C Hamano
2020-09-23 10:27 ` Srinidhi Kaushik
2020-09-19 17:03 ` [PATCH v4 2/3] push: parse and set flag " Srinidhi Kaushik
2020-09-19 20:26 ` Junio C Hamano
2020-09-19 17:03 ` [PATCH v4 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-09-19 20:42 ` Junio C Hamano
2020-09-23 7:30 ` [PATCH v5 0/3] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-23 7:30 ` [PATCH v5 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-09-23 10:18 ` Phillip Wood
2020-09-23 11:26 ` Srinidhi Kaushik
2020-09-23 16:24 ` Junio C Hamano
2020-09-23 16:29 ` Junio C Hamano
2020-09-23 7:30 ` [PATCH v5 2/3] push: parse and set flag " Srinidhi Kaushik
2020-09-23 7:30 ` [PATCH v5 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-09-23 10:24 ` Phillip Wood
2020-09-26 10:13 ` [PATCH v6 0/3] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-26 10:13 ` [PATCH v6 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-09-26 10:13 ` [PATCH v6 2/3] push: parse and set flag " Srinidhi Kaushik
2020-09-26 10:13 ` [PATCH v6 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-09-26 10:21 ` [PATCH v6 0/3] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-26 11:46 ` [PATCH v7 " Srinidhi Kaushik
2020-09-26 11:46 ` [PATCH v7 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-09-26 23:42 ` Junio C Hamano
2020-09-27 12:27 ` Srinidhi Kaushik [this message]
2020-09-26 11:46 ` [PATCH v7 2/3] push: parse and set flag " Srinidhi Kaushik
2020-09-26 11:46 ` [PATCH v7 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-09-27 14:17 ` [PATCH v8 0/3] push: add "--[no-]force-if-includes" Srinidhi Kaushik
2020-09-27 14:17 ` [PATCH v8 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-09-27 14:17 ` [PATCH v8 2/3] push: parse and set flag " Srinidhi Kaushik
2020-09-27 14:17 ` [PATCH v8 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-09-30 12:54 ` Philip Oakley
2020-09-30 14:27 ` Srinidhi Kaushik
2020-09-28 17:31 ` [PATCH v8 0/3] push: add "--[no-]force-if-includes" Junio C Hamano
2020-09-28 17:46 ` SZEDER Gábor
2020-09-28 19:34 ` Srinidhi Kaushik
2020-09-28 19:51 ` Junio C Hamano
2020-09-28 20:00 ` Junio C Hamano
2020-10-01 8:21 ` [PATCH v9 " Srinidhi Kaushik
2020-10-01 8:21 ` [PATCH v9 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-10-02 13:52 ` Johannes Schindelin
2020-10-02 14:50 ` Johannes Schindelin
2020-10-02 16:22 ` Junio C Hamano
2020-10-02 15:07 ` Srinidhi Kaushik
2020-10-02 16:41 ` Junio C Hamano
2020-10-02 19:39 ` Srinidhi Kaushik
2020-10-02 20:14 ` Junio C Hamano
2020-10-02 20:58 ` Srinidhi Kaushik
2020-10-02 21:36 ` Junio C Hamano
2020-10-02 16:26 ` Junio C Hamano
2020-10-01 8:21 ` [PATCH v9 2/3] push: parse and set flag " Srinidhi Kaushik
2020-10-01 8:21 ` [PATCH v9 3/3] t, doc: update tests, reference " Srinidhi Kaushik
2020-10-01 15:46 ` [PATCH v9 0/3] push: add "--[no-]force-if-includes" Junio C Hamano
2020-10-01 17:12 ` Junio C Hamano
2020-10-01 17:54 ` Srinidhi Kaushik
2020-10-01 18:32 ` Junio C Hamano
2020-10-02 16:50 ` Junio C Hamano
2020-10-02 19:42 ` Srinidhi Kaushik
2020-10-03 12:10 ` [PATCH v10 " Srinidhi Kaushik
2020-10-03 12:10 ` [PATCH v10 1/3] push: add reflog check for "--force-if-includes" Srinidhi Kaushik
2020-10-03 12:10 ` [PATCH v10 2/3] push: parse and set flag " Srinidhi Kaushik
2020-10-03 12:10 ` [PATCH v10 3/3] t, doc: update tests, reference " Srinidhi Kaushik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200927122745.GA6641@mail.clickyotomy.dev \
--to=shrinidhi.kaushik@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.