From: Atharva Raykar <raykar.ath@gmail.com>
To: Shourya Shukla <periperidip@gmail.com>
Cc: Git List <git@vger.kernel.org>,
Emily Shaffer <emilyshaffer@google.com>,
Jonathan Nieder <jrnieder@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Christian Couder <christian.couder@gmail.com>,
Kaartic Sivaraam <kaartic.sivaraam@gmail.com>,
Prathamesh Chavan <pc44800@gmail.com>
Subject: Re: [GSoC] [PATCH v2] submodule--helper: run update procedures from C
Date: Tue, 03 Aug 2021 15:37:46 +0530 [thread overview]
Message-ID: <m2zgtyaljh.fsf@gmail.com> (raw)
In-Reply-To: <CACdWUYXhckBkHLPnRDxxb=raAD0=7236jAzvneBLhw8fXvGTMw@mail.gmail.com>
(I am resending this email, because my client mangled the whitespaces
due to a misconfiguration. Please ignore the my previous message.)
Shourya Shukla <periperidip@gmail.com> writes:
> Le lun. 2 août 2021 à 18:36, Atharva Raykar <raykar.ath@gmail.com> a écrit :
>>
>> Add a new submodule--helper subcommand `run-update-procedure` that runs
>> the update procedure if the SHA1 of the submodule does not match what
>> the superproject expects.
>>
>> This is an intermediate change that works towards total conversion of
>> `submodule update` from shell to C.
>>
>> Specific error codes are returned so that the shell script calling the
>> subcommand can take a decision on the control flow, and preserve the
>> error messages across subsequent recursive calls of `cmd_update`.
>>
>> This patch could have been approached differently, by first changing the
>> `is_tip_reachable` and `fetch_in_submodule` shell functions to be
>> `submodule--helper` subcommands, and then following up with a patch that
>> introduces the `run-update-procedure` subcommand. We have not done it
>> like that because those functions are trivial enough to convert directly
>> along with these other changes. This lets us avoid the boilerplate and
>> the cleanup patches that will need to be introduced in following that
>> approach.
>
> I feel that this part is more suitable for a cover letter rather than the commit
> message itself. It is a useful piece of info though.
Okay, that seems right, the message does seem a bit too context-sensitive.
>> This change is more focused on doing a faithful conversion, so for now we
>> are not too concerned with trying to reduce subprocess spawns.
>>
>> Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
>> Mentored-by: Christian Couder <christian.couder@gmail.com>
>> Mentored-by: Shourya Shukla <periperidip@gmail.com>
>> ---
>>
>> Notable changes since v1:
>>
>> * Modified the code structure in
>> submodule--helper.c:run_update_command(), while fixing problems with
>> the translation marks.
>>
>> * Renamed '--sha1' and '--subsha1' options to '--oid' and '--suboid' to
>> since the argument is parsed into an object_id struct, not plain sha1
>> data.
>>
>> * Used option callbacks to parse the SHA1 arguments directly.
>>
>> * Moved the conditional out of 'do_run_update_procedure()'.
>>
>> Feedback required:
>>
>> Ævar felt that it would be clearer to populate the 'fatal' messages
>> after the run_command() operation in 'run_update_command()', to make it
>> more readable [1]. I have attempted something like that here, and it has led
>> to a lot more duplicated 'switch' statements, which feels suboptimal.
>> I'd appreciate suggestions to make it more legible.
>>
>> [1] https://lore.kernel.org/git/87r1fps63r.fsf@evledraar.gmail.com/
>>
>> Fetch-it-Via:
>> git fetch https://github.com/tfidfwastaken/git submodule-run-update-proc-list-2
>>
>> builtin/submodule--helper.c | 253 ++++++++++++++++++++++++++++++++++++
>> git-submodule.sh | 106 +++++----------
>> 2 files changed, 286 insertions(+), 73 deletions(-)
>>
>> diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
>> index d55f6262e9..b9c40324d0 100644
>> --- a/builtin/submodule--helper.c
>> +++ b/builtin/submodule--helper.c
>> @@ -2029,6 +2029,20 @@ struct submodule_update_clone {
>> .max_jobs = 1, \
>> }
>>
>> +struct update_data {
>> + const char *recursive_prefix;
>> + const char *sm_path;
>> + const char *displaypath;
>> + struct object_id oid;
>> + struct object_id suboid;
>> + struct submodule_update_strategy update_strategy;
>> + int depth;
>> + unsigned int force: 1;
>> + unsigned int quiet: 1;
>> + unsigned int nofetch: 1;
>> + unsigned int just_cloned: 1;
>> +};
>> +#define UPDATE_DATA_INIT { .update_strategy = SUBMODULE_UPDATE_STRATEGY_INIT }
>>
>> static void next_submodule_warn_missing(struct submodule_update_clone *suc,
>> struct strbuf *out, const char *displaypath)
>> @@ -2282,6 +2296,175 @@ static int git_update_clone_config(const char *var, const char *value,
>> return 0;
>> }
>> +
>> +static int run_update_command(struct update_data *ud, int subforce)
>> +{
>> + struct child_process cp = CHILD_PROCESS_INIT;
>> + char *oid = oid_to_hex(&ud->oid);
>> + int must_die_on_failure = 0;
>> +
>> + cp.dir = xstrdup(ud->sm_path);
>> + switch (ud->update_strategy.type) {
>> + case SM_UPDATE_CHECKOUT:
>> + cp.git_cmd = 1;
>> + strvec_pushl(&cp.args, "checkout", "-q", NULL);
>
> Would it be possible to add the 'if' statement above just before the
> 'switch' (or after,
> whichever seems okay) since this is common amongst (almost) all the cases?
I'll try it on once, if it makes the code more readable, I'll include it in the
reroll.
>> + if (subforce)
>> + strvec_push(&cp.args, "-f");
>> + break;
>> + case SM_UPDATE_REBASE:
>> + cp.git_cmd = 1;
>> + strvec_push(&cp.args, "rebase");
>> + if (ud->quiet)
>> + strvec_push(&cp.args, "--quiet");
>> + must_die_on_failure = 1;
>> + break;
>> + case SM_UPDATE_MERGE:
>> + cp.git_cmd = 1;
>> + strvec_push(&cp.args, "merge");
>> + if (ud->quiet)
>> + strvec_push(&cp.args, "--quiet");
>> + must_die_on_failure = 1;
>> + break;
>> + case SM_UPDATE_COMMAND:
>> + /* NOTE: this does not handle quoted arguments */
>> + strvec_split(&cp.args, ud->update_strategy.command);
>> + must_die_on_failure = 1;
>> + break;
>> + case SM_UPDATE_UNSPECIFIED:
>> + case SM_UPDATE_NONE:
>> + BUG("update strategy should have been specified");
>> + }
>
> If the original did not bug out, do we need to? The documentation does
> not mention
> this as well:
> https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-none
This was how the original shell porcelain did it:
case "$update_module" in
checkout)
command="git checkout $subforce -q"
die_msg="$(eval_gettext "Unable to checkout '\$sha1' in submodule path '\$displaypath'")"
say_msg="$(eval_gettext "Submodule path '\$displaypath': checked out '\$sha1'")"
;;
rebase)
command="git rebase ${GIT_QUIET:+--quiet}"
die_msg="$(eval_gettext "Unable to rebase '\$sha1' in submodule path '\$displaypath'")"
say_msg="$(eval_gettext "Submodule path '\$displaypath': rebased into '\$sha1'")"
must_die_on_failure=yes
;;
merge)
command="git merge ${GIT_QUIET:+--quiet}"
die_msg="$(eval_gettext "Unable to merge '\$sha1' in submodule path '\$displaypath'")"
say_msg="$(eval_gettext "Submodule path '\$displaypath': merged in '\$sha1'")"
must_die_on_failure=yes
;;
!*)
command="${update_module#!}"
die_msg="$(eval_gettext "Execution of '\$command \$sha1' failed in submodule path '\$displaypath'")"
say_msg="$(eval_gettext "Submodule path '\$displaypath': '\$command \$sha1'")"
must_die_on_failure=yes
;;
*)
die "$(eval_gettext "Invalid update mode '$update_module' for submodule path '$path'")"
esac
The fallthrough case used to die, but I noticed that this branch will never get
activated. This is because the 'update-clone' helper will not output any entry
that has the update mode set to 'none', and thus the `while` loop that contains
this code would never run.
Which is why I decided to BUG out on that case, because that state should never
be reached. But I see the source of confusion, and maybe I should have different
BUG messages for SM_UPDATE_UNSPECIFIED and SM_UPDATE_NONE. The latter should
probably say "should have been handled by update-clone".
>> +
>> + strvec_push(&cp.args, oid);
>> +
>> + prepare_submodule_repo_env(&cp.env_array);
>> +
>> + if (run_command(&cp)) {
>> + if (must_die_on_failure) {
>> + switch (ud->update_strategy.type) {
>> + case SM_UPDATE_CHECKOUT:
>> + die(_("Unable to checkout '%s' in submodule path '%s'"),
>> + oid, ud->displaypath);
>> + break;
>> + case SM_UPDATE_REBASE:
>> + die(_("Unable to rebase '%s' in submodule path '%s'"),
>> + oid, ud->displaypath);
>> + break;
>> + case SM_UPDATE_MERGE:
>> + die(_("Unable to merge '%s' in submodule path '%s'"),
>> + oid, ud->displaypath);
>> + break;
>> + case SM_UPDATE_COMMAND:
>> + die(_("Execution of '%s %s' failed in submodule path '%s'"),
>> + ud->update_strategy.command, oid, ud->displaypath);
>> + break;
>> + case SM_UPDATE_UNSPECIFIED:
>> + case SM_UPDATE_NONE:
>> + BUG("update strategy should have been specified");
>> + }
>> + }
>> + /*
>> + * This signifies to the caller in shell that
>> + * the command failed without dying
>> + */
>> + return 1;
>> + }
>> +
>> + switch (ud->update_strategy.type) {
>> + case SM_UPDATE_CHECKOUT:
>> + printf(_("Submodule path '%s': checked out '%s'\n"),
>> + ud->displaypath, oid);
>> + break;
>> + case SM_UPDATE_REBASE:
>> + printf(_("Submodule path '%s': rebased into '%s'\n"),
>> + ud->displaypath, oid);
>> + break;
>> + case SM_UPDATE_MERGE:
>> + printf(_("Submodule path '%s': merged in '%s'\n"),
>> + ud->displaypath, oid);
>> + break;
>> + case SM_UPDATE_COMMAND:
>> + printf(_("Submodule path '%s': '%s %s'\n"),
>> + ud->displaypath, ud->update_strategy.command, oid);
>> + break;
>> + case SM_UPDATE_UNSPECIFIED:
>> + case SM_UPDATE_NONE:
>> + BUG("update strategy should have been specified");
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int do_run_update_procedure(struct update_data *ud)
>> +{
>> + int subforce = is_null_oid(&ud->suboid) || ud->force;
>> +
>> + if (!ud->nofetch) {
>> + /*
>> + * Run fetch only if `oid` isn't present or it
>> + * is not reachable from a ref.
>> + */
>> + if (!is_tip_reachable(ud->sm_path, &ud->oid))
>> + if (fetch_in_submodule(ud->sm_path, ud->depth, ud->quiet, NULL) &&
>> + !ud->quiet)
>> + fprintf_ln(stderr,
>> + _("Unable to fetch in submodule path '%s'; "
>> + "trying to directly fetch %s:"),
>> + ud->displaypath, oid_to_hex(&ud->oid));
>
> I was wondering if an OID is invalid, will it be counted as
> unreachable and vice-versa?
> If that is the case then that would simplify the work.
Could you elaborate? I'm not sure what you mean by 'invalid' in this context. I
don't think this code will receive any kind of malformed oid--they come from
'update-clone' which handles it correctly.
As far as I can tell, the only way to check if a particular OID is unreachable
is when we check if all the refs cannot find it.
next prev parent reply other threads:[~2021-08-03 10:08 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-07-22 13:40 [GSoC] [PATCH] submodule--helper: run update procedures from C Atharva Raykar
2021-07-23 9:37 ` Ævar Arnfjörð Bjarmason
2021-07-23 16:59 ` Atharva Raykar
2021-08-04 8:35 ` Atharva Raykar
2021-08-02 13:06 ` [GSoC] [PATCH v2] " Atharva Raykar
2021-08-02 18:50 ` Shourya Shukla
2021-08-03 8:46 ` Atharva Raykar
2021-08-03 10:07 ` Atharva Raykar [this message]
2021-08-13 7:56 ` [GSoC] [PATCH v3] " Atharva Raykar
2021-08-13 18:32 ` Junio C Hamano
2021-08-24 8:58 ` Atharva Raykar
2021-08-24 14:06 ` [PATCH v4] " Atharva Raykar
2021-09-08 0:14 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=m2zgtyaljh.fsf@gmail.com \
--to=raykar.ath@gmail.com \
--cc=christian.couder@gmail.com \
--cc=emilyshaffer@google.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jrnieder@gmail.com \
--cc=kaartic.sivaraam@gmail.com \
--cc=pc44800@gmail.com \
--cc=periperidip@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).