git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Tan <pyokagan@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git List <git@vger.kernel.org>,
	Matthieu Moy <Matthieu.Moy@grenoble-inp.fr>
Subject: Re: [RFC/GSoC] Proposal: Make git-pull and git-am builtins
Date: Wed, 25 Mar 2015 13:55:22 +0800	[thread overview]
Message-ID: <CACRoPnTL78ZrqPYYNPkNcW7vConHZXZ66z5P2=3HySrtmZKcrg@mail.gmail.com> (raw)
In-Reply-To: <xmqqwq265jjl.fsf@gitster.dls.corp.google.com>

Hi,

On Wed, Mar 25, 2015 at 2:37 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Paul Tan <pyokagan@gmail.com> writes:
>
>> ..., I propose the following requirements for the rewritten code:
>>
>> 1. No spawning of external git processes. This is to support systems with high
>>    ``fork()`` or process creation overhead, and to reduce redundant IO by
>>    taking advantage of the internal object, index and configuration cache.
>
> I suspect this may probably be too strict in practice.
>
> True, we should never say "run_command_capture()" just to to read
> from "git rev-parse"---we should just call get_sha1() instead.
>
> But for a complex command whose execution itself far outweighs the
> cost of forking, I do not think it is fair to say your project
> failed if you chose to run_command() it.  For example, it may be
> perfectly OK to invoke "git merge" via run_command().

Yes, which is why I proposed writing a baseline using only the
run-command APIs first. Any other optimizations can then be done
selectively after that.

I think it's still good to have the ideal in mind though (and whoops I
forgot to put in the word "ideal" in the text).

>
>> 3. The resulting builtin should not have wildly different behavior or bugs
>>    compared to the shell script.
>
> This on the other hand is way too loose.
>
> The original and the port must behave identically, unless the
> difference is fixing bugs in the original.
>

I was considering that there may be slight behavioral changes when the
rewritten code is modified to take greater advantage of the internal
API, especially since some builtins due to historical issues, may have
duplicated code from the internal API[1].

[1] I'm suspecting that the implementation of --merge-base in
show-branch.c re-implements get_octopus_merge_bases().

>> Potential difficulties
>> =======================
>>
>> Rewriting code may introduce bugs
>> ...
>
> Yes, but that is a reasonable risk you need to manage to gain the
> benefit from this project.
>
>> Of course, the downside of following this too strictly is that if there were
>> any logical bugs in the original code, or if the original code is unclear, the
>> rewritten code would inherit these problems too.
>
> I'd repeat my comment on the 3. above.  Identifying and fixing bugs
> is great, but otherwise don't worry about this too much.
>
> Being bug-to-bug compatible with the original is way better than
> introducing new bugs of an unknown nature.

Well yes, but I was thinking that if there are any glaring errors in
the original source then it would be better to fix these errors during
the rewrite than wasting time writing code that replicates these
errors.

>> Rewritten code may become harder to understand
>> ...
>
> And also it may become harder to modify.
>
> That is the largest problem with any rewrite, and we should spend
> the most effort to avoid it.
>
> A new bugs introduced we can later fix as long as the result is
> understandable and maintainable.
>
>> For the purpose of reducing git's dependencies, the rewritten C code should not
>> depend on other libraries or executables other than what is already available
>> to git builtins.
>
> Perhaps misphrased; see below.

In this case I was thinking of "making git depend on another project".
(e.g, using an external regular expression library). Of course a
balance has to be made in this aspect (thus the use of "should not"),
but git-pull and git-am are relatively simple so there should be no
need for that,

>
>> We can see that the C version requires much more lines compared to the shell
>> pipeline,...
>
> That is something you would solve by introducing reusable code in
> run_command API, isn't it?  That is how various rewrites in the past
> did, and this project should do so too.  You should aim to do this
> project by not just using "what is already available", but adding
> what you discover is a useful reusable pattern into a set of new
> functions in the "already available" API set.

Whoops, forgot to mention that here. (A brief mention was made on this
kind of refactoring in the Development Approach).

Thank you for your review.

  reply	other threads:[~2015-03-25  5:55 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-24 16:37 [RFC/GSoC] Proposal: Make git-pull and git-am builtins Paul Tan
2015-03-24 17:22 ` Paul Tan
2015-03-24 17:31   ` Matthieu Moy
2015-03-24 18:37 ` Junio C Hamano
2015-03-25  5:55   ` Paul Tan [this message]
2015-03-25 17:54     ` Junio C Hamano
2015-03-26  5:02       ` Paul Tan
2015-03-25  9:07 ` [RFC/GSoC v2] " Paul Tan
2015-03-25 20:32 ` [RFC/GSoC] " Sebastian Schuberth

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACRoPnTL78ZrqPYYNPkNcW7vConHZXZ66z5P2=3HySrtmZKcrg@mail.gmail.com' \
    --to=pyokagan@gmail.com \
    --cc=Matthieu.Moy@grenoble-inp.fr \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).