From: Thomas Rast <trast@student.ethz.ch>
To: Felipe Tanus <fotanus@gmail.com>
Cc: <git@vger.kernel.org>
Subject: Re: [GSoC] Improving parallelism
Date: Wed, 21 Mar 2012 13:45:21 +0100
Message-ID: <87haxit95q.fsf@thomas.inf.ethz.ch>
In-Reply-To: <CANELHzNc+28ZDiZ69zv3X0DJMf0DTkiZXQD1-32Wsy-=vtWDhw@mail.gmail.com> (Felipe Tanus's message of "Sat, 17 Mar 2012 19:18:09 -0300")

Felipe Tanus <fotanus@gmail.com> writes:

> My proposal will most likely follow one of the proposed idea entitled
> "Improving parallelism in various commands". I'm very used to C
> programming, and pthreads is my friend, so I'm the right guy for this
> job. The downside is that I never looked at the git source code
> before, and I expect the most challenging step from the project is to
> find where parallelism can be further explored. For this, I count on
> my skill in C programming, a good mentor to help me to go through the
> code and evaluate my ideas.
>
> I find the idea of the proposal straight-forward, and no doubts pop up
> in my mind, except on what commands can I work on. The idea described
> in the wiki tells that the commands "git grep --cached" and "git grep
> COMMIT" need this improvement, and most likely "git diff" and "git log
> -p" need too. That is a good start, but if you know already other
> commands that might benefit from this parallelism, please tell me in
> order for me to include in my proposal.

As the ideas page says, the steps are as follows (the original wording
was that it would have 2.5 steps, hence "the half-step"):

 0. In preparation (the half-step): identify commands that could benefit
    from parallelism. git grep --cached and git grep COMMIT come to
    mind, but most likely also git diff and git log -p. You can probably
    find more.

 1. Rework the pack access mechanisms to allow the maximum possible
    parallel access.

 2. Rework the commands found in step 0 to use parallel pack access if
    possible. Along the way, document the improvements with performance
    tests.

I think (1.) is the most important part simply because without (1.) the
other two are totally meaningless.  So I'd rather you not focus too hard
on the command list.  However, correctly identifying more commands where
pack access is the hotspot, and backing that up with numbers, may be a
good way to show your understanding of the matter.
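
To make the point about (1.) concrete, here is a minimal pthreads sketch.
It is not taken from git's source: read_object(), parallel_count() and the
two locks are hypothetical names, and the "objects" are plain strings. It
assumes the object store is not thread-safe, so every read goes through a
single lock, and only the search over the data that was read runs in
parallel. If the read is the expensive part, adding threads gains little
until the locked region shrinks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

/* Hypothetical locks: one for the work queue, one for "object" reads. */
static pthread_mutex_t work_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t read_lock = PTHREAD_MUTEX_INITIALIZER;

struct work {
	const char **items;	/* stand-ins for object names */
	size_t n;
	size_t next;		/* next unclaimed item, guarded by work_lock */
	char needle;
	size_t hits;		/* running total, guarded by work_lock */
};

/* Stand-in for a non-thread-safe object read; callers hold read_lock. */
static const char *read_object(const char *name)
{
	return name;
}

/* The part that can run in parallel: scan the data for the needle. */
static size_t count_char(const char *s, char c)
{
	size_t found = 0;
	for (; *s; s++)
		if (*s == c)
			found++;
	return found;
}

static void *worker(void *arg)
{
	struct work *w = arg;

	for (;;) {
		/* Claim one item from the shared queue. */
		pthread_mutex_lock(&work_lock);
		if (w->next >= w->n) {
			pthread_mutex_unlock(&work_lock);
			break;
		}
		size_t i = w->next++;
		pthread_mutex_unlock(&work_lock);

		/* Serialized section: all threads queue up here. */
		pthread_mutex_lock(&read_lock);
		const char *data = read_object(w->items[i]);
		pthread_mutex_unlock(&read_lock);

		/* Parallel section: no lock held while searching. */
		size_t found = count_char(data, w->needle);

		pthread_mutex_lock(&work_lock);
		w->hits += found;
		pthread_mutex_unlock(&work_lock);
	}
	return NULL;
}

/* Count occurrences of needle across all items using nthreads workers. */
size_t parallel_count(const char **items, size_t n, char needle, int nthreads)
{
	struct work w = { items, n, 0, needle, 0 };
	pthread_t tids[MAX_THREADS];

	if (nthreads < 1)
		nthreads = 1;
	if (nthreads > MAX_THREADS)
		nthreads = MAX_THREADS;

	for (int i = 0; i < nthreads; i++) {
		int rc = pthread_create(&tids[i], NULL, worker, &w);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);

	return w.hits;
}
```

Build with -pthread and call parallel_count() on an array of strings. The
result does not depend on the thread count; only the time spent waiting on
read_lock does, which is the thing the performance tests would measure.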

For further reading, start with the discussions of git-grep threading,
for example:

  http://thread.gmane.org/gmane.comp.version-control.git/185932/focus=186217
  http://thread.gmane.org/gmane.comp.version-control.git/186618
  http://thread.gmane.org/gmane.comp.version-control.git/188701/focus=189592

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
