git.vger.kernel.org archive mirror
From: Shubham Kanodia <shubham.kanodia10@gmail.com>
To: Han Young <hanyang.tony@bytedance.com>,
	Junio C Hamano <gitster@pobox.com>
Cc: Jonathan Tan <jonathantanmy@google.com>,
	Burke Libbey <burke.libbey@shopify.com>,
	git@vger.kernel.org
Subject: Re: [External] Re: git-blame extremely slow in partial clones due to serial object fetching
Date: Fri, 22 Nov 2024 09:02:08 +0530
Message-ID: <972d0904-650b-4161-a13c-e3081d55a212@gmail.com>
In-Reply-To: <CAG1j3zEEN5EJwTsM3q87gCSqXG4+=DZVvcQdDhoj5Epe-S0nPw@mail.gmail.com>



On 21/11/24 8:42 am, Han Young wrote:
> On Thu, Nov 21, 2024 at 7:00 AM Junio C Hamano <gitster@pobox.com> wrote:
>>>   - We could also teach the server to "blame" a file for us and then
>>>     teach the client to stitch together the server's result with the
>>>     local findings, but this is more complicated.
>>
>> Your local lazy repository, if you have anything you have to "stitch
>> together", would have your locally modified contents, and for you to
>> be able to make such modifications, it would also have at least the
>> blobs from HEAD, which you based your modifications on.  So you
>> should be able to locally run "git blame @{u}.." to find lines that
>> your locally modified contents are to be blamed, ask the other side
>> to give you a blame for @{u}, and overlay the former on top of the
>> latter.
>>
> 
> In $DAY_JOB, we modified the server to run blame for the client.
> To deal with changes not yet pushed to the server, we let the client
> pack the local-only blobs for the blamed file, along with the
> local-only commits that touch that file, into one packfile and send a
> 'remote-blame' request to the server.
> 
> The server then unpacks the relevant objects into memory
> (by reusing code from git-unpack-objects), runs the blame, and returns
> the result to the client. This way we avoid running blame twice
> and interleaving the results.
> 
> It works quite well in very large repos; with result caching, it
> can be even faster than a local blame on a full repo.

In a large partially cloned repo that I have, a single `git blame` can
take several minutes and many network round trips.

Junio, would it make sense to add an option (and config) for `git
blame` that limits how far back it looks when fetching blobs? This would
prevent someone from accidentally firing off several cascading calls as
they open new files in an editor that runs git blame by default
(IntelliJ) or with popular plugins (GitLens for VSCode), which can start
up multiple heavy git processes and bring a user's system to a crawl.
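
For reference, blame can already be bounded by hand today, either with
`--since=<date>` or with a revision range such as `v2.6.18..`. A rough
sketch of a wrapper that an editor integration could call is below. The
helper names and the "6.months" default are only illustrative
assumptions, not an existing option or config, and I am assuming (not
having measured it) that a bounded walk also means fewer lazy blob
fetches in a partial clone.

```python
import subprocess
from typing import List, Optional


def bounded_blame_cmd(path: str,
                      since: Optional[str] = None,
                      rev_range: Optional[str] = None) -> List[str]:
    """Build a `git blame` command line that stops at a boundary.

    `since` is passed through as --since=<date> and `rev_range` as a
    revision range (for example "v2.6.18.."). Both already exist in
    git-blame; this helper itself is hypothetical.
    """
    cmd = ["git", "blame"]
    if since is not None:
        cmd.append(f"--since={since}")
    if rev_range is not None:
        cmd.append(rev_range)
    # "--" separates revisions/options from the file name.
    cmd += ["--", path]
    return cmd


def run_bounded_blame(path: str, since: str = "6.months") -> str:
    """Run the bounded blame in the current repository and return its output."""
    result = subprocess.run(bounded_blame_cmd(path, since=since),
                            check=True, capture_output=True, text=True)
    return result.stdout
```

Lines older than the boundary are attributed to the boundary commit
(shown with a leading `^`), so the output is incomplete by design. That
is the trade-off a built-in limit would have to make as well.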

