git.vger.kernel.org archive mirror
From: Jeff Hostetler <git@jeffhostetler.com>
To: Jonathan Nieder <jrnieder@gmail.com>, Vitaly Arbuzov <vit@uber.com>
Cc: git@vger.kernel.org, Konstantin Khomoutov <kostix@bswap.ru>,
	git-users@googlegroups.com, jonathantanmy@google.com,
	Christian Couder <christian.couder@gmail.com>
Subject: Re: How hard would it be to implement sparse fetching/pulling?
Date: Fri, 1 Dec 2017 11:03:07 -0500
Message-ID: <e93127b9-d6d6-dcf2-3d58-dc83d68d5d20@jeffhostetler.com>
In-Reply-To: <20171130200341.GA20640@aiede.mtv.corp.google.com>



On 11/30/2017 3:03 PM, Jonathan Nieder wrote:
> Hi Vitaly,
> 
> Vitaly Arbuzov wrote:
> 
>> Found some details here: https://github.com/jeffhostetler/git/pull/3
>>
>> Looking at commits I see that you've done a lot of work already,
>> including packing, filtering, fetching, cloning etc.
>> What are some areas that aren't complete yet? Do you need any help
>> with implementation?
> 
> That's a great question!  I've filed https://crbug.com/git/2 to track
> this project.  Feel free to star it to get updates there, or to add
> updates of your own.

Thanks!

> 
> As described at https://crbug.com/git/2#c1, currently there are three
> patch series for which review would be very welcome.  Building on top
> of them is welcome as well.  Please make sure to coordinate with
> jeffhost@microsoft.com and jonathantanmy@google.com (e.g. through the
> bug tracker or email).
> 
> One piece of missing functionality that looks interesting to me: that
> series batches fetches of the missing blobs involved in a "git
> checkout" command:
> 
>   https://public-inbox.org/git/20171121211528.21891-14-git@jeffhostetler.com/
> 
> But it doesn't batch fetches of the missing blobs involved in a "git
> diff <commit> <commit>" command.  That might be a good place to get
> your hands dirty. :)

Jonathan Tan added code in unpack-trees to bulk fetch missing blobs
before a checkout.  This is limited to the missing blobs needed for
the target commit.  We need this to make checkout seamless, but it
does mean that checkout may need online access.

I've also talked about a pre-fetch capability to bulk fetch missing
blobs in advance of some operation.  That could speed up the above
diff command, or back-fill all the blobs you might need before going
offline for a while.

You can use the options that were added to rev-list to help with this.
For example:
     git rev-list --objects [--filter=<fs>] --missing=print <commit1>
     git rev-list --objects [--filter=<fs>] --missing=print <c1>..<c2>
And then pipe that into a "git fetch-pack --stdin".

You might experiment with this.
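
As a starting point, the pipeline above might be sketched as follows.
This is only a sketch under stated assumptions: "missing_oids" is a
hypothetical helper name, it assumes that --missing=print marks each
missing object with a leading "?" (as the rev-list documentation
describes), and the fetch-pack step assumes the server is willing to
serve objects requested by ID.  The remote URL and commit range are
placeholders.

```shell
# Hypothetical helper: read "git rev-list --objects --missing=print"
# output on stdin, keep only the missing-object lines (assumed to be
# of the form "?<oid>"), and strip the leading "?".
missing_oids() {
    sed -n 's/^?//p'
}

# Intended use (not run here; <c1>, <c2> and <remote-url> are placeholders):
#
#   git rev-list --objects --missing=print <c1>..<c2> |
#       missing_oids |
#       git fetch-pack --stdin <remote-url>
```

Lines for objects that are already present locally carry no "?" prefix,
so the filter drops them and only the missing IDs reach fetch-pack.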


> 
> Thanks,
> Jonathan
> 

Thanks,
Jeff



Thread overview: 26+ messages
2017-11-30  3:16 How hard would it be to implement sparse fetching/pulling? Vitaly Arbuzov
2017-11-30 14:24 ` Jeff Hostetler
2017-11-30 17:01   ` Vitaly Arbuzov
2017-11-30 17:44     ` Vitaly Arbuzov
2017-11-30 20:03       ` Jonathan Nieder
2017-12-01 16:03         ` Jeff Hostetler [this message]
2017-12-01 18:16           ` Jonathan Nieder
2017-11-30 23:43       ` Philip Oakley
2017-12-01  1:27         ` Vitaly Arbuzov
2017-12-01  1:51           ` Vitaly Arbuzov
2017-12-01  2:51             ` Jonathan Nieder
2017-12-01  3:37               ` Vitaly Arbuzov
2017-12-02 16:59               ` Philip Oakley
2017-12-01 14:30             ` Jeff Hostetler
2017-12-02 16:30               ` Philip Oakley
2017-12-04 15:36                 ` Jeff Hostetler
2017-12-05 23:46                   ` Philip Oakley
2017-12-02 15:04           ` Philip Oakley
2017-12-01 17:23         ` Jeff Hostetler
2017-12-01 18:24           ` Jonathan Nieder
2017-12-04 15:53             ` Jeff Hostetler
2017-12-02 18:24           ` Philip Oakley
2017-12-05 19:14             ` Jeff Hostetler
2017-12-05 20:07               ` Jonathan Nieder
2017-12-01 15:28       ` Jeff Hostetler
2017-12-01 14:50     ` Jeff Hostetler
