From: Taylor Blau <me@ttaylorr.com>
To: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Cc: Taylor Blau <me@ttaylorr.com>,
git@vger.kernel.org, christian.couder@gmail.com,
jonathantanmy@google.com,
Marc Strapetz <marc.strapetz@syntevo.com>
Subject: Re: Questions about partial clone with '--filter=tree:0'
Date: Wed, 21 Oct 2020 13:31:53 -0400
Message-ID: <20201021173153.GC1237181@nand.local>
In-Reply-To: <a4a20c67-4ee3-77b2-8d57-f30843572aa4@syntevo.com>
On Wed, Oct 21, 2020 at 07:10:02PM +0200, Alexandr Miloslavskiy wrote:
> We currently do not intend to use '--filter=tree:0' ourselves, but we are
> trying to support all kinds of user repositories with our UI. So we
> basically have these choices:
>
> A) Declare '--filter=tree:0' repos as completely wrong and unsupported
> in our UI, also giving an option to "un-partial" them.
>
> B) Support '--filter=tree:0' repos, but don't support operations such
> as blame and file log
>
> C) Use some magic to efficiently download objects that will be needed
> for a command such as Blame, while keeping the rest of the repository
> partial. This is where the command described in (3) will help a lot.
>
> We would of course prefer (C) if it's reasonably possible.

(C) is probably the most reasonable. If your repository has a promisor
remote and is missing objects, running 'git blame' etc. will
transparently download whatever objects it needs.

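As a rough illustration (a sketch only: it assumes a bare partial clone
whose promisor remote permits filtering, and 'list_missing' and
'blame_path' are helper names I am making up here), you can watch the
on-demand fetching by listing the promised-but-absent objects before and
after a blame:

```shell
#!/bin/sh
# Hypothetical helper: print the objects reachable from any ref of the
# repository "$1" that are absent locally. With --missing=print, rev-list
# reports each missing object as '?<oid>' rather than fetching it or
# failing.
list_missing() {
    git -C "$1" rev-list --objects --all --missing=print | grep '^?' || true
}

# Hypothetical helper: blame path "$2" at HEAD in repository "$1".
# Objects that blame needs but that are absent locally are requested
# from the promisor remote on demand.
blame_path() {
    git -C "$1" blame HEAD -- "$2"
}
```

Something like the following should then show the root tree disappear
from the missing list once blame has run (the totals need not shrink,
since newly fetched trees expose blobs that were not visible before):

    $ list_missing git.git
    $ blame_path git.git Makefile >/dev/null
    $ list_missing git.git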
> Unfortunately this does not work as expected. Try the following steps:
>
> A) Clone repo with '--filter=tree:0'
> $ git clone --bare --filter=tree:0 --branch master
> https://github.com/git/git.git
>
> B) Change filter to 'blob:none'
> $ cd git.git
> $ git config remote.origin.partialclonefilter 'blob:none'
>
> C) fetch
> $ git fetch origin
> Note that there is no 'Receiving objects:' output.

Ah; I would have thought that the server would have sent objects, even
though we have lots of 'have' lines, since we are treating the server as
a promisor remote and might not have the full reachability closure over
the haves.

Jonathan Tan knows better than I do here. Maybe he could chime in.

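In the meantime, one way to see what the client actually sends is to
trace the packet exchange. A minimal sketch ('trace_fetch' is a made-up
helper name; GIT_TRACE_PACKET is the real tracing variable), after which
you can look for 'want' and 'have' lines in the output:

```shell
#!/bin/sh
# Hypothetical helper: run 'git fetch origin' in repository "$1" with
# packet tracing enabled. The trace is written to stderr, so it is merged
# into stdout here to make it easy to grep.
trace_fetch() {
    GIT_TRACE_PACKET=1 git -C "$1" fetch origin 2>&1
}
```

For example:

    $ trace_fetch git.git | grep -E 'want|have'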
> > I think what you probably want is a step 1.5 to tell Git "I'm not going
> > to ask for or care about the entirety of my working copy, I really just
> > want objects in path...", and you can do that with sparse checkouts. See
> > https://git-scm.com/docs/git-sparse-checkout for more.
>
> For simplicity of discussion, let's focus on the problem of running
> Blame efficiently in a repo that was cloned with '--filter=tree:0'. In
> order to blame file '/1/2/Foo.txt', we will need the following:
>
> * Trees '/1'
> * Trees '/1/2'
> * Blobs '/1/2/Foo.txt'
>
> All of these will be needed to unknown commit depth. For simplicity,
> the proposed command will download these for all commits. Specifying
> a range of revisions could be nice, but I feel that it's not worth the
> complexity.
>
> Correct me if I'm wrong: I think that sparse checkout will not help to
> achieve the goal?
I see what you're saying. Here sparse-checkout and partial clones
confusingly diverge: what you really want is to say "I want all of the
objects that I need to construct this directory at any point in history"
so that you can run "git blame" on some path within that directory
without the need for a follow-up fetch.
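To make that set of objects concrete: in a repository that already has
everything (a full clone, say), a path-limited rev-list enumerates it.
This is only a sketch of which objects are meant, not a way to fetch
them, and 'objects_for_path' is a name I am inventing for illustration:

```shell
#!/bin/sh
# Hypothetical helper: list the commits that change path "$2" in the
# history of HEAD in repository "$1", together with the trees and blobs
# that lead to that path in those commits. Path limiting prunes both the
# commit walk and the object walk.
objects_for_path() {
    git -C "$1" rev-list --objects HEAD -- "$2"
}
```

For the example above that would be:

    $ objects_for_path /path/to/full/clone 1/2/Foo.txt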
> This is why I suggest a command that will accept paths and send
> requested objects, also forcing server to assume that all of them are
> missing in client's repository.

In any case the '--filter=sparse:<oid>' bit is not recommended for use,
but perhaps this is a convincing use-case. I didn't follow the partial
clone development closely enough to know whether this has already been
discussed, but I'm sure that it has.

Thanks,
Taylor