* Sensible way to see what objects are being fetched just-in-time in a partial clone?
@ 2024-08-26 16:38 Tao Klerks
From: Tao Klerks @ 2024-08-26 16:38 UTC (permalink / raw)
  To: git

Hi folks,

When working with partial (filtered) clone repos, there are situations
where objects get fetched just-in-time - e.g. during a "git blame", if
you did a "blob:none" filtered clone, you can easily end up with
hundreds of fetches as git iterates backwards through the file
history.

I was trying to write a "git blame optimizer" to pre-fetch all the
suitable blobs, and it wasn't working right, so the "git blame" was
still fetching stuff - but I couldn't see what it was fetching (which
made it hard to investigate the bug in my script).

I did end up getting a list of some just-in-time fetched blobs, by
dumping a list of *all* the object IDs I had locally, before and after
a still-fetching-stuff "git blame" run, and doing a before/after
comparison of the resulting list of objects. To get the list of
objects found locally I did:

git cat-file --batch-check='%(objectname)' --batch-all-objects --unordered

(ref: a conversation with Peff last year:
https://lore.kernel.org/git/20230621064459.GA607974@coredump.intra.peff.net/
)
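
The before/after comparison described above could be scripted roughly as
follows. This is only a sketch: the function names and the example path
are hypothetical, and it assumes nothing else (gc, repack, another fetch)
touches the object store between the two snapshots. It only reports which
objects appeared locally, not how they were batched into fetches.

```python
import subprocess
from typing import Iterable, List, Set


def parse_object_ids(lines: Iterable[str]) -> Set[str]:
    """Collect the non-empty object IDs, one per input line."""
    return {line.strip() for line in lines if line.strip()}


def list_local_objects(repo: str = ".") -> Set[str]:
    """Snapshot the IDs of all objects present in the local object store."""
    result = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objectname)", "--batch-all-objects", "--unordered"],
        check=True, capture_output=True, text=True,
    )
    return parse_object_ids(result.stdout.splitlines())


def newly_fetched(before: Set[str], after: Set[str]) -> Set[str]:
    """Objects present after the command that were not present before."""
    return after - before


def objects_fetched_by(command: List[str], repo: str = ".") -> List[str]:
    """Run a command in the repo and report the objects that appeared locally."""
    before = list_local_objects(repo)
    # Output of the command itself is discarded; only the side effect matters.
    subprocess.run(command, cwd=repo, check=False, stdout=subprocess.DEVNULL)
    after = list_local_objects(repo)
    return sorted(newly_fetched(before, after))
```

For example, objects_fetched_by(["git", "blame", "path/to/file"]) would
list the object IDs that showed up during that blame run (the path is a
placeholder).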

This was a sucky process though - and I was very surprised that I
couldn't see what was being fetched (what the stdin content to the
just-in-time fetch calls was) with any of the trace env vars that I
was able to find documented: GIT_TRACE, GIT_CURL_VERBOSE,
GIT_TRACE_PERFORMANCE, GIT_TRACE_PACK_ACCESS, GIT_TRACE_PACKET,
GIT_TRACE_PACKFILE, GIT_TRACE_SETUP, GIT_TRACE_SHALLOW.

The only things I could easily see were the *args* passed to nested git
processes.

Is there any way to see what a just-in-time fetch is fetching? Or any
way to see the content passed around on stdin in nested git processes?

Thanks,
Tao

