* Fetch on submodule update
From: Robert Dailey @ 2018-08-01 17:18 UTC
To: Git

Problem: I want to avoid recursively fetching submodules when I run a
`fetch` command, and instead defer that operation to the next
`submodule update`. Essentially I want `fetch.recurseSubmodules` to be
`false`, and `git submodule update` to do exactly what it does with
the `--remote` option, but still use the SHA1 of the submodule instead
of updating to the tip of the specified branch in the git modules
config.

I hope that makes sense. The reason for this ask is to
improve/streamline workflow in parent repositories. There are cases
where I want to quickly fetch only the parent repository, even if a
submodule changes, to perform some changes that do not require the
submodule itself (yet). Then at a later time, do `submodule update`
and have it automatically fetch when the SHA1 it's updating to does
not exist (because the former fetch operation for the submodule was
skipped). For my case, it's very slow to wait on submodules to
recursively fetch when I only wanted to fetch the parent repo for the
specific task I plan to do.

Is this possible right now through some variation of configuration?

* Re: Fetch on submodule update
From: Jonathan Nieder @ 2018-08-01 22:34 UTC
To: Robert Dailey; +Cc: Git, Stefan Beller

Hi,

Robert Dailey wrote:

> Problem: I want to avoid recursively fetching submodules when I run a
> `fetch` command, and instead defer that operation to the next
> `submodule update`. Essentially I want `fetch.recurseSubmodules` to be
> `false`, and `git submodule update` to do exactly what it does with
> the `--remote` option, but still use the SHA1 of the submodule instead
> of updating to the tip of the specified branch in the git modules
> config.
>
> I hope that makes sense. The reason for this ask is to
> improve/streamline workflow in parent repositories. There are cases
> where I want to quickly fetch only the parent repository, even if a
> submodule changes, to perform some changes that do not require the
> submodule itself (yet). Then at a later time, do `submodule update`
> and have it automatically fetch when the SHA1 it's updating to does
> not exist (because the former fetch operation for the submodule was
> skipped). For my case, it's very slow to wait on submodules to
> recursively fetch when I only wanted to fetch the parent repo for the
> specific task I plan to do.
>
> Is this possible right now through some variation of configuration?

Can you say more about the overall workflow?  This seems quite
different from what we've been designing --recurse-submodules around:

- avoiding the end user ever having to use the "git submodule"
  command, except to add, remove, or reconfigure submodules

- treating the whole codebase as something like one project, so that
  "git checkout --recurse-submodules <commit>" always checks out the
  same state

More details about the application would help with better
understanding whether it can fit into this framework, or whether it's
a case where you'd want to set "submodule.recurse" to false to have
more manual control.

Thanks and hope that helps,
Jonathan

* Re: Fetch on submodule update
From: Jonathan Nieder @ 2018-08-02 6:08 UTC
To: Robert Dailey; +Cc: Git, Stefan Beller

Hi again,

Robert Dailey wrote:

> Problem: I want to avoid recursively fetching submodules when I run a
> `fetch` command, and instead defer that operation to the next
> `submodule update`. Essentially I want `fetch.recurseSubmodules` to be
> `false`, and `git submodule update` to do exactly what it does with
> the `--remote` option, but still use the SHA1 of the submodule instead
> of updating to the tip of the specified branch in the git modules
> config.

I think I misread this the first time.  I got distracted by your
mention of the --remote option, but you mentioned you want to use the
SHA-1 of the submodule listed, so that was silly of me.

I think you'll find that "git fetch --no-recurse-submodules" and "git
submodule update" do exactly what you want.  "git submodule update"
does perform a fetch (unless you pass --no-fetch).

Let me know how it goes. :)

I'd still be interested in hearing more about the nature of the
submodules involved --- maybe `submodule.fetchJobs` would help, or
maybe this is a workflow where a tool that transparently fetches
submodules on demand like
https://gerrit.googlesource.com/gitfs/+/master/docs/design.md would be
useful (I'm not recommending using slothfs for this today, since it's
read-only, but it illustrates the idea).

Thanks,
Jonathan

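The two-step workflow suggested above can be sketched as a few shell
helpers. This is only an illustration: the function names are made up
for this example, while the git commands and the
`fetch.recurseSubmodules` setting are the ones named in the thread.

```shell
#!/bin/sh
# Hypothetical helper names; only the git invocations come from the thread.

# Fetch the superproject only; submodules are left untouched.
fetch_parent_only() {
    git fetch --no-recurse-submodules "$@"
}

# Later: check out the commits recorded in the superproject.
# "git submodule update" fetches inside a submodule when it needs to,
# unless --no-fetch is passed.
update_submodules_later() {
    git submodule update "$@"
}

# Optional: make the non-recursive fetch the default for this repository,
# so a plain "git fetch" no longer recurses into submodules.
disable_recursive_fetch() {
    git config fetch.recurseSubmodules false
}
```

With `disable_recursive_fetch` applied once, a plain `git fetch`
followed at some later point by `update_submodules_later` should give
the deferred behaviour asked for in the original question.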
* Re: Fetch on submodule update
From: Robert Dailey @ 2018-08-06 14:45 UTC
To: Jonathan Nieder; +Cc: Git, Stefan Beller

On Thu, Aug 2, 2018 at 1:08 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> I think I misread this the first time.  I got distracted by your
> mention of the --remote option, but you mentioned you want to use the
> SHA-1 of the submodule listed, so that was silly of me.
>
> I think you'll find that "git fetch --no-recurse-submodules" and "git
> submodule update" do exactly what you want.  "git submodule update"
> does perform a fetch (unless you pass --no-fetch).
>
> Let me know how it goes. :)
>
> I'd still be interested in hearing more about the nature of the
> submodules involved --- maybe `submodule.fetchJobs` would help, or
> maybe this is a workflow where a tool that transparently fetches
> submodules on demand like
> https://gerrit.googlesource.com/gitfs/+/master/docs/design.md would be
> useful (I'm not recommending using slothfs for this today, since it's
> read-only, but it illustrates the idea).

Hi thanks for your response, sorry I am a bit late getting back with
you.

Maybe my workflow is dated, because I'm still used to treating
submodules as distinctly separated and independent things. I realize
submodule recursion is becoming more inherent in many high level git
commands, but outside of git there are separation issues that make
this workflow doomed to be non-seamless. For example, pull requests
will never offer the same uniformity: You will still have 1 pull
request per submodule. There's also the issue of log audits: You
cannot use blame, log, bisect, or other "diagnostic" commands to
introspect into submodules "as if" they were subtree or something of
the like (i.e. truly part of the DAG). A more realistic example of one
of the common questions I still can't answer easily is: "How do you
determine which commit in a submodule made it into which release of
the software?" In the case where the parent repository has the
annotated tags (representing software release milestones), and the
submodule is just a common library (which does not have those tags and
has no release cycle). Anyway, none of these issues are particularly
related but they do contribute to the answer to your question
regarding my workflow and use cases. The list goes on but I hope you
get the idea.

Some of the more functional issues are performance related: I am aware
enough, at times, that I can save time (in both local operations and
network overhead) by skipping submodules. For example, if I know that
I'm merging mainline branches, I do not need to mess with the
submodules (I can fetch, merge, commit, push from the parent repo
without messing with the submodules. This saves me time). If
`fetchJobs` was also `updateJobs`, i.e. you could update submodules in
parallel too, that might make this less of an issue. Think of
repositories [like boost][1] that have (I think) over a hundred
sibling submodules: Fetching 8 in parallel *and* doing `submodule
update` in parallel 8 times might also speed things up. There's also
`git status`, that if it recurses into submodules, is also
significantly slow in the boost case (I'm not sure if it is
parallelized).

Again, none of this is particularly related, but just to give you more
context on the "why" for my ask. Sorry if I'm dragging this out too
far. The TLDR is that I do prefer the manual control. Automatic would
be great if submodules were treated as integrated in a similar manner
to subtree, but it's not there. I wasn't aware that `submodule update`
did a fetch, because sometimes if I do that, I get errors saying SHA1
is not present (because the submodule did not get fetched). Granted I
haven't seen this in a while, so maybe the fetch on submodule update
is a newer feature. Do you know what triggers the fetch on update
without --remote? Is it the missing SHA1 that triggers it, or is it
fetching unconditionally?

Thanks for confirming it behaves as I already wanted. And as you can
tell, I'm also happy to further discuss motivation / use cases /
details related to overall usage of submodules if you'd like. I'm
happy to help however I can!

[1]: https://github.com/boostorg/boost

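For the `fetchJobs` point raised above, a minimal sketch of the two
knobs git offers for parallel submodule fetching follows. The function
names are invented for this example and the default of 8 simply
mirrors the number used in the message. As far as I know these
settings parallelize only the clone/fetch phase, so the serial
checkout step described above is not addressed by them.

```shell
#!/bin/sh
# Hypothetical helper names; 8 is an arbitrary illustrative default.

# Persistently allow up to N submodules to be fetched or cloned at once
# in this repository.
set_submodule_fetch_jobs() {
    git config submodule.fetchJobs "${1:-8}"
}

# One-off alternative: request N parallel jobs for a single update run.
update_with_jobs() {
    git submodule update --jobs "${1:-8}"
}
```

Calling `set_submodule_fetch_jobs 8` once in a superproject such as
boost would let later submodule fetches run eight at a time.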
* Re: Fetch on submodule update
From: Jonathan Nieder @ 2018-08-06 15:41 UTC
To: Robert Dailey; +Cc: Git, Stefan Beller

Robert Dailey wrote:

>                                                      Automatic would be
> great if submodules were treated as integrated in a similar manner to
> subtree, but it's not there. I wasn't aware that `submodule update`
> did a fetch, because sometimes if I do that, I get errors saying SHA1
> is not present (because the submodule did not get fetched). Granted I
> haven't seen this in a while, so maybe the fetch on submodule update
> is a newer feature. Do you know what triggers the fetch on update
> without --remote? Is it the missing SHA1 that triggers it, or is it
> fetching unconditionally?

Thanks for this and the rest of the context you sent.  It's very
helpful.

The relevant code in git-submodule.sh is

    # Run fetch only if $sha1 isn't present or it
    # is not reachable from a ref.
    is_tip_reachable "$sm_path" "$sha1" ||
    fetch_in_submodule "$sm_path" $depth ||
    say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"

    # Now we tried the usual fetch, but $sha1 may
    # not be reachable from any of the refs
    is_tip_reachable "$sm_path" "$sha1" ||
    fetch_in_submodule "$sm_path" $depth "$sha1" ||
    die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")"

The fallback to fetching by SHA-1 was introduced in v2.8.0-rc0~9^2
(submodule: try harder to fetch needed sha1 by direct fetching sha1,
2016-02-23).

Jonathan

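To make the control flow of the quoted snippet concrete, here is a
small self-contained sketch. It does not call git at all: the two
helpers are simplified stand-ins written for this example (the real
ones take "$sm_path" and $depth and run git inside the submodule), and
`update_one`, its arguments, and the placeholder commit name are
assumptions. Only the `||` chains mirror the quoted code.

```shell
#!/bin/sh
# Simplified stand-ins for the helpers quoted from git-submodule.sh.
# They only model the question "is the commit available yet?".

sha1=0123abc   # placeholder commit name, not a real object

# $1: "present" or "missing"    (is the commit already reachable locally?)
# $2: "advertised" or "hidden"  (would a plain fetch bring it in?)
update_one() {
    state=$1
    remote=$2
    log=""

    is_tip_reachable() {
        [ "$state" = present ]
    }

    fetch_in_submodule() {
        if [ $# -eq 0 ]; then
            # Plain fetch: helps only if a remote ref reaches the commit.
            log="${log:+$log }plain-fetch"
            if [ "$remote" = advertised ]; then state=present; fi
        else
            # Direct fetch of the commit by name (assumed to succeed here).
            log="${log:+$log }fetch-by-sha1"
            state=present
        fi
    }

    # Same short-circuit shape as the quoted code: fetch only when the
    # commit is not already reachable.
    is_tip_reachable || fetch_in_submodule ||
        echo "Unable to fetch in submodule" >&2

    # Second attempt: ask for the commit directly if it is still missing.
    is_tip_reachable || fetch_in_submodule "$sha1" || {
        echo "Direct fetching of $sha1 failed" >&2
        return 1
    }

    echo "${log:-no-fetch}"
}

update_one present advertised   # prints: no-fetch
update_one missing advertised   # prints: plain-fetch
update_one missing hidden       # prints: plain-fetch fetch-by-sha1
```

The first case is the answer to the question asked: when the recorded
commit is already reachable, neither fetch runs, so the fetch is
triggered by the missing SHA-1 rather than being unconditional.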
* Re: Fetch on submodule update
From: Robert Dailey @ 2018-08-06 15:44 UTC
To: Jonathan Nieder; +Cc: Git, Stefan Beller

On Mon, Aug 6, 2018 at 10:41 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Robert Dailey wrote:
>
>>                                                      Automatic would be
>> great if submodules were treated as integrated in a similar manner to
>> subtree, but it's not there. I wasn't aware that `submodule update`
>> did a fetch, because sometimes if I do that, I get errors saying SHA1
>> is not present (because the submodule did not get fetched). Granted I
>> haven't seen this in a while, so maybe the fetch on submodule update
>> is a newer feature. Do you know what triggers the fetch on update
>> without --remote? Is it the missing SHA1 that triggers it, or is it
>> fetching unconditionally?
>
> Thanks for this and the rest of the context you sent.  It's very
> helpful.
>
> The relevant code in git-submodule.sh is
>
>     # Run fetch only if $sha1 isn't present or it
>     # is not reachable from a ref.
>     is_tip_reachable "$sm_path" "$sha1" ||
>     fetch_in_submodule "$sm_path" $depth ||
>     say "$(eval_gettext "Unable to fetch in submodule path '\$displaypath'")"
>
>     # Now we tried the usual fetch, but $sha1 may
>     # not be reachable from any of the refs
>     is_tip_reachable "$sm_path" "$sha1" ||
>     fetch_in_submodule "$sm_path" $depth "$sha1" ||
>     die "$(eval_gettext "Fetched in submodule path '\$displaypath', but it did not contain \$sha1. Direct fetching of that commit failed.")"
>
> The fallback to fetching by SHA-1 was introduced in v2.8.0-rc0~9^2
> (submodule: try harder to fetch needed sha1 by direct fetching sha1,
> 2016-02-23).

Yep, that's the root cause; I was basing my concerns on a legacy
issue. I just had avoided using `update` when I expected a fetch, so I
never saw the issue again, and thus didn't realize it was corrected.
Very helpful. Thanks again!

Thread overview: 6+ messages
2018-08-01 17:18 Fetch on submodule update Robert Dailey
2018-08-01 22:34 ` Jonathan Nieder
2018-08-02  6:08 ` Jonathan Nieder
2018-08-06 14:45   ` Robert Dailey
2018-08-06 15:41     ` Jonathan Nieder
2018-08-06 15:44       ` Robert Dailey