Git development


* [RFC-PATCHv2] submodules: add a background story
From: Stefan Beller @ 2017-02-09  2:08 UTC
  Cc: git, bmwill, Stefan Beller

Just like gitmodules(5), gitattributes(5), gitcredentials(7),
gitnamespaces(7), gittutorial(7), we'd like to provide some background
on submodules that is not specific to the `submodule` command, but
elaborates on the concept and its intended usage.

Add gitsubmodules(7), which explains the states, structure and usage of
submodules.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

This would replace the last patch of sb/submodule-doc, though it's still
RFC. In this revision I took care of the technical details (i.e. proper
formatting, spelling), and made only slight rewordings of the text.

The main issue persists; see bottom of the patch:

  SAMPLE WORKFLOWS (RFC/TODO)
  ---------------------------
  
  Do we need
  
  * an opinionated way to check for a specific state of a submodule
  * (submodule helper to be plumbing?)
  * expose the design mistake of having the (name->path) mapping inside the
    working tree, i.e. never remove a name from the submodule config even when
    the submodule doesn't exist any more.
    
Any opinion on these would be welcome!
Thanks,
Stefan
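
As a possible starting point for the first question, the three indicators
from the STATES section of the new page can be mapped to a state name with
a few lines of shell. This is only a sketch: `classify_state` and
`submodule_state` are hypothetical helpers, not existing git commands, and
the way the working tree and git directory are probed (a `.git` entry under
the submodule path, a directory under `$GIT_DIR/modules/<name>`) is an
assumption on top of the text.

```shell
# Map the three yes/no indicators (URL config, working tree, git dir)
# to a state name, following the table in the STATES section.
classify_state() {
	case "$1/$2/$3" in
	no/no/no)    echo uninitialized ;;
	yes/no/no)   echo initialized ;;
	yes/yes/yes) echo populated ;;
	yes/no/yes)  echo depopulated ;;
	no/no/yes)   echo deinitialized ;;
	no/yes/yes)  echo uninteresting ;;
	*/yes/no)    echo invalid ;;
	*)           echo "unknown indicators: $1 $2 $3" >&2; return 1 ;;
	esac
}

# Hypothetical probe; run from the top level of the superproject.
# Usage: submodule_state <name> <path>
# The body runs in a subshell so its variables do not leak to the caller.
submodule_state() (
	name=$1
	path=$2
	gitdir=$(git rev-parse --git-dir) || exit 1
	url=no
	worktree=no
	dir=no
	# Indicator 1: is submodule.<name>.url set in the configuration?
	git config --get "submodule.$name.url" >/dev/null && url=yes
	# Indicator 2 (assumption): a populated working tree has a .git entry.
	[ -e "$path/.git" ] && worktree=yes
	# Indicator 3: git dir under $GIT_DIR/modules/<name> or inside the path.
	if [ -d "$gitdir/modules/$name" ] || [ -d "$path/.git" ]
	then
		dir=yes
	fi
	classify_state "$url" "$worktree" "$dir"
)
```

Keeping the table lookup separate from the probing means the mapping can be
checked on its own, e.g. `classify_state yes no yes` prints `depopulated`.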

 Documentation/Makefile          |   1 +
 Documentation/git-submodule.txt |  36 ++------
 Documentation/gitsubmodules.txt | 194 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 200 insertions(+), 31 deletions(-)
 create mode 100644 Documentation/gitsubmodules.txt

diff --git a/Documentation/Makefile b/Documentation/Makefile
index b43d66eae6..325c4735a7 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -31,6 +31,7 @@ MAN7_TXT += giteveryday.txt
 MAN7_TXT += gitglossary.txt
 MAN7_TXT += gitnamespaces.txt
 MAN7_TXT += gitrevisions.txt
+MAN7_TXT += gitsubmodules.txt
 MAN7_TXT += gittutorial-2.txt
 MAN7_TXT += gittutorial.txt
 MAN7_TXT += gitworkflows.txt
diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 4a4cede144..d38aa2d53a 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -24,37 +24,7 @@ DESCRIPTION
 -----------
 Inspects, updates and manages submodules.
 
-A submodule allows you to keep another Git repository in a subdirectory
-of your repository. The other repository has its own history, which does not
-interfere with the history of the current repository. This can be used to
-have external dependencies such as third party libraries for example.
-
-When cloning or pulling a repository containing submodules however,
-these will not be checked out by default; the 'init' and 'update'
-subcommands will maintain submodules checked out and at
-appropriate revision in your working tree.
-
-Submodules are composed from a so-called `gitlink` tree entry
-in the main repository that refers to a particular commit object
-within the inner repository that is completely separate.
-A record in the `.gitmodules` (see linkgit:gitmodules[5]) file at the
-root of the source tree assigns a logical name to the submodule and
-describes the default URL the submodule shall be cloned from.
-The logical name can be used for overriding this URL within your
-local repository configuration (see 'submodule init').
-
-Submodules are not to be confused with remotes, which are other
-repositories of the same project; submodules are meant for
-different projects you would like to make part of your source tree,
-while the history of the two projects still stays completely
-independent and you cannot modify the contents of the submodule
-from within the main project.
-If you want to merge the project histories and want to treat the
-aggregated whole as a single project from then on, you may want to
-add a remote for the other project and use the 'subtree' merge strategy,
-instead of treating the other project as a submodule. Directories
-that come from both projects can be cloned and checked out as a whole
-if you choose to go that route.
+For more information about submodules, see linkgit:gitsubmodules[7].
 
 COMMANDS
 --------
@@ -420,6 +390,10 @@ This file should be formatted in the same way as `$GIT_DIR/config`. The key
 to each submodule url is "submodule.$name.url".  See linkgit:gitmodules[5]
 for details.
 
+SEE ALSO
+--------
+linkgit:gitsubmodules[7], linkgit:gitmodules[5].
+
 GIT
 ---
 Part of the linkgit:git[1] suite
diff --git a/Documentation/gitsubmodules.txt b/Documentation/gitsubmodules.txt
new file mode 100644
index 0000000000..3369d55ae9
--- /dev/null
+++ b/Documentation/gitsubmodules.txt
@@ -0,0 +1,194 @@
+gitsubmodules(7)
+================
+
+NAME
+----
+gitsubmodules - information about submodules
+
+SYNOPSIS
+--------
+$GIT_DIR/config, .gitmodules
+
+------------------
+git submodule
+------------------
+
+DESCRIPTION
+-----------
+
+A submodule allows you to keep another Git repository in a subdirectory
+of your repository. The other repository has its own history, which does not
+interfere with the history of the current repository. This can be used to
+have external dependencies such as third party libraries for example.
+
+Submodules are composed from a so-called `gitlink` tree entry
+in the main repository that refers to a particular commit object
+within the inner repository that is completely separate.
+A record in the `.gitmodules` (see linkgit:gitmodules[5]) file at the
+root of the source tree assigns a logical name to the submodule and
+describes the default URL the submodule shall be cloned from.
+The logical name can be used for overriding this URL within your
+local repository configuration (see 'submodule init').
+
+Submodules are not to be confused with remotes, which are other
+repositories of the same project; submodules are meant for
+different projects you would like to make part of your source tree,
+while the history of the two projects still stays completely
+independent and you cannot modify the contents of the submodule
+from within the main project.
+If you want to merge the project histories and want to treat the
+aggregated whole as a single project from then on, you may want to
+add a remote for the other project and use the 'subtree' merge strategy,
+instead of treating the other project as a submodule. Directories
+that come from both projects can be cloned and checked out as a whole
+if you choose to go that route.
+
+When cloning or pulling a repository containing submodules however,
+the submodules will not be checked out by default; you need to instruct
+'clone' to recurse into submodules. The 'init' and 'update' subcommands
+of 'git submodule' will maintain submodules checked out and at an
+appropriate revision in your working tree.
+
+WHEN TO USE
+-----------
+
+Submodules, repositories inside other repositories,
+can be used for different use cases:
+
+* To have finer grained access control.
+  The design principles of Git do not allow for partial repositories to be
+  checked out or transferred. A repository is the smallest unit that a user
+  can be given access to. Submodules are separate repositories, such that
+  you can restrict access to parts of your project via the use of submodules.
+
+* To decouple Git histories.
+  Decoupling histories has different benefits.
+
+** When you want to use a (third party) library tied to a specific version.
+   Using submodules for a library allows you to have a clean history for
+   your own project and to update the library in the submodule only when needed.
+
+** In its current form Git scales up poorly for very large repositories that
+   change a lot, as the history grows very large. For that you may want to look
+   at shallow clone, sparse checkout or git-lfs.
+   However, you can also use submodules to e.g. hold large binary assets;
+   these repositories can then be cloned shallowly such that you do not
+   have a large history locally.
+
+STATES
+------
+
+When working with submodules, you can think of each one as a state machine.
+Each submodule can be in a different state; the following indicators are used:
+
+* the existence of the setting of 'submodule.<name>.url' in the
+  superproject's configuration
+* the existence of the submodule's working tree within the
+  working tree of the superproject
+* the existence of the submodule's git directory within the superproject's
+  git directory at $GIT_DIR/modules/<name> or within the submodule's working
+  tree
+
+      State           URL config    working tree    git dir
+      -----------------------------------------------------
+      uninitialized        no            no            no
+      initialized         yes            no            no
+      populated           yes           yes           yes
+      depopulated         yes            no           yes
+      deinitialized        no            no           yes
+      uninteresting        no           yes           yes
+
+      invalid              no           yes            no
+      invalid             yes           yes            no
+      -----------------------------------------------------
+
+The first six states can be reached by normal git usage; the latter two are
+shown only for completeness, to cover all eight possible states of the three
+binary indicators. The states in detail:
+
+uninitialized::
+The uninitialized state is the default state if neither
+'--recurse-submodules' nor '--recursive' is given when cloning. An empty
+directory will be put in the working tree as a placeholder, such that you
+are reminded of the existence of the submodule.
++
+To transition into the initialized state
+you can use 'git submodule init', which copies the presets from the
+.gitmodules file into the config.
+
+initialized::
+Users transitioned from the uninitialized state to this state via
+'git submodule init', which preset the URL configuration. As these URLs
+may not be desired in certain scenarios, this state allows you to change the
+URLs.  For example in a corporate environment you may want to run
+
+    sed -i s/example.org/$internal-mirror/ .git/config
++
+before proceeding to populate the submodules.
+
+populated::
+In the populated state you have the submodule fully available, i.e. both the
+git directory and the working tree exist. In this state you can work
+with the submodule, just like with any other repository.
+
+depopulated::
+In this state you still have the git directory around, but the working tree
+is gone.  For example when the superproject checks out a revision that doesn't
+have the submodule, the state may change to depopulated.
+
+deinitialized::
+The git directory is still there, but the user is no longer interested in the
+submodule as indicated by the missing URL configuration.
+
+invalid::
+When there is no git directory for a submodule, then there is something
+seriously wrong with the submodule.
+
+INNER WORKINGS
+--------------
+
+Generally a submodule can be considered its own autonomous repository
+that has a worktree and a git directory in separate places.
+
+The superproject only records the commit sha1 in its tree, such that
+any other information, e.g. where to obtain a copy from, is not recorded
+in the core data structures of Git. The porcelain layer of Git however
+makes use of the .gitmodules file that gives strong hints where and how
+to obtain a copy of the submodule's git repository.
+
+On the location of the git directory
+------------------------------------
+
+Since v1.7.7 of Git, the git directory of submodules is stored inside the
+superproject's git directory at $GIT_DIR/modules/<submodule-name>.
+This location allows for the working tree to be nonexistent while keeping
+the history around. So we can use git-rm on a submodule without losing
+information that may only be local.
+
+In the future we may see a git-checkout that can check out submodules, and
+revisions that do not contain the submodule can still be checked out without
+having to drop the submodule's git directory.
+
+It is also possible to imagine a future in which a bare repository still
+contains its submodules inside the modules subdirectory, such that you can
+get a full clone including submodules from that bare repository; the URLs
+as configured or given in the .gitmodules would only be used as a backup.
+
+SAMPLE WORKFLOWS (RFC/TODO)
+---------------------------
+
+Do we need
+
+* an opinionated way to check for a specific state of a submodule
+* (submodule helper to be plumbing?)
+* expose the design mistake of having the (name->path) mapping inside the
+  working tree, i.e. never remove a name from the submodule config even when
+  the submodule doesn't exist any more.
+
+SEE ALSO
+--------
+linkgit:git-submodule[1], linkgit:gitmodules[5].
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
2.12.0.rc0.1.g018cb5e6f4



* Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
From: Junio C Hamano @ 2017-02-09  5:09 UTC
  To: Jeff King; +Cc: Siddharth Kannan, git
In-Reply-To: <20170209034657.qbkzbbzuvjpxl422@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Mon, Feb 06, 2017 at 02:34:08PM -0800, Junio C Hamano wrote:
>
>> * sk/parse-remote-cleanup (2017-02-06) 1 commit
>>   (merged to 'next' on 2017-02-06 at 6ec89f72d5)
>>  + parse-remote: remove reference to unused op_prep
>> 
>>  Code clean-up.
>> 
>>  Will merge to 'master'.
>
> Hrm. Are the functions in git-parse-remote.sh part of the public API?
> That is, do we expect third-party scripts to do:
>
>   . "$(git --exec-path)/git-parse-remote.sh"
>   error_on_missing_default_upstream "$a" "$b" "$c" "$d"
>
> ? If so, then they may be surprised by the change in function signature.
>
> I generally think of git-sh-setup as the one that external scripts would
> use. There _is_ a manpage for git-parse-remote, but it doesn't list any
> functions. So maybe they're all fair game for changing?
>
> I just didn't see any discussion of this in the original patch thread,
> so I wanted to make sure we were making that decision consciously, and
> not accidentally. :)

Ummm, yes, I admit that this was accidental.  I didn't really think
of parse-remote as an externally visible and supported interface,
but users have tendency to break our expectations, so, I dunno.


* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
From: Junio C Hamano @ 2017-02-09  5:14 UTC
  To: Jeff King; +Cc: Stefan Hajnoczi, Brandon Williams, git
In-Reply-To: <20170209035839.wqsh6ibgnmxyjusi@sigill.intra.peff.net>

Jeff King <peff@peff.net> writes:

>   master:a:a:a:a:a:a:a:a:a:a:a
>
> I think there are 2^(n-1) possible paths (each colon can be a real colon
> or a slash). Though I guess if you walk the trees as you go, you only
> have to examine at most "n" paths to find the first-level tree, and then
> at most "n-1" paths at the second level, and so on.
>
> Unless you really do have ambiguous trees, in which case you have to
> walk down multiple paths.
>
> It certainly would not be the first combinatoric explosion you can
> convince Git to perform. But it does seem like a lot of complication for
> something as simple as path lookups.

That is true, and we may want to avoid the implementation complexity
of the backtracking name resolution.  If you are on the other hand
worried about the runtime cost, it will be an issue to begin with
only for those who do "git grep -e pattern HEAD:t/perf", which is an
unnatural way to do "git grep -e pattern HEAD -- t/perf", and the
output from the latter won't have such an issue, so...
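
For a feel of the 2^(n-1) figure quoted above, here is a small shell sketch
that enumerates every reading of a colon-joined path, taking each colon
either as a literal colon or as a slash. The function name `candidates` is
made up for this illustration. It does not walk any trees, so it shows the
worst case rather than what a tree-guided lookup would actually visit.

```shell
# Print every reading of a colon-joined path, one per line.
# Usage: candidates "" <path-with-colons>
# The first argument is the already-decided prefix (start with "").
# The body runs in a subshell so the recursion keeps its own variables.
candidates() (
	prefix=$1
	rest=$2
	case "$rest" in
	*:*)
		head=${rest%%:*}	# text before the first colon
		tail=${rest#*:}		# text after the first colon
		candidates "$prefix$head:" "$tail"	# colon is a real colon
		candidates "$prefix$head/" "$tail"	# colon is a path separator
		;;
	*)
		printf '%s\n' "$prefix$rest"
		;;
	esac
)
```

Running `candidates "" a:a:a | wc -l` counts the readings for three
components; each extra component doubles the count.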


* Re: [PATCH 1/2] refs.c: add resolve_ref_submodule()
From: Michael Haggerty @ 2017-02-09  5:20 UTC
  To: Nguyễn Thái Ngọc Duy, git; +Cc: Junio C Hamano
In-Reply-To: <20170208113144.8201-2-pclouds@gmail.com>

On 02/08/2017 12:31 PM, Nguyễn Thái Ngọc Duy wrote:
> This is basically the extended version of resolve_gitlink_ref() where we
> have access to more info from the underlying resolve_ref_recursively() call.
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  refs.c | 20 ++++++++++++++------
>  refs.h |  3 +++
>  2 files changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/refs.c b/refs.c
> index cd36b64ed9..02e35d83f3 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -1325,18 +1325,18 @@ const char *resolve_ref_unsafe(const char *refname, int resolve_flags,
>  				       resolve_flags, sha1, flags);
>  }
>  
> -int resolve_gitlink_ref(const char *submodule, const char *refname,
> -			unsigned char *sha1)
> +const char *resolve_ref_submodule(const char *submodule, const char *refname,
> +				  int resolve_flags, unsigned char *sha1,
> +				  int *flags)
>  {
>  	size_t len = strlen(submodule);
>  	struct ref_store *refs;
> -	int flags;
>  
>  	while (len && submodule[len - 1] == '/')
>  		len--;
>  
>  	if (!len)
> -		return -1;
> +		return NULL;
>  
>  	if (submodule[len]) {
>  		/* We need to strip off one or more trailing slashes */
> @@ -1349,9 +1349,17 @@ int resolve_gitlink_ref(const char *submodule, const char *refname,
>  	}
>  
>  	if (!refs)
> -		return -1;
> +		return NULL;
> +
> +	return resolve_ref_recursively(refs, refname, resolve_flags, sha1, flags);
> +}
> +
> +int resolve_gitlink_ref(const char *submodule, const char *refname,
> +			unsigned char *sha1)
> +{
> +	int flags;
>  
> -	if (!resolve_ref_recursively(refs, refname, 0, sha1, &flags) ||
> +	if (!resolve_ref_submodule(submodule, refname, 0, sha1, &flags) ||
>  	    is_null_sha1(sha1))
>  		return -1;
>  	return 0;
> diff --git a/refs.h b/refs.h
> index 9fbff90e79..74542468d8 100644
> --- a/refs.h
> +++ b/refs.h
> @@ -88,6 +88,9 @@ int peel_ref(const char *refname, unsigned char *sha1);
>   */
>  int resolve_gitlink_ref(const char *submodule, const char *refname,
>  			unsigned char *sha1);
> +const char *resolve_ref_submodule(const char *submodule, const char *refname,
> +				  int resolve_flags, unsigned char *sha1,
> +				  int *flags);

This function is the analog of resolve_ref_unsafe(); i.e., it returns a
pointer to either a static buffer or a pointer into the refname
argument. Therefore, I think it should have "unsafe" in its name. And/or
maybe there should be a safe version of the function analogous to
resolve_refdup().

Moreover, this function has inherited the code for stripping trailing
slashes from the submodule name. I have the feeling that this is a wart,
not a feature, and that it would be sad to see it spread. How about
moving the slash-stripping code to resolve_gitlink_ref() and making
resolve_ref_submodule() assume that its submodule name is already clean?

It would be nice to have a docstring here.

I also have some higher-level concerns about the approach of this patch
series, which I'll write about in a comment to patch 2/2.

Michael



* Re: [PATCH v2 2/2] grep: use '/' delimiter for paths
From: Jeff King @ 2017-02-09  5:20 UTC
  To: Junio C Hamano; +Cc: Stefan Hajnoczi, Brandon Williams, git
In-Reply-To: <xmqqtw84rlna.fsf@gitster.mtv.corp.google.com>

On Wed, Feb 08, 2017 at 09:14:17PM -0800, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> >   master:a:a:a:a:a:a:a:a:a:a:a
> >
> > I think there are 2^(n-1) possible paths (each colon can be a real colon
> > or a slash). Though I guess if you walk the trees as you go, you only
> > have to examine at most "n" paths to find the first-level tree, and then
> > at most "n-1" paths at the second level, and so on.
> >
> > Unless you really do have ambiguous trees, in which case you have to
> > walk down multiple paths.
> >
> > It certainly would not be the first combinatoric explosion you can
> > convince Git to perform. But it does seem like a lot of complication for
> > something as simple as path lookups.
> 
> That is true, and we may want to avoid the implementation complexity
> of the backtracking name resolution.  If you are on the other hand
> worried about the runtime cost, it will be an issue to begin with
> only for those who do "git grep -e pattern HEAD:t/perf", which is an
> unnatural way to do "git grep -e pattern HEAD -- t/perf", and the
> output from the latter won't have such an issue, so...

I thought your point was to move it into the get_sha1() parser (so that
while the form is only generated by "git grep", it can be accepted by
any git command). That exposes it in a lot of places, including ones
which are network accessible to things like gitweb (or GitHub, of
course, which is my concern).

Even without the runtime cost, though, I think the general complexity
makes it an ugly path to go down (e.g., handling ambiguous cases). I
wouldn't want to have to write the documentation for it. :)

(I _do_ think Stefan's proposed direction is worth it simply because the
result is easier to read, but I agree the whole thing can be avoided by
using pathspecs, as you've noted).

-Peff


* Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
From: Michael Haggerty @ 2017-02-09  6:07 UTC
  To: Nguyễn Thái Ngọc Duy, git; +Cc: Junio C Hamano
In-Reply-To: <20170208113144.8201-3-pclouds@gmail.com>

On 02/08/2017 12:31 PM, Nguyễn Thái Ngọc Duy wrote:
> The patch itself is relatively simple: manual parsing code is replaced
> with a call to resolve_ref_submodule(). The manual parsing code must die
> because only refs/files-backend.c should do that. Why submodule here is
> a more interesting question.
> 
> From an outside look, any .git/worktrees/foo is seen as a "normal"
> repository. You can set GIT_DIR to it and have access to everything,
> even shared things that are not literally inside that directory, like
> object db or shared refs.
> 
> On top of that, linked worktrees point to those directories with ".git"
> files. These two make a linked worktree's path "X" a "submodule" (*) (**)
> because X/.git is a file that points to a repository somewhere.
> 
> As such, we can just treat a linked worktree's path as a submodule. We just
> need to make sure they are unique because they are used to look up the
> submodule ref store.
> 
> Main worktree is a bit trickier. If we stand at a linked worktree, we
> may still need to peek into main worktree's HEAD, for example. We can
> treat main worktree's path as submodule as well since git_path_submodule()
> can tolerate ".git" dirs, in addition to ".git" files.
> 
> The constraint is, if main worktree is X, then the git repo must be at
> X/.git. If the user separates .git repo far away and tells git to point
> to it via GIT_DIR or something else, then the "main worktree as submodule"
> trick fails. Within multiple worktree context, I think we can limit
> support to "standard" layout, at least for now.
> 
> (*) The differences in sharing object database and refs between
> submodules and linked worktrees don't really matter in this context.
> 
> (**) At this point, we may want to rename refs *_submodule API to
> something more neutral, maybe s/_submodule/_remote/

It is unquestionably a good goal to avoid parsing references outside of
`refs/files-backend.c`. But I'm not a fan of this approach.

There are two meanings of the concept of a "ref store", and I think this
change muddles them:

1. The references that happen to be *physically* stored in a particular
   location, for example the `refs/bisect/*` references in a worktree.

2. The references that *logically* should be considered part of a
   particular repository. This might require stitching together
   references from multiple sources, for example `HEAD` and
   `refs/bisect` from a worktree's own directory with other
   references from the main repository.

Either of these concepts can be implemented via the `ref_store` abstraction.

The `ref_store` for a submodule should represent the references
logically visible from the submodule. The main program shouldn't care
whether the references are stored in a single physical location or
spread across multiple locations (for example, if the submodule were
itself a linked worktree).

The `ref_store` that you want here for a worktree is not the worktree's
*logical* `ref_store`. You want the worktree's *physical* `ref_store`.
Mixing logical and physical reference stores together is a bad idea
(even if we were willing to ignore the fact that worktrees are not
submodules in the accepted sense of the word).

The point of my `submodule-hash` branch [1] was to separate these
concepts better by breaking the current 1:1 connection between
`ref_store`s and submodules. This would allow `ref_store`s to be created
for other purposes, such as to represent worktree refs. If you want the
*logical* `ref_store` for a submodule, you access it through the
`submodule_ref_stores` table. If you want the *physical* `ref_store` for
a worktree, you should access it through a different table.

I think the best solution would be to expose the concept of `ref_store`
in the public refs API. Then users of submodules would essentially do

    struct ref_store *refs = get_submodule_refs(submodule_path);
    ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ...
    ... for_each_ref(refs, fn, cb_data) ...

whereas for a worktree you'd have to look up the `ref_store` instance
somewhere else (or maybe keep it as part of some worktree structure, if
there is one) but you would use it via the same API.
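
To make the physical/logical distinction above a bit more concrete, here is
a toy model in shell. It is only a sketch: `ref_home` and `logical_ref_path`
are hypothetical names, and the rule that only `HEAD` and `refs/bisect/*`
live in the worktree's own directory is taken from the example given
earlier, not meant as the complete list.

```shell
# Decide which physical store a ref belongs to in this toy model.
ref_home() {
	case "$1" in
	HEAD|refs/bisect/*) echo worktree ;;
	*)                  echo shared ;;
	esac
}

# Stitch the two physical stores into one logical view: print the file
# that would back <refname> for the given worktree.
# Usage: logical_ref_path <worktree-git-dir> <common-git-dir> <refname>
logical_ref_path() {
	if [ "$(ref_home "$3")" = worktree ]
	then
		printf '%s/%s\n' "$1" "$3"
	else
		printf '%s/%s\n' "$2" "$3"
	fi
}
```

The physical view is just one of the two directories on its own; the logical
view is the routing done by `logical_ref_path`. For comparison with what Git
itself does, `git rev-parse --git-path <refname>` can be run inside a linked
worktree.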

Michael

[1] https://github.com/mhagger/git

> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
>  branch.c   |  3 +-
>  worktree.c | 99 +++++++++++++++-----------------------------------------------
>  worktree.h |  2 +-
>  3 files changed, 27 insertions(+), 77 deletions(-)
> 
> diff --git a/branch.c b/branch.c
> index b955d4f316..db5843718f 100644
> --- a/branch.c
> +++ b/branch.c
> @@ -354,7 +354,8 @@ int replace_each_worktree_head_symref(const char *oldref, const char *newref)
>  	for (i = 0; worktrees[i]; i++) {
>  		if (worktrees[i]->is_detached)
>  			continue;
> -		if (strcmp(oldref, worktrees[i]->head_ref))
> +		if (worktrees[i]->head_ref &&
> +		    strcmp(oldref, worktrees[i]->head_ref))
>  			continue;
>  
>  		if (set_worktree_head_symref(get_worktree_git_dir(worktrees[i]),
> diff --git a/worktree.c b/worktree.c
> index d633761575..25e5bc9a3e 100644
> --- a/worktree.c
> +++ b/worktree.c
> @@ -19,54 +19,24 @@ void free_worktrees(struct worktree **worktrees)
>  	free (worktrees);
>  }
>  
> -/*
> - * read 'path_to_ref' into 'ref'.  Also if is_detached is not NULL,
> - * set is_detached to 1 (0) if the ref is detached (is not detached).
> - *
> - * $GIT_COMMON_DIR/$symref (e.g. HEAD) is practically outside $GIT_DIR so
> - * for linked worktrees, `resolve_ref_unsafe()` won't work (it uses
> - * git_path). Parse the ref ourselves.
> - *
> - * return -1 if the ref is not a proper ref, 0 otherwise (success)
> - */
> -static int parse_ref(char *path_to_ref, struct strbuf *ref, int *is_detached)
> -{
> -	if (is_detached)
> -		*is_detached = 0;
> -	if (!strbuf_readlink(ref, path_to_ref, 0)) {
> -		/* HEAD is symbolic link */
> -		if (!starts_with(ref->buf, "refs/") ||
> -				check_refname_format(ref->buf, 0))
> -			return -1;
> -	} else if (strbuf_read_file(ref, path_to_ref, 0) >= 0) {
> -		/* textual symref or detached */
> -		if (!starts_with(ref->buf, "ref:")) {
> -			if (is_detached)
> -				*is_detached = 1;
> -		} else {
> -			strbuf_remove(ref, 0, strlen("ref:"));
> -			strbuf_trim(ref);
> -			if (check_refname_format(ref->buf, 0))
> -				return -1;
> -		}
> -	} else
> -		return -1;
> -	return 0;
> -}
> -
>  /**
> - * Add the head_sha1 and head_ref (if not detached) to the given worktree
> + * Update head_sha1, head_ref and is_detached of the given worktree
>   */
> -static void add_head_info(struct strbuf *head_ref, struct worktree *worktree)
> +static void add_head_info(struct worktree *wt)
>  {
> -	if (head_ref->len) {
> -		if (worktree->is_detached) {
> -			get_sha1_hex(head_ref->buf, worktree->head_sha1);
> -		} else {
> -			resolve_ref_unsafe(head_ref->buf, 0, worktree->head_sha1, NULL);
> -			worktree->head_ref = strbuf_detach(head_ref, NULL);
> -		}
> -	}
> +	int flags;
> +	const char *target;
> +
> +	target = resolve_ref_submodule(wt->path, "HEAD",
> +				       RESOLVE_REF_READING,
> +				       wt->head_sha1, &flags);
> +	if (!target)
> +		return;
> +
> +	if (flags & REF_ISSYMREF)
> +		wt->head_ref = xstrdup(target);
> +	else
> +		wt->is_detached = 1;
>  }
>  
>  /**
> @@ -77,9 +47,7 @@ static struct worktree *get_main_worktree(void)
>  	struct worktree *worktree = NULL;
>  	struct strbuf path = STRBUF_INIT;
>  	struct strbuf worktree_path = STRBUF_INIT;
> -	struct strbuf head_ref = STRBUF_INIT;
>  	int is_bare = 0;
> -	int is_detached = 0;
>  
>  	strbuf_add_absolute_path(&worktree_path, get_git_common_dir());
>  	is_bare = !strbuf_strip_suffix(&worktree_path, "/.git");
> @@ -91,13 +59,10 @@ static struct worktree *get_main_worktree(void)
>  	worktree = xcalloc(1, sizeof(*worktree));
>  	worktree->path = strbuf_detach(&worktree_path, NULL);
>  	worktree->is_bare = is_bare;
> -	worktree->is_detached = is_detached;
> -	if (!parse_ref(path.buf, &head_ref, &is_detached))
> -		add_head_info(&head_ref, worktree);
> +	add_head_info(worktree);
>  
>  	strbuf_release(&path);
>  	strbuf_release(&worktree_path);
> -	strbuf_release(&head_ref);
>  	return worktree;
>  }
>  
> @@ -106,8 +71,6 @@ static struct worktree *get_linked_worktree(const char *id)
>  	struct worktree *worktree = NULL;
>  	struct strbuf path = STRBUF_INIT;
>  	struct strbuf worktree_path = STRBUF_INIT;
> -	struct strbuf head_ref = STRBUF_INIT;
> -	int is_detached = 0;
>  
>  	if (!id)
>  		die("Missing linked worktree name");
> @@ -127,19 +90,14 @@ static struct worktree *get_linked_worktree(const char *id)
>  	strbuf_reset(&path);
>  	strbuf_addf(&path, "%s/worktrees/%s/HEAD", get_git_common_dir(), id);
>  
> -	if (parse_ref(path.buf, &head_ref, &is_detached) < 0)
> -		goto done;
> -
>  	worktree = xcalloc(1, sizeof(*worktree));
>  	worktree->path = strbuf_detach(&worktree_path, NULL);
>  	worktree->id = xstrdup(id);
> -	worktree->is_detached = is_detached;
> -	add_head_info(&head_ref, worktree);
> +	add_head_info(worktree);
>  
>  done:
>  	strbuf_release(&path);
>  	strbuf_release(&worktree_path);
> -	strbuf_release(&head_ref);
>  	return worktree;
>  }
>  
> @@ -334,8 +292,6 @@ const struct worktree *find_shared_symref(const char *symref,
>  					  const char *target)
>  {
>  	const struct worktree *existing = NULL;
> -	struct strbuf path = STRBUF_INIT;
> -	struct strbuf sb = STRBUF_INIT;
>  	static struct worktree **worktrees;
>  	int i = 0;
>  
> @@ -345,6 +301,10 @@ const struct worktree *find_shared_symref(const char *symref,
>  
>  	for (i = 0; worktrees[i]; i++) {
>  		struct worktree *wt = worktrees[i];
> +		const char *symref_target;
> +		unsigned char sha1[20];
> +		int flags;
> +
>  		if (wt->is_bare)
>  			continue;
>  
> @@ -359,25 +319,14 @@ const struct worktree *find_shared_symref(const char *symref,
>  			}
>  		}
>  
> -		strbuf_reset(&path);
> -		strbuf_reset(&sb);
> -		strbuf_addf(&path, "%s/%s",
> -			    get_worktree_git_dir(wt),
> -			    symref);
> -
> -		if (parse_ref(path.buf, &sb, NULL)) {
> -			continue;
> -		}
> -
> -		if (!strcmp(sb.buf, target)) {
> +		symref_target = resolve_ref_submodule(wt->path, symref, 0,
> +						      sha1, &flags);
> +		if ((flags & REF_ISSYMREF) && !strcmp(symref_target, target)) {
>  			existing = wt;
>  			break;
>  		}
>  	}
>  
> -	strbuf_release(&path);
> -	strbuf_release(&sb);
> -
>  	return existing;
>  }
>  
> diff --git a/worktree.h b/worktree.h
> index 6bfb985203..5ea5e503fb 100644
> --- a/worktree.h
> +++ b/worktree.h
> @@ -4,7 +4,7 @@
>  struct worktree {
>  	char *path;
>  	char *id;
> -	char *head_ref;
> +	char *head_ref;		/* NULL if HEAD is broken or detached */
>  	char *lock_reason;	/* internal use */
>  	unsigned char head_sha1[20];
>  	int is_detached;
> 



* Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
From: Junio C Hamano @ 2017-02-09  6:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <37fe2024-0378-a974-a28d-18a89d3e2312@alum.mit.edu>

Michael Haggerty <mhagger@alum.mit.edu> writes:

> There are two meanings of the concept of a "ref store", and I think this
> change muddles them:
>
> 1. The references that happen to be *physically* stored in a particular
>    location, for example the `refs/bisect/*` references in a worktree.
>
> 2. The references that *logically* should be considered part of a
>    particular repository. This might require stitching together
>    references from multiple sources, for example `HEAD` and
>    `refs/bisect` from a worktree's own directory with other
>    references from the main repository.
>
> Either of these concepts can be implemented via the `ref_store` abstraction.
> ...
> The `ref_store` that you want here for a worktree is not the worktree's
> *logical* `ref_store`. You want the worktree's *physical* `ref_store`.
> Mixing logical and physical reference stores together is a bad idea
> (even if we were willing to ignore the fact that worktrees are not
> submodules in the accepted sense of the word).

I am not quite sure what mental model you are suggesting as a
preferred solution.  We can

 - represent a set of refs stored for a particular worktree
   (i.e. HEAD, refs/bisect, and refs/worktree/<name>, iirc), as a
   bunch of ref_stores;

 - represent a set of refs shared across a set of worktrees that
   share the primary one, as another ref_store;

 - a caller who wants to get a "logical" view of a single worktree
   user can pick one of the first kind and union that with the
   second one, and represent the result as a (synthetic) ref_store.

The third one is "stitching together from multiple sources".  By
"mixing logical and physical is a bad idea", do you mean that the
same abstraction "ref_store" should not be used for the first two
(which are physical) and the third one (which is logical)?  Do you
want to call the first two "physical_ref_store"and the last one
"ref_store" and keep them distinct?

For the purpose of anchoring objects in the object store shared by
multiple worktrees, we can either iterate over all the ref_stores
of the first two kinds, or iterate over all the ref_stores of the
third kind for all worktrees.  The latter of course is less
efficient as the enumeration

	for worktree in all worktrees:
		for ref in get_ref_store(worktree):
			mark tip of ref reachable

will work on all the shared refs multiple times, but as an
abstraction that may be simpler.  The alternative of working at the
physical level would be more efficient

	for worktree in all worktrees:
		for ref in get_ref_store_specific_to_worktree(worktree):
			mark tip of ref reachable
	for ref in get_ref_store_shared_across_worktrees():
		mark tip of ref reachable

but this consumer now _knows_ how the logical ref_store of a
worktree is constructed (i.e. by combining the two ref_stores),
which appears as a layering violation.

I am however not sure if these issues are what you are driving at,
and what exact design you are suggesting.

^ permalink raw reply

* Re: "disabling bitmap writing, as some objects are not being packed"?
From: David Turner @ 2017-02-08 19:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Duy Nguyen, Git Mailing List, Jeff King
In-Reply-To: <xmqqtw84wpag.fsf@gitster.mtv.corp.google.com>

On Wed, 2017-02-08 at 09:44 -0800, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
> 
> > On second thought, perhaps gc.autoDetach should default to false if
> > there's no tty, since its main point is to stop breaking interactive
> > usage. That would make the server side happy (no tty there).
> 
> Sounds like an idea, but wouldn't that keep the end-user coming over
> the network waiting after accepting a push until the GC completes, I
> wonder.  If an impatient user disconnects, would that end up killing
> an ongoing GC?  etc.

Regardless, it's impolite to keep the user waiting. So, I think we
should just not write the "too many unreachable loose objects" message
if auto-gc is on.  Does that sound OK?

^ permalink raw reply

* Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
From: Michael Haggerty @ 2017-02-09  8:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nguyễn Thái Ngọc Duy, git
In-Reply-To: <xmqqpoirsvin.fsf@gitster.mtv.corp.google.com>

On 02/09/2017 07:55 AM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
> 
>> There are two meanings of the concept of a "ref store", and I think this
>> change muddles them:
>>
>> 1. The references that happen to be *physically* stored in a particular
>>    location, for example the `refs/bisect/*` references in a worktree.
>>
>> 2. The references that *logically* should be considered part of a
>>    particular repository. This might require stitching together
>>    references from multiple sources, for example `HEAD` and
>>    `refs/bisect` from a worktree's own directory with other
>>    references from the main repository.
>>
>> Either of these concepts can be implemented via the `ref_store` abstraction.
>> ...
>> The `ref_store` that you want here for a worktree is not the worktree's
>> *logical* `ref_store`. You want the worktree's *physical* `ref_store`.
>> Mixing logical and physical reference stores together is a bad idea
>> (even if we were willing to ignore the fact that worktrees are not
>> submodules in the accepted sense of the word).
> 
> I am not quite sure what mental model you are suggesting as a
> preferred solution.  We can
> 
>  - represent a set of refs stored for a particular worktree
>    (i.e. HEAD, refs/bisect, and refs/worktree/<name>, iirc), as
>    bunch of ref_stores;
> 
>  - represent a set of refs shared across a set of worktrees that
>    share the primary one, as another ref_store;
> 
>  - a caller who wants to get a "logical" view of a single worktree
>    user can pick one of the first kind and union that with the
>    second one, and represent the result as a (synthetic) ref_store.
> 
> The third one is "stitching together from multiple sources".  By
> "mixing logical and physical is a bad idea", do you mean that the
> same abstraction "ref_store" should not be used for the first two
> (which are physical) and the third one (which is logical)?  Do you
> want to call the first two "physical_ref_store" and the last one
> "ref_store" and keep them distinct?

The existing `ref_store` abstraction, I think, is capable of
representing either kind of reference store. The stitching together to
get the "logical" view of a worktree should probably happen within the
refs code rather than forcing callers to deal with it. But yes, I think
that code should put together a compound `ref_store` object that
delegates to multiple underlying `ref_store` objects as you've described.
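
As a rough illustration of such a compound object, here is a minimal toy
sketch. It is not Git's actual refs code: the `resolve` callback,
`struct ref_entry`, `struct compound_data`, and the rule in
`is_per_worktree_ref()` are all assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model: a ref_store is a lookup callback plus private data. */
struct ref_store {
	const char *(*resolve)(const struct ref_store *store,
			       const char *refname);
	const void *data;
};

/* One entry of a "physical" store; the table ends with a NULL name. */
struct ref_entry {
	const char *name;
	const char *value;
};

/* Physical store: look the name up in a single table. */
static const char *table_resolve(const struct ref_store *store,
				 const char *refname)
{
	const struct ref_entry *e;

	for (e = store->data; e->name; e++)
		if (!strcmp(e->name, refname))
			return e->value;
	return NULL;
}

/* Private data of the compound ("logical") store. */
struct compound_data {
	const struct ref_store *per_worktree;
	const struct ref_store *shared;
};

/* Assumed rule for this sketch: HEAD and refs/bisect/ are per-worktree. */
static int is_per_worktree_ref(const char *refname)
{
	return !strcmp(refname, "HEAD") ||
	       !strncmp(refname, "refs/bisect/", 12);
}

/* Logical store: delegate each lookup to the right physical store. */
static const char *compound_resolve(const struct ref_store *store,
				    const char *refname)
{
	const struct compound_data *c = store->data;
	const struct ref_store *target =
		is_per_worktree_ref(refname) ? c->per_worktree : c->shared;

	assert(target);
	return target->resolve(target, refname);
}
```

A caller holding the compound store only ever calls `resolve` on it and
never learns that two physical stores sit behind it.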

Which kind of `ref_store *` you have in your hand would depend on where
you got it. If you call the hypothetical `get_submodule_refs()`
function, you would get a `ref_store *` representing the references that
are logically visible from that submodule. There might be a separate
`get_worktree_specific_refs()` that returns a `ref_store *` representing
the worktree-specific references physically stored for the worktree. But
maybe the latter is not even necessary; see below.

> For the purpose of anchoring objects in the object store shared by
> multiple worktrees, we can either iterate over all the ref_stores
> of the first two kind, or iterate over all the ref_stores of the
> third kind for all worktrees.  The latter of course is less
> efficient as the enumeration
> 
> 	for worktree in all worktrees:
> 		for ref in get_ref_store(worktree)
> 			mark tip of ref reachable
> 
> will work on all the shared refs multiple times, but as an
> abstraction that may be simpler.  The alternative of working at the
> physical level would be more efficient
> 
> 	for worktree in all worktrees:
> 		for ref in get_ref_store_specific_to_worktree(worktree):
> 	        	mark tip of ref reachable
> 	for ref in get_ref_store_shared_across_worktrees():
>         	mark tip of ref reachable
> 
> but this consumer now _knows_ how the logical ref_store of a
> worktree is constructed (i.e. by combining the two ref_stores),
> which appears as a layering violation.
> 
> I am however not sure if these issues are what you are driving at,
> and what exact design you are suggesting.

Reachability is a special case, because it needs all of the references
that refer to a particular object store, even though the reference names
might overlap. I personally think that reachability roots should be
requested via a new refs API call separate from `for_each_rawref()` (or
whatever is used now). Internally it would be implemented much like your
second "efficient" algorithm, but the implementation would be within the
refs code, and the caller could remain ignorant of those details.
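
A minimal sketch of what such a call could look like, again as a toy
model and not Git's real API: `for_each_reachability_root()`,
`struct worktree_refs`, and the `count_root()` callback are hypothetical
names used only for illustration.

```c
#include <stddef.h>

/* Callback invoked once per reachability root. */
typedef void (*root_fn)(const char *refname, const char *oid,
			void *cb_data);

/* One ref; tables of these end with a NULL name. */
struct ref_entry {
	const char *name;
	const char *oid;
};

/* The refs physically stored for one worktree. */
struct worktree_refs {
	const char *id;
	const struct ref_entry *refs;
};

/*
 * Visit every per-worktree ref of every worktree, then every shared
 * ref exactly once. The caller never sees how the sets are combined.
 */
static void for_each_reachability_root(const struct worktree_refs *worktrees,
				       size_t nr,
				       const struct ref_entry *shared,
				       root_fn fn, void *cb_data)
{
	const struct ref_entry *e;
	size_t i;

	for (i = 0; i < nr; i++)
		for (e = worktrees[i].refs; e->name; e++)
			fn(e->name, e->oid, cb_data);
	for (e = shared; e->name; e++)
		fn(e->name, e->oid, cb_data);
}

/* Example callback: count the roots that were reported. */
static void count_root(const char *refname, const char *oid, void *cb_data)
{
	(void)refname;
	(void)oid;
	++*(size_t *)cb_data;
}
```

The point of the shape is that the shared refs are walked once no matter
how many worktrees exist, while the caller only supplies a callback.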

Externally, it might not even want to pass the caller the real reference
names (I assume that callers mainly only use the reference names for
diagnostic messages). For example, it might want to report references
`HEAD` and `refs/bisect/bad` in worktree `foo` under the pseudonyms
`worktree/foo/HEAD` and `worktree/foo/refs/bisect/bad`, so that they can
be distinguished from any homonyms in the main repo and in other
worktrees. If you ask for the reachability roots while in a worktree, it
would either automatically crawl up to the main repository and across to
sibling worktrees to get the full set of reachability roots, or maybe it
would refuse to run at all (if we want to require such commands to be
executed from the main repo).
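
The pseudonym scheme itself is simple to sketch. The helper below is
hypothetical (the name `worktree_ref_pseudonym()` and its return
convention are assumptions for this example, not an existing function):

```c
#include <stdio.h>

/*
 * Write "worktree/<id>/<refname>" into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 */
static int worktree_ref_pseudonym(char *out, size_t outlen,
				  const char *worktree_id,
				  const char *refname)
{
	int n = snprintf(out, outlen, "worktree/%s/%s",
			 worktree_id, refname);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

For worktree `foo` and ref `refs/bisect/bad` this produces
`worktree/foo/refs/bisect/bad`, as in the example above.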

I don't know exactly who would be the consumers of the reachability
roots, so maybe there are some problems with this suggestion.

Michael

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Christian Couder @ 2017-02-09  9:42 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer,
	Siddharth Kannan
In-Reply-To: <vpq37fowx5q.fsf@anie.imag.fr>

On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> Jeff King <peff@peff.net> writes:
>
>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>>
>>> * We need to write the application, i.e. essentially polish and update
>>>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>>>   update the list of project ideas and microprojects :
>>>   https://git.github.io/SoC-2017-Ideas/
>>>   https://git.github.io/SoC-2016-Microprojects/
>>
>> That can be done incrementally by people who care (especially mentors)
>> over the next week or so, and doesn't require any real admin
>> coordination. If it happens and the result looks good, then the
>> application process is pretty straightforward.
>>
>> If it doesn't, then we probably ought not to participate in GSoC.
>
> OK, it seems the last message did not raise a lot of enthusiasm (unless
> I missed some off-list discussion at Git-Merge?).

I think having 2 possible mentors or co-mentors still shows some
enthusiasm even if I agree it's unfortunate there is not more
enthusiasm.

> The application deadline is tomorrow. I think it's time to admit that we
> won't participate this year, unless someone steps in really soon.

Someone steps in to do what exactly?

I just had a look and the microproject and idea pages for this year are ok.
They are not great, sure, but not much worse than in previous years.
What should probably be done is to remove project ideas where there is no
"possible mentor" listed.
But I am reluctant to do that as I don't know what Dscho would be ok to mentor.

Also please note that you sent this email just the day before the deadline.
I know that you sent a previous email three weeks ago, but people
easily forget this kind of deadline when they are not often reminded.
(And there is a school vacation in France right now so I am having a
vacation in the Alps with unfortunately quite bad Internet access.)

> If we don't participate, I'll add a disclaimer at the top of the
> SoC-related pages on git.github.io to make sure students don't waste
> time preparing an application.

Please submit our application like this.

Thanks,
Christian.

^ permalink raw reply

* Re: [PATCH] rev-parse --git-path: fix output when running in a subdirectory
From: Duy Nguyen @ 2017-02-09  9:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List, Junio C Hamano
In-Reply-To: <50fe3ea3302c40f4c96eaa5a568837e3334f9dc4.1486555851.git.johannes.schindelin@gmx.de>

On Wed, Feb 8, 2017 at 7:17 PM, Johannes Schindelin
<johannes.schindelin@gmx.de> wrote:
> In addition to making git_path() aware of certain file names that need
> to be handled differently e.g. when running in worktrees, the commit
> 557bd833bb (git_path(): be aware of file relocation in $GIT_DIR,
> 2014-11-30) also snuck in a new option for `git rev-parse`:
> `--git-path`.
>
> On the face of it, there is no obvious bug in that commit's diff: it
> faithfully calls git_path() on the argument and prints it out, i.e. `git
> rev-parse --git-path <filename>` has the same precise behavior as
> calling `git_path("<filename>")` in C.
>
> The problem lies deeper, much deeper. In hindsight (which is always
> unfair), implementing the .git/ directory discovery in
> `setup_git_directory()` by changing the working directory may have
> allowed us to avoid passing around a struct that contains information
> about the current repository, but it bought us many, many problems.

Relevant thread in the past [1] which fixes both --git-path and
--git-common-dir. I think the author dropped it somehow (or forgot
about it, I know I did). Sorry can't comment on that thread, or this
patch, yet.

[1] http://public-inbox.org/git/1464261556-89722-1-git-send-email-rappazzo@gmail.com/
-- 
Duy

^ permalink raw reply

* Re: Automatically Add .gitignore Files
From: Duy Nguyen @ 2017-02-09 10:03 UTC (permalink / raw)
  To: Thangalin; +Cc: Git Mailing List
In-Reply-To: <CAANrE7rmUZcJkw+thMczv3D=7sqcUHBsorzvEZgYg=6AEfrU=w@mail.gmail.com>

On Thu, Feb 9, 2017 at 2:05 AM, Thangalin <thangalin@gmail.com> wrote:
> I frequently forget to add .gitignore files when creating new .gitignore files.
>
> I'd like to request a command-line option to always add .gitignore
> files (or, more generally, always add files that match a given file
> specification).
>
> Replicate
>
> 0. git init ...
> 1. echo "*.bak" >> .gitignore
> 2. touch file.txt
> 3. git add file.txt
> 4. git commit -a -m "..."
> 5. git push origin master
>
> Expected Results
>
> The .gitignore file is also added to the repository. (This is probably
> the 80% use case.)

This is a general problem with new files, not .gitignore alone. Can we
accomplish something with some hook? At the least I think we should be
able to detect that .gitignore is not added and abort, prompting
the user to add it. It's easier to customize too, and we don't have to
hard-code ".gitignore" in the code.

I'm not sure if we tell the hook "this is with -m option" though..
-- 
Duy

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Matthieu Moy @ 2017-02-09 10:15 UTC (permalink / raw)
  To: Christian Couder
  Cc: Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer,
	Siddharth Kannan
In-Reply-To: <CAP8UFD3aygSf5U2abnpCfRzEf-hH5fSNuzFBBtgCjSQC3F8c5A@mail.gmail.com>

Christian Couder <christian.couder@gmail.com> writes:

> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
> <Matthieu.Moy@grenoble-inp.fr> wrote:
>> Jeff King <peff@peff.net> writes:
>>
>>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>>>
>>>> * We need to write the application, i.e. essentially polish and update
>>>>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>>>>   update the list of project ideas and microprojects :
>>>>   https://git.github.io/SoC-2017-Ideas/
>>>>   https://git.github.io/SoC-2016-Microprojects/
>>>
>>> That can be done incrementally by people who care (especially mentors)
>>> over the next week or so, and doesn't require any real admin
>>> coordination. If it happens and the result looks good, then the
>>> application process is pretty straightforward.
>>>
>>> If it doesn't, then we probably ought not to participate in GSoC.
>>
>> OK, it seems the last message did not raise a lot of enthusiasm (unless
>> I missed some off-list discussion at Git-Merge?).
>
> I think having 2 possible mentors or co-mentors still shows some
> enthusiasm even if I agree it's unfortunate there is not more
> enthusiasm.

A non-quoted but yet important part of my initial email was:

| So, as much as possible, I'd like to avoid being an org admin this
| year. It's not a lot of work (much, much less than being a mentor!),
| but if I manage to get some time to work for Git, I'd rather do that
| on coding and reviewing this year.

and for now, no one stepped in to admin.

Other non-negligible sources of work are reviewing microprojects and
applications. Having a few more messages in this thread would have been
a good hint that we had volunteers to do that.

> Someone steps in to do what exactly?

First we need an admin. Then as you said a bit of janitoring work on
the web pages.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Siddharth Kannan @ 2017-02-09 10:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Christian Couder, Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer
In-Reply-To: <vpqzihvpt41.fsf@anie.imag.fr>

On 9 February 2017 at 15:45, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
>
> A non-quoted but yet important part of my initial email was:
>
> | So, as much as possible, I'd like to avoid being an org admin this
> | year. It's not a lot of work (much, much less than being a mentor!),
> | but if I manage to get some time to work for Git, I'd rather do that
> | on coding and reviewing this year.
>
> and for now, no one stepped in to admin.

I would like to point everyone to this reply from Jeff King on the
original post: [1]
(In case this was lost in the midst of other emails) It sounds like
Jeff King is okay
with taking up the "admin" role.

    I do not mind doing the administrative stuff.  But the real work is in
    polishing up the ideas list and microprojects page. So I think the first
    step, if people are interested in GSoC, is not just to say "yes, let's
    do it", but to actually flesh out these pages:

>
>> Someone steps in to do what exactly?
>
> First we need an admin. Then as you said a bit of janitoring work on
> the web pages.


[1]: https://public-inbox.org/git/20170125204504.ebw2sa4uokfwwfnt@sigill.intra.peff.net/

-- 

Best Regards,

- Siddharth.

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Christian Couder @ 2017-02-09 10:45 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer,
	Siddharth Kannan
In-Reply-To: <vpqzihvpt41.fsf@anie.imag.fr>

On Thu, Feb 9, 2017 at 11:15 AM, Matthieu Moy
<Matthieu.Moy@grenoble-inp.fr> wrote:
> Christian Couder <christian.couder@gmail.com> writes:
>
>> On Wed, Feb 8, 2017 at 3:54 PM, Matthieu Moy
>> <Matthieu.Moy@grenoble-inp.fr> wrote:
>>> Jeff King <peff@peff.net> writes:
>>>
>>>> On Mon, Jan 23, 2017 at 04:02:02PM +0100, Matthieu Moy wrote:
>>>>
>>>>> * We need to write the application, i.e. essentially polish and update
>>>>>   the text here: https://git.github.io/SoC-2016-Org-Application/ and
>>>>>   update the list of project ideas and microprojects :
>>>>>   https://git.github.io/SoC-2017-Ideas/
>>>>>   https://git.github.io/SoC-2016-Microprojects/
>>>>
>>>> That can be done incrementally by people who care (especially mentors)
>>>> over the next week or so, and doesn't require any real admin
>>>> coordination. If it happens and the result looks good, then the
>>>> application process is pretty straightforward.
>>>>
>>>> If it doesn't, then we probably ought not to participate in GSoC.
>>>
>>> OK, it seems the last message did not raise a lot of enthusiasm (unless
>>> I missed some off-list discussion at Git-Merge?).
>>
>> I think having 2 possible mentors or co-mentors still shows some
>> enthusiasm even if I agree it's unfortunate there is not more
>> enthusiasm.
>
> A non-quoted but yet important part of my initial email was:
>
> | So, as much as possible, I'd like to avoid being an org admin this
> | year. It's not a lot of work (much, much less than being a mentor!),
> | but if I manage to get some time to work for Git, I'd rather do that
> | on coding and reviewing this year.
>
> and for now, no one stepped in to admin.

Well Peff wrote in reply to your email:

> I did co-admin last year and the year before, but I made Matthieu do all
> the work. :)
>
> I do not mind doing the administrative stuff. But the real work is in
> polishing up the ideas list and microprojects page.

So I thought Peff would be ok to be the admin (do "the administrative stuff").

> Other non-negligible sources of work are reviewing microprojects and
> applications. Having a few more messages in this thread would have been
> a good hint that we had volunteers to do that.

I don't think emails in this thread are what really count.
I worked on the Idea page starting some months ago, and as I wrote I
reviewed it again and found it not too bad.

>> Someone steps in to do what exactly?
>
> First we need an admin. Then as you said a bit of janitoring work on
> the web pages.

About the janitoring part, as I previously said I am reluctant to do
that as I don't know what Dscho would be ok to mentor.
And I also think it's not absolutely necessary to do it before
applying as an org.

If you just want Peff or someone else to apply, then please just say
it and hopefully Peff will do it and be the admin.

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Christian Couder @ 2017-02-09 10:49 UTC (permalink / raw)
  To: Siddharth Kannan
  Cc: Matthieu Moy, Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer
In-Reply-To: <CAN-3QhotVm-LmOJ4cuKCa2txYxFJMHY1aqbX1GznieQx57AR+A@mail.gmail.com>

On Thu, Feb 9, 2017 at 11:28 AM, Siddharth Kannan
<kannan.siddharth12@gmail.com> wrote:
> On 9 February 2017 at 15:45, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
>>
>> A non-quoted but yet important part of my initial email was:
>>
>> | So, as much as possible, I'd like to avoid being an org admin this
>> | year. It's not a lot of work (much, much less than being a mentor!),
>> | but if I manage to get some time to work for Git, I'd rather do that
>> | on coding and reviewing this year.
>>
>> and for now, no one stepped in to admin.
>
> I would like to point everyone to this reply from Jeff King on the
> original post: [1]
> (In case this was lost in the midst of other emails) It sounds like
> Jeff King is okay
> with taking up the "admin" role.
>
>     I do not mind doing the administrative stuff.  But the real work is in
>     polishing up the ideas list and microprojects page. So I think the first
>     step, if people are interested in GSoC, is not just to say "yes, let's
>     do it", but to actually flesh out these pages:

Yeah it was also my impression based on the above that Peff would be
ok to take up the admin role.

Now if he doesn't want for some reason to take it, then I am ok with
us not applying, but again it would have been better to be clearer
about that before the eve of the deadline.

^ permalink raw reply

* Re: [PATCH 2/2] worktree.c: use submodule interface to access refs from another worktree
From: Duy Nguyen @ 2017-02-09 11:59 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Git Mailing List, Junio C Hamano, Stefan Beller
In-Reply-To: <37fe2024-0378-a974-a28d-18a89d3e2312@alum.mit.edu>

On Thu, Feb 9, 2017 at 1:07 PM, Michael Haggerty <mhagger@alum.mit.edu> wrote:
> It is unquestionably a good goal to avoid parsing references outside of
> `refs/files-backend.c`. But I'm not a fan of this approach.

Yes. But in this context it was more of a guinea pig. I wanted
something simple enough to code up so we can see what the approach
looked like. Good thing I did it.

>
> There are two meanings of the concept of a "ref store", and I think this
> change muddles them:
>
> 1. The references that happen to be *physically* stored in a particular
>    location, for example the `refs/bisect/*` references in a worktree.
>
> 2. The references that *logically* should be considered part of a
>    particular repository. This might require stitching together
>    references from multiple sources, for example `HEAD` and
>    `refs/bisect` from a worktree's own directory with other
>    references from the main repository.
>
> Either of these concepts can be implemented via the `ref_store` abstraction.
>
> The `ref_store` for a submodule should represent the references
> logically visible from the submodule. The main program shouldn't care
> whether the references are stored in a single physical location or
> spread across multiple locations (for example, if the submodule were
> itself a linked worktree).
>
> The `ref_store` that you want here for a worktree is not the worktree's
> *logical* `ref_store`. You want the worktree's *physical* `ref_store`.

Yep.

> Mixing logical and physical reference stores together is a bad idea
> (even if we were willing to ignore the fact that worktrees are not
> submodules in the accepted sense of the word).
>
> ...
>
> I think the best solution would be to expose the concept of `ref_store`
> in the public refs API. Then users of submodules would essentially do
>
>     struct ref_store *refs = get_submodule_refs(submodule_path);
>     ... resolve_ref_recursively(refs, refname, 0, sha1, &flags) ...
>     ... for_each_ref(refs, fn, cb_data) ...
>
> whereas for a worktree you'd have to look up the `ref_store` instance
> somewhere else (or maybe keep it as part of some worktree structure, if
> there is one) but you would use it via the same API.

Oh I was going to reply to Stefan about his comment to my (**)
footnote. Something along these lines:

"Ideally we would introduce a new set of api, maybe with refs_ prefix,
that takes a refs_store. Then submodule people can get a ref store
somewhere and pass to it. Worktree people get maybe some other refs
store for it. The "old" api like for_each_ref() is a thin wrapper
around it, just like read_cache() vs read_index(&the_index). If the
*_submodule does not see much use, we might as well kill it and use
the generic refs_*".
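
A toy sketch of that layering, under stated assumptions: the simplified
`each_ref_fn` signature, the array-backed `struct ref_store`, and
`the_main_store` are invented for this example and do not match the real
declarations in refs.h.

```c
#include <stddef.h>

/* Simplified callback; a non-zero return stops the iteration. */
typedef int (*each_ref_fn)(const char *refname, void *cb_data);

/* Toy store: just a NULL-terminated list of ref names. */
struct ref_store {
	const char *const *refnames;
};

/* Generic API: works on whichever store the caller hands in. */
static int refs_for_each_ref(const struct ref_store *refs,
			     each_ref_fn fn, void *cb_data)
{
	const char *const *p;
	int ret;

	for (p = refs->refnames; *p; p++) {
		ret = fn(*p, cb_data);
		if (ret)
			return ret;
	}
	return 0;
}

/* Stand-in for the main repository's store. */
static const char *const main_refnames[] = {
	"refs/heads/master", "refs/tags/v1.0", NULL
};
static const struct ref_store the_main_store = { main_refnames };

/* "Old" API: a thin wrapper that supplies the main store. */
static int for_each_ref(each_ref_fn fn, void *cb_data)
{
	return refs_for_each_ref(&the_main_store, fn, cb_data);
}

/* Example callback: count the refs that were visited. */
static int count_ref(const char *refname, void *cb_data)
{
	(void)refname;
	++*(int *)cb_data;
	return 0;
}
```

Submodule and worktree code would then obtain their own `ref_store` from
wherever is appropriate and pass it to the `refs_` variant, while
existing callers of the wrapper stay unchanged.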

If I haven't misunderstood anything else, then I think we're on the same page.

Now I need to see if I can get there in a reasonable time frame (so I
can fix my "gc in worktree" problem properly) or I would need
something temporary but not so hacky. I'll try to make this new api
and see how it works out. If you think I should not do it right away,
for whatever reason, stop me now.
-- 
Duy

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Matthieu Moy @ 2017-02-09 12:11 UTC (permalink / raw)
  To: Christian Couder
  Cc: Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer,
	Siddharth Kannan
In-Reply-To: <CAP8UFD1V=WD-EHkBkAVET9ztvsHZr_S5GVBWrQ6F1e0LwJoksQ@mail.gmail.com>

Christian Couder <christian.couder@gmail.com> writes:

> Well Peff wrote in reply to your email:
>
>> I did co-admin last year and the year before, but I made Matthieu do all
>> the work. :)
>>
>> I do not mind doing the administrative stuff. But the real work is in
>> polishing up the ideas list and microprojects page.
>
> So I thought Peff would be ok to be the admin (do "the administrative
> stuff").

There are several things the admins need to do:

1) "administrative stuff" about money with Conservancy (aka SFC). As I
   understand it, really not much to do since Google and Conservancy
   work directly with each other for most stuff.

2) Filling-in the application, i.e. essentially copy-paste from the
   website.

3) Then, make sure things that must happen do happen (reviewing
   applications, start online or offline discussions when needed, ...).

Last year Peff did 1) and I did most of 2+3). My understanding of Peff's
reply was "OK to continue doing 1)".

I think you (Christian) could do 2+3). It's much, much less work than
being a mentor. Honestly I felt like I did nothing and then Peff said I
did all the work :o). I can help, but as I said I'm really short in time
budget and I'd like to spend it more on coding+reviewing.

> I don't think emails in this thread are what really count.
> I worked on the Idea page starting some months ago, and as I wrote I
> reviewed it again and found it not too bad.

OK, so giving up now seems unfair to you indeed.

I created a Git organization and invited you + Peff as admins. I'll
start cut-and-pasting to show my good faith ;-).

> About the janitoring part, as I previously said I am reluctant to do
> that as I don't know what Dscho would be ok to mentor.
> And I also think it's not absolutely necessary to do it before
> applying as an org.

Right.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Matthieu Moy @ 2017-02-09 12:22 UTC (permalink / raw)
  To: Christian Couder
  Cc: Jeff King, git, Pranit Bauva, Lars Schneider,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer,
	Siddharth Kannan
In-Reply-To: <vpqlgtfmun0.fsf@anie.imag.fr>

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> I created a Git organization and invited you + Peff as admins. I'll
> start cut-and-pasting to show my good faith ;-).

I created this page based on last year's:

https://git.github.io/SoC-2017-Org-Application/

I filled-in the "org profile". "Org application" is still TODO.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply

* Re: GSoC 2017: application open, deadline = February 9, 2017
From: Pranit Bauva @ 2017-02-09 12:18 UTC (permalink / raw)
  To: Jeff King
  Cc: Matthieu Moy, git, Lars Schneider, Christian Couder,
	Carlos Martín Nieto, Johannes Schindelin, Thomas Gummerer
In-Reply-To: <20170125204504.ebw2sa4uokfwwfnt@sigill.intra.peff.net>

Hey everyone,

On Thu, Jan 26, 2017 at 2:15 AM, Jeff King <peff@peff.net> wrote:
> I do not mind doing the administrative stuff.  But the real work is in
> polishing up the ideas list and microprojects page. So I think the first
> step, if people are interested in GSoC, is not just to say "yes, let's
> do it", but to actually flesh out these pages:

I will help with adding more ideas to the microprojects list. But
since I am not quite familiar with the whole code base, I will need
some help with verifying whether those are in scope or not. I
am not sure whether I would be able to help with actual project ideas
but I will try. I will do it within a week or so.

Regards,
Pranit Bauva

^ permalink raw reply

* Re: What's cooking in git.git (Feb 2017, #02; Mon, 6)
From: Johannes Schindelin @ 2017-02-09 12:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqzihzymn3.fsf@gitster.mtv.corp.google.com>

Hi Junio,

On Mon, 6 Feb 2017, Junio C Hamano wrote:

> * sf/putty-w-args (2017-02-01) 5 commits
>  - SQUASH???
>  - connect: Add the envvar GIT_SSH_VARIANT and ssh.variant config
>  - git_connect(): factor out SSH variant handling
>  - connect: rename tortoiseplink and putty variables
>  - connect: handle putty/plink also in GIT_SSH_COMMAND
> 
>  The command line options for ssh invocation need to be tweaked for
>  some implementations of SSH (e.g. PuTTY plink wants "-P <port>"
>  while OpenSSH wants "-p <port>" to specify port to connect to), and
>  the variant was guessed when GIT_SSH environment variable is used
>  to specify it.  Extend the guess to the command specified by the
>  newer GIT_SSH_COMMAND and also core.sshcommand configuration
>  variable, and give an escape hatch for users to deal with
>  misdetected cases.
> 
>  Stalled?
>  cf. <alpine.DEB.2.20.1702012319460.3496@virtualbox>

The latest messages in that thread are

- your claim that you never said correctness is pushed to a back seat (when
  an earlier, detailed mail listed four priorities of your patch review,
  none of which is said correctness, so I did not bother to answer), and

- my answer that suggested to take a break because the conversation turned
  less rational: I had to point out that your objection was not really
  valid in this case.

I now see that you added a SQUASH commit (that was news to me, thank you
very much), and that you seem to still insist that the code should prepare
for possible future changes in the config settings that may actually never
materialize. (And that would have to be handled at a different point, as I
had pointed out, so that suggested preparation would most likely not help
at all.)

In short: unless I read any convincing argument in favor of said SQUASH
commit, I will remain convinced that v3, as submitted, is actually the
best way forward.

Thank you for your attention,
Johannes
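
The port-option difference described in the quoted topic summary above
(PuTTY plink wants "-P <port>", OpenSSH wants "-p <port>") can be
illustrated with a minimal sketch. The enum and function names below are
hypothetical and are not the identifiers used in the sf/putty-w-args
series; only the two option spellings come from the summary.

```c
/* Hypothetical variant tags; not git's actual identifiers. */
enum ssh_variant_sketch {
	SKETCH_OPENSSH,
	SKETCH_PLINK
};

/* Return the option that introduces the port number for this variant. */
const char *port_option_sketch(enum ssh_variant_sketch variant)
{
	/* plink spells it "-P"; OpenSSH spells it "-p". */
	return variant == SKETCH_PLINK ? "-P" : "-p";
}
```

A caller would append the returned option followed by the port number to
the ssh command line once the variant has been guessed or configured.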


* [PATCH 0/5] Store submodules in a hash, not a linked list
From: Michael Haggerty @ 2017-02-09 13:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller,
	Johannes Schindelin, David Turner, git, Michael Haggerty

I have mentioned this patch series on the mailing list a couple of
times [1,2] but haven't submitted it before. I just rebased it to
current master. It is available from my Git fork [3] as branch
"submodule-hash".

The first point of this patch series is to optimize submodule
`ref_store` lookup by storing the `ref_store`s in a hashmap rather
than a linked list. But a more interesting second point is to weaken
the 1:1 relationship between submodules and `ref_stores` a little bit
more.

A `files_ref_store` would be perfectly happy to represent, say, the
references *physically* stored in a linked worktree (e.g., `HEAD`,
`refs/bisect/*`, etc) even though that is not the complete collection
of refs that are *logically* visible from that worktree (which
includes references from the main repository, too). But the old code
was confusing the two things by storing "submodule" in every
`ref_store` instance.

So push the submodule attribute down to the `files_ref_store` class
(but continue to let the `ref_store`s be looked up by submodule).

The last commit is relatively orthogonal to the others; it simplifies
read_loose_refs() by calling resolve_ref_recursively() directly using
the `ref_store` instance that it already has in hand, rather than
indirectly via the public wrappers.

Michael

[1] http://public-inbox.org/git/341999fc-4496-b974-c117-c18a2fca1358@alum.mit.edu/
[2] http://public-inbox.org/git/37fe2024-0378-a974-a28d-18a89d3e2312@alum.mit.edu/
[3] https://github.com/mhagger/git

Michael Haggerty (5):
  refs: store submodule ref stores in a hashmap
  refs: push the submodule attribute down
  register_ref_store(): new function
  files_ref_store::submodule: use NULL for the main repository
  read_loose_refs(): read refs using resolve_ref_recursively()

 refs.c               | 93 ++++++++++++++++++++++++++++++++++------------------
 refs/files-backend.c | 77 +++++++++++++++++++++++++------------------
 refs/refs-internal.h | 37 ++++++++-------------
 3 files changed, 122 insertions(+), 85 deletions(-)

-- 
2.9.3
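
The first point of the cover letter, looking stores up by name in a hash
table instead of walking a linked list, can be sketched in standalone C.
This is only an illustration under stated assumptions: it does not use
git's hashmap API, every sketch_* name is hypothetical, and the hash
function is plain FNV-1a, which may differ from git's strhash().

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_BUCKETS 64

struct sketch_entry {
	struct sketch_entry *next; /* collision chain within one bucket */
	void *value;               /* stands in for a ref_store pointer */
	char *name;                /* owned copy of the lookup key */
};

struct sketch_map {
	struct sketch_entry *buckets[SKETCH_BUCKETS];
};

/* FNV-1a string hash; an assumption, not necessarily what git uses. */
static unsigned int sketch_hash(const char *s)
{
	unsigned int h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Return the stored value, or NULL if name is not present. */
void *sketch_map_get(const struct sketch_map *map, const char *name)
{
	const struct sketch_entry *e =
		map->buckets[sketch_hash(name) % SKETCH_BUCKETS];

	/* Only entries that landed in the same bucket are compared. */
	for (; e; e = e->next)
		if (!strcmp(e->name, name))
			return e->value;
	return NULL;
}

/* Insert name -> value. Returns 0 on success, -1 if present or on OOM. */
int sketch_map_add(struct sketch_map *map, const char *name, void *value)
{
	unsigned int i = sketch_hash(name) % SKETCH_BUCKETS;
	struct sketch_entry *e;
	size_t len = strlen(name) + 1;

	if (sketch_map_get(map, name))
		return -1;
	e = malloc(sizeof(*e));
	if (!e)
		return -1;
	e->name = malloc(len);
	if (!e->name) {
		free(e);
		return -1;
	}
	memcpy(e->name, name, len);
	e->value = value;
	e->next = map->buckets[i];
	map->buckets[i] = e;
	return 0;
}
```

With a linked list every lookup compares against every registered name;
here a lookup only walks the chain of one bucket, which is the scaling
benefit the series is after. The map must be zero-initialized before use.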


* Re: Fwd: Possibly nicer pathspec syntax?
From: Duy Nguyen @ 2017-02-09 13:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Git Mailing List
In-Reply-To: <xmqqy3xgwpiq.fsf@gitster.mtv.corp.google.com>

On Thu, Feb 9, 2017 at 12:39 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Wed, Feb 8, 2017 at 12:12 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>> Two-patch series to follow.
>>
>> glossary-content.txt update for both patches would be nice.
>
> I am no longer worried about it as I saw somebody actually sent
> follow-up patches on this, but I want to pick your brain on one
> thing that is related to this codepath.
>
> We have PATHSPEC_PREFER_CWD and PATHSPEC_PREFER_FULL bits in flags,
> added at fc12261fea ("parse_pathspec: add PATHSPEC_PREFER_{CWD,FULL}
> flags", 2013-07-14), and I think the intent is some commands when
> given no pathspec work on all paths in the current subdirectory
> while others work on the full tree, regardless of where you are.
> "grep" is in the former camp, "log" is in the latter.  And there is
> a check to catch a bug in a caller that sets both.
>
> I am wondering about this hunk (this is from the original commit
> that added it):
>
>         if (!entry) {
>                 static const char *raw[2];
>
> +               if (flags & PATHSPEC_PREFER_FULL)
> +                       return;
> +
> +               if (!(flags & PATHSPEC_PREFER_CWD))
> +                       die("BUG: PATHSPEC_PREFER_CWD requires arguments");
> +
>                 pathspec->items = item = xmalloc(sizeof(*item));
>                 memset(item, 0, sizeof(*item));
>                 item->match = prefix;
>                 ... returns a single entry pathspec to cover cwd ...
>
> The BUG message is given when
>
>  - The command got no pathspec from the caller; and
>  - PATHSPEC_PREFER_FULL is not set; and
>  - PATHSPEC_PREFER_CWD is NOT set.
>
> but the message says that the caller must have args when it sets
> prefer-cwd.  Is this a simple typo?  If so what should it say?
>
>         die("BUG: one of PATHSPEC_PREFER_FULL or _CWD must be set");

Without reading through your next mail, I'd say "BUG: this command
requires arguments".

> Does this third possibility (i.e. a caller is allowed to pass
> "flags" that does not prefer either) exist to support a command
> where the caller MUST have at least one pathspec?  If that were the
> case, this wouldn't be a BUG but an end-user error, e.g.
>
>         die("at least one pathspec element is required");

Or this. Yes. I might have just been defensive back then and kept the
third option open.

> If you know offhand which callers pass neither of the two
> PATHSPEC_PREFER_* bits and remember for what purpose you allowed
> them to do so, please remind me.  I'll keep digging in the meantime.

I don't usually remember what I ate yesterday and this commit was from
2013 :D But I'll see if your findings spark anything in my brain.
-- 
Duy
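
The three cases discussed in this message for a command that receives no
pathspec can be written out as a small decision function. This is a
sketch only: the flag values and all names are hypothetical stand-ins
for the PATHSPEC_PREFER_* bits, and the separate check that rejects
callers setting both bits is not modeled.

```c
/* Hypothetical flag values; the real ones are defined by git. */
#define SKETCH_PREFER_CWD  (1u << 0)
#define SKETCH_PREFER_FULL (1u << 1)

enum empty_pathspec_result {
	EMPTY_MATCH_FULL_TREE, /* leave the pathspec empty (e.g. "log") */
	EMPTY_MATCH_CWD,       /* synthesize one entry covering cwd (e.g. "grep") */
	EMPTY_NEEDS_ARGUMENT   /* neither bit set: the third possibility */
};

/* Decide what an empty pathspec means, in the same order as the quoted hunk. */
enum empty_pathspec_result classify_empty_pathspec(unsigned int flags)
{
	if (flags & SKETCH_PREFER_FULL)
		return EMPTY_MATCH_FULL_TREE;
	if (!(flags & SKETCH_PREFER_CWD))
		return EMPTY_NEEDS_ARGUMENT;
	return EMPTY_MATCH_CWD;
}
```

Whether EMPTY_NEEDS_ARGUMENT should be reported as a BUG or as an
end-user error is exactly the open question in the thread; the sketch
only shows when that branch is reached.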


* [PATCH 3/5] register_ref_store(): new function
From: Michael Haggerty @ 2017-02-09 13:27 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller,
	Johannes Schindelin, David Turner, git, Michael Haggerty
In-Reply-To: <cover.1486629195.git.mhagger@alum.mit.edu>

Move the responsibility for registering the ref_store for a submodule
from base_ref_store_init() to a new function, register_ref_store(). Call
the latter from ref_store_init().

This means that base_ref_store_init() can lose its submodule argument,
further weakening the 1:1 relationship between ref_stores and
submodules.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c               | 19 +++++++++++++------
 refs/files-backend.c |  2 +-
 refs/refs-internal.h | 15 ++++++++++-----
 3 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/refs.c b/refs.c
index 723b4be..6012f67 100644
--- a/refs.c
+++ b/refs.c
@@ -1395,11 +1395,8 @@ static struct ref_store *main_ref_store;
 /* A hashmap of ref_stores, stored by submodule name: */
 static struct hashmap submodule_ref_stores;
 
-void base_ref_store_init(struct ref_store *refs,
-			 const struct ref_storage_be *be,
-			 const char *submodule)
+void register_ref_store(struct ref_store *refs, const char *submodule)
 {
-	refs->be = be;
 	if (!submodule) {
 		if (main_ref_store)
 			die("BUG: main_ref_store initialized twice");
@@ -1416,18 +1413,28 @@ void base_ref_store_init(struct ref_store *refs,
 	}
 }
 
+void base_ref_store_init(struct ref_store *refs,
+			 const struct ref_storage_be *be)
+{
+	refs->be = be;
+}
+
 struct ref_store *ref_store_init(const char *submodule)
 {
 	const char *be_name = "files";
 	struct ref_storage_be *be = find_ref_storage_backend(be_name);
+	struct ref_store *refs;
 
 	if (!be)
 		die("BUG: reference backend %s is unknown", be_name);
 
 	if (!submodule || !*submodule)
-		return be->init(NULL);
+		refs = be->init(NULL);
 	else
-		return be->init(submodule);
+		refs = be->init(submodule);
+
+	register_ref_store(refs, submodule);
+	return refs;
 }
 
 struct ref_store *lookup_ref_store(const char *submodule)
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 6ed7e13..794b88c 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -980,7 +980,7 @@ static struct ref_store *files_ref_store_create(const char *submodule)
 	struct files_ref_store *refs = xcalloc(1, sizeof(*refs));
 	struct ref_store *ref_store = (struct ref_store *)refs;
 
-	base_ref_store_init(ref_store, &refs_be_files, submodule);
+	base_ref_store_init(ref_store, &refs_be_files);
 
 	refs->submodule = submodule ? xstrdup(submodule) : "";
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 97f275b..73281f5 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -481,7 +481,7 @@ struct ref_store;
  * Initialize the ref_store for the specified submodule, or for the
  * main repository if submodule == NULL. These functions should call
  * base_ref_store_init() to initialize the shared part of the
- * ref_store and to record the ref_store for later lookup.
+ * ref_store.
  */
 typedef struct ref_store *ref_store_init_fn(const char *submodule);
 
@@ -630,12 +630,17 @@ struct ref_store {
 };
 
 /*
- * Fill in the generic part of refs for the specified submodule and
- * add it to our collection of reference stores.
+ * Register the specified ref_store to be the one that should be used
+ * for submodule (or the main repository if submodule is NULL). It is
+ * a fatal error to call this function twice for the same submodule.
+ */
+void register_ref_store(struct ref_store *refs, const char *submodule);
+
+/*
+ * Fill in the generic part of refs.
  */
 void base_ref_store_init(struct ref_store *refs,
-			 const struct ref_storage_be *be,
-			 const char *submodule);
+			 const struct ref_storage_be *be);
 
 /*
  * Create, record, and return a ref_store instance for the specified
-- 
2.9.3
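
The split made by this patch, where initializing a store no longer has
the side effect of registering it, can be sketched in isolation. All
sketch_* names are hypothetical, the registry holds only a single "main"
slot, and an error return stands in for the die("BUG: ...") call in the
patch.

```c
#include <stddef.h>

struct sketch_backend {
	const char *name;
};

struct sketch_store {
	const struct sketch_backend *be;
};

/* Hypothetical single-slot registry standing in for main_ref_store. */
static struct sketch_store *sketch_main_store;

/* Step 1: fill in the generic part only; no global state is touched. */
void sketch_store_init(struct sketch_store *store,
		       const struct sketch_backend *be)
{
	store->be = be;
}

/* Step 2: publish the store. Returns -1 if one is already registered. */
int sketch_register_main(struct sketch_store *store)
{
	if (sketch_main_store)
		return -1; /* the patch treats this as a fatal BUG */
	sketch_main_store = store;
	return 0;
}

/* Look up whatever was registered, or NULL if nothing was. */
struct sketch_store *sketch_lookup_main(void)
{
	return sketch_main_store;
}
```

Because initialization is now free of side effects, a store can be
created without being the one that lookups return, which is the
loosening of the 1:1 relationship that the commit message describes.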



* [PATCH 1/5] refs: store submodule ref stores in a hashmap
From: Michael Haggerty @ 2017-02-09 13:26 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nguyễn Thái Ngọc Duy, Stefan Beller,
	Johannes Schindelin, David Turner, git, Michael Haggerty
In-Reply-To: <cover.1486629195.git.mhagger@alum.mit.edu>

Aside from scaling better, this means that the submodule name needn't be
stored in the ref_store instance anymore (which will be changed in a
moment). This, in turn, will help loosen the strict 1:1 relationship
between ref_stores and submodules.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
---
 refs.c               | 61 ++++++++++++++++++++++++++++++++++++++++------------
 refs/refs-internal.h |  6 ------
 2 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/refs.c b/refs.c
index cd36b64..50d192c 100644
--- a/refs.c
+++ b/refs.c
@@ -3,6 +3,7 @@
  */
 
 #include "cache.h"
+#include "hashmap.h"
 #include "lockfile.h"
 #include "refs.h"
 #include "refs/refs-internal.h"
@@ -1357,11 +1358,42 @@ int resolve_gitlink_ref(const char *submodule, const char *refname,
 	return 0;
 }
 
+struct submodule_hash_entry
+{
+	struct hashmap_entry ent; /* must be the first member! */
+
+	struct ref_store *refs;
+
+	/* NUL-terminated name of submodule: */
+	char submodule[FLEX_ARRAY];
+};
+
+static int submodule_hash_cmp(const void *entry, const void *entry_or_key,
+			      const void *keydata)
+{
+	const struct submodule_hash_entry *e1 = entry, *e2 = entry_or_key;
+	const char *submodule = keydata;
+
+	return strcmp(e1->submodule, submodule ? submodule : e2->submodule);
+}
+
+static struct submodule_hash_entry *alloc_submodule_hash_entry(
+		const char *submodule, struct ref_store *refs)
+{
+	size_t len = strlen(submodule);
+	struct submodule_hash_entry *entry = xmalloc(sizeof(*entry) + len + 1);
+
+	hashmap_entry_init(entry, strhash(submodule));
+	entry->refs = refs;
+	memcpy(entry->submodule, submodule, len + 1);
+	return entry;
+}
+
 /* A pointer to the ref_store for the main repository: */
 static struct ref_store *main_ref_store;
 
-/* A linked list of ref_stores for submodules: */
-static struct ref_store *submodule_ref_stores;
+/* A hashmap of ref_stores, stored by submodule name: */
+static struct hashmap submodule_ref_stores;
 
 void base_ref_store_init(struct ref_store *refs,
 			 const struct ref_storage_be *be,
@@ -1373,16 +1405,17 @@ void base_ref_store_init(struct ref_store *refs,
 			die("BUG: main_ref_store initialized twice");
 
 		refs->submodule = "";
-		refs->next = NULL;
 		main_ref_store = refs;
 	} else {
-		if (lookup_ref_store(submodule))
+		refs->submodule = xstrdup(submodule);
+
+		if (!submodule_ref_stores.tablesize)
+			hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);
+
+		if (hashmap_put(&submodule_ref_stores,
+				alloc_submodule_hash_entry(submodule, refs)))
 			die("BUG: ref_store for submodule '%s' initialized twice",
 			    submodule);
-
-		refs->submodule = xstrdup(submodule);
-		refs->next = submodule_ref_stores;
-		submodule_ref_stores = refs;
 	}
 }
 
@@ -1402,17 +1435,17 @@ struct ref_store *ref_store_init(const char *submodule)
 
 struct ref_store *lookup_ref_store(const char *submodule)
 {
-	struct ref_store *refs;
+	struct submodule_hash_entry *entry;
 
 	if (!submodule || !*submodule)
 		return main_ref_store;
 
-	for (refs = submodule_ref_stores; refs; refs = refs->next) {
-		if (!strcmp(submodule, refs->submodule))
-			return refs;
-	}
+	if (!submodule_ref_stores.tablesize)
+		hashmap_init(&submodule_ref_stores, submodule_hash_cmp, 20);
 
-	return NULL;
+	entry = hashmap_get_from_hash(&submodule_ref_stores,
+				      strhash(submodule), submodule);
+	return entry ? entry->refs : NULL;
 }
 
 struct ref_store *get_ref_store(const char *submodule)
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 25444cf..4ed5f89 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -634,12 +634,6 @@ struct ref_store {
 	 * reference store:
 	 */
 	const char *submodule;
-
-	/*
-	 * Submodule reference store instances are stored in a linked
-	 * list using this pointer.
-	 */
-	struct ref_store *next;
 };
 
 /*
-- 
2.9.3
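
Two techniques in this patch may be easier to see outside the diff:
storing the name inline via a flexible array member, and a comparison
function that accepts either a second entry or a bare key string. The
sketch below is standalone C with hypothetical sketch_* names; it uses
the C99 "char name[]" spelling rather than git's FLEX_ARRAY macro and
reports allocation failure to the caller instead of dying.

```c
#include <stdlib.h>
#include <string.h>

struct sketch_named_entry {
	void *refs;  /* payload; opaque in this sketch */
	char name[]; /* C99 flexible array member, NUL-terminated */
};

/* Allocate header and name in one block. Returns NULL on failure. */
struct sketch_named_entry *sketch_entry_new(const char *name, void *refs)
{
	size_t len = strlen(name);
	struct sketch_named_entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->refs = refs;
	memcpy(e->name, name, len + 1); /* copies the trailing NUL too */
	return e;
}

/*
 * Compare e1 against a bare key when key is non-NULL, otherwise against
 * the name stored in e2. Returns 0 on a match, like strcmp().
 */
int sketch_entry_cmp(const struct sketch_named_entry *e1,
		     const struct sketch_named_entry *e2,
		     const char *key)
{
	return strcmp(e1->name, key ? key : e2->name);
}
```

Passing the key separately lets a lookup avoid building a temporary
entry just to search for a name; a single free() releases both the
header and the inline name.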

