git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC 0/2] push checks for unpushed remotes in submodules
@ 2011-06-26 18:29 Fredrik Gustafsson
  2011-06-26 18:29 ` [RFC 1/2] test whether " Fredrik Gustafsson
  2011-06-26 18:29 ` [RFC 2/2] Don't push a repository with unpushed submodules Fredrik Gustafsson
  0 siblings, 2 replies; 14+ messages in thread
From: Fredrik Gustafsson @ 2011-06-26 18:29 UTC (permalink / raw)
  To: git; +Cc: gitster, iveqy, hvoigt, jens.lehmann

When working with submodules it is easy to end up in a state when submodule
commits required by the super-project only is present locally. This is most
often a human error (although technical errors such as connection failure
can be a reason).

This patch-series tries to prevent pushing a super-project if not all (by
the super-project used) submodules are pushed first. This will prevent the
human error of forgetting to push submodules before pushing the
super-project.

This is a RFC-series, please consider:
* Is this going into the right direction?
* Should we use a new flag similar to REF_STATUS_REJECT_NONFASTFORWARD?

Future improvements could be:
* Make git status aware of unpushed submodules.
* Show a list of unpushed submodules when a push is denied.

Fredrik Gustafsson (1):
  Don't push a repository with unpushed submodules

Heiko Voigt (1):
  test whether push checks for unpushed remotes in submodules

 submodule.c                    |   79 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                    |    1 +
 t/t5531-deep-submodule-push.sh |   68 ++++++++++++++++++++++++++++++++++
 transport.c                    |    8 ++++
 4 files changed, 156 insertions(+), 0 deletions(-)

-- 
1.7.6.rc3.2.g0185a

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC 1/2] test whether push checks for unpushed remotes in submodules
  2011-06-26 18:29 [RFC 0/2] push checks for unpushed remotes in submodules Fredrik Gustafsson
@ 2011-06-26 18:29 ` Fredrik Gustafsson
  2011-06-26 18:29 ` [RFC 2/2] Don't push a repository with unpushed submodules Fredrik Gustafsson
  1 sibling, 0 replies; 14+ messages in thread
From: Fredrik Gustafsson @ 2011-06-26 18:29 UTC (permalink / raw)
  To: git; +Cc: gitster, iveqy, hvoigt, jens.lehmann

From: Heiko Voigt <hvoigt@hvoigt.net>

These tests are used to extend push to check whether all recorded
submodule commits have also been pushed.

Signed-off-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 t/t5531-deep-submodule-push.sh |   46 +++++++++++++++++++++++++++++++++++++++-
 1 files changed, 45 insertions(+), 1 deletions(-)

diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index faa2e96..0b55466 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -28,8 +28,52 @@ test_expect_success setup '
 test_expect_success push '
 	(
 		cd work &&
-		git push ../pub.git master
+		git push -f ../pub.git master
 	)
 '
 
+test_expect_failure 'push fails if submodule has no remote' '
+	(
+		cd work/gar/bage &&
+		>junk2 &&
+		git add junk2 &&
+		git commit -m "Second junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Second commit for gar/bage" &&
+		test_must_fail git push ../pub.git master
+	)
+'
+
+test_expect_failure 'push fails if submodule commit not on remote' '
+	(
+		cd work/gar &&
+		git clone --bare bage ../../submodule.git &&
+		cd bage &&
+		git remote add origin ../../../submodule.git &&
+		git fetch &&
+		>junk3 &&
+		git add junk3 &&
+		git commit -m "Third junk"
+	) &&
+	(
+		cd work &&
+		git add gar/bage &&
+		git commit -m "Third commit for gar/bage" &&
+		test_must_fail git push ../pub.git master
+	)
+'
+
+test_expect_failure 'push succeeds after commit was pushed to remote' '
+	(
+		cd work/gar/bage &&
+		git push origin master
+	) &&
+	(
+		cd work &&
+		git push ../pub.git master
+	)
+'
 test_done
-- 
1.7.6.rc3.2.g0185a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-26 18:29 [RFC 0/2] push checks for unpushed remotes in submodules Fredrik Gustafsson
  2011-06-26 18:29 ` [RFC 1/2] test whether " Fredrik Gustafsson
@ 2011-06-26 18:29 ` Fredrik Gustafsson
  2011-06-28 18:29   ` Junio C Hamano
  2011-06-28 22:06   ` Marc Branchaud
  1 sibling, 2 replies; 14+ messages in thread
From: Fredrik Gustafsson @ 2011-06-26 18:29 UTC (permalink / raw)
  To: git; +Cc: gitster, iveqy, hvoigt, jens.lehmann

When working with submodules it is easy to forget to push a submodule
to the server but pushing a super-project that contains a commit for
that submodule. The result is that the superproject points at a
submodule commit that is not avaliable on the server.

Check that all submodule commits that are about to be pushed are present
on a remote of the submodule and require forcing if that is not the
case.

This does not guarantee that all submodules a super-project needs will be
available on the server. In that case both the super-project and the
submodules would need an atomic push. This does however prevent the
human error of forgetting to push a submodule.

Signed-off-by: Fredrik Gustafsson <iveqy@iveqy.com>
Mentored-by: Jens Lehmann <Jens.Lehmann@web.de>
Mentored-by: Heiko Voigt <hvoigt@hvoigt.net>
---
 submodule.c                    |   79 ++++++++++++++++++++++++++++++++++++++++
 submodule.h                    |    1 +
 t/t5531-deep-submodule-push.sh |   30 ++++++++++++++--
 transport.c                    |    8 ++++
 4 files changed, 115 insertions(+), 3 deletions(-)

diff --git a/submodule.c b/submodule.c
index b6dec70..fe1daae 100644
--- a/submodule.c
+++ b/submodule.c
@@ -308,6 +308,85 @@ void set_config_fetch_recurse_submodules(int value)
 	config_fetch_recurse_submodules = value;
 }
 
+static int is_submodule_commit_pushed(const char *path, const unsigned char sha1[20])
+{
+	int is_pushed = 0;
+	if (!add_submodule_odb(path) && lookup_commit_reference(sha1)) {
+		struct child_process cp;
+		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
+		struct strbuf buf = STRBUF_INIT;
+
+		argv[1] = sha1_to_hex(sha1);
+		memset(&cp, 0, sizeof(cp));
+		cp.argv = argv;
+		cp.env = local_repo_env;
+		cp.git_cmd = 1;
+		cp.no_stdin = 1;
+		cp.out = -1;
+		cp.dir = path;
+		if (!run_command(&cp) && !strbuf_read(&buf, cp.out, 41))
+			is_pushed = 1;
+		close(cp.out);
+		strbuf_release(&buf);
+	}
+	return is_pushed;
+}
+
+static void submodule_collect_unpushed_cb(struct diff_queue_struct *q,
+					 struct diff_options *options,
+					 void *data)
+{
+	int i;
+	int *found_unpushed_submodule = data;
+
+	for (i = 0; i < q->nr; i++) {
+		struct diff_filepair *p = q->queue[i];
+		if (!S_ISGITLINK(p->two->mode))
+			continue;
+		if (!is_submodule_commit_pushed(p->two->path, p->two->sha1)) {
+			*found_unpushed_submodule = 1;
+			break;
+		}
+	}
+}
+
+int check_for_unpushed_submodule_commits(unsigned char new_sha1[20])
+{
+	struct rev_info rev;
+	struct commit *commit;
+	const char *argv[] = {NULL, NULL, "--not", "--remotes", NULL};
+	int argc = ARRAY_SIZE(argv) - 1;
+	char *sha1_copy;
+	int found_unpushed_submodule = 0;
+
+	init_revisions(&rev, NULL);
+	sha1_copy = xstrdup(sha1_to_hex(new_sha1));
+	argv[1] = sha1_copy;
+	setup_revisions(argc, argv, &rev, NULL);
+	if (prepare_revision_walk(&rev))
+		die("revision walk setup failed");
+
+	while ((commit = get_revision(&rev)) && !found_unpushed_submodule) {
+		struct commit_list *parent = commit->parents;
+		while (parent) {
+			struct diff_options diff_opts;
+			diff_setup(&diff_opts);
+			DIFF_OPT_SET(&diff_opts, RECURSIVE);
+			diff_opts.output_format |= DIFF_FORMAT_CALLBACK;
+			diff_opts.format_callback = submodule_collect_unpushed_cb;
+			diff_opts.format_callback_data = &found_unpushed_submodule;
+			if (diff_setup_done(&diff_opts) < 0)
+				die("diff_setup_done failed");
+			diff_tree_sha1(parent->item->object.sha1, commit->object.sha1, "", &diff_opts);
+			diffcore_std(&diff_opts);
+			diff_flush(&diff_opts);
+			parent = parent->next;
+		}
+	}
+	free(sha1_copy);
+	return found_unpushed_submodule;
+}
+
 static int is_submodule_commit_present(const char *path, unsigned char sha1[20])
 {
 	int is_present = 0;
diff --git a/submodule.h b/submodule.h
index 5350b0d..0a4d395 100644
--- a/submodule.h
+++ b/submodule.h
@@ -29,5 +29,6 @@ int fetch_populated_submodules(int num_options, const char **options,
 unsigned is_submodule_modified(const char *path, int ignore_untracked);
 int merge_submodule(unsigned char result[20], const char *path, const unsigned char base[20],
 		    const unsigned char a[20], const unsigned char b[20]);
+int check_for_unpushed_submodule_commits(unsigned char sha1[20]);
 
 #endif
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 0b55466..0ef943a 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -32,7 +32,7 @@ test_expect_success push '
 	)
 '
 
-test_expect_failure 'push fails if submodule has no remote' '
+test_expect_success 'push fails if submodule has no remote' '
 	(
 		cd work/gar/bage &&
 		>junk2 &&
@@ -47,7 +47,7 @@ test_expect_failure 'push fails if submodule has no remote' '
 	)
 '
 
-test_expect_failure 'push fails if submodule commit not on remote' '
+test_expect_success 'push fails if submodule commit not on remote' '
 	(
 		cd work/gar &&
 		git clone --bare bage ../../submodule.git &&
@@ -66,7 +66,7 @@ test_expect_failure 'push fails if submodule commit not on remote' '
 	)
 '
 
-test_expect_failure 'push succeeds after commit was pushed to remote' '
+test_expect_success 'push succeeds after commit was pushed to remote' '
 	(
 		cd work/gar/bage &&
 		git push origin master
@@ -76,4 +76,28 @@ test_expect_failure 'push succeeds after commit was pushed to remote' '
 		git push ../pub.git master
 	)
 '
+
+test_expect_success 'push fails when commit on multiple branches if one branch has no remote' '
+	(
+		cd work/gar/bage &&
+		>junk4 &&
+		git add junk4 &&
+		git commit -m "Fourth junk"
+	) &&
+	(
+		cd work &&
+		git branch branch2 &&
+		git add gar/bage &&
+		git commit -m "Fourth commit for gar/bage" &&
+		git checkout branch2 &&
+		(
+			cd gar/bage &&
+			git checkout HEAD~1
+		) &&
+		>junk1 &&
+		git add junk1 &&
+		git commit -m "First junk" &&
+		test_must_fail git push ../pub.git
+	)
+'
 test_done
diff --git a/transport.c b/transport.c
index c9c8056..d83595b 100644
--- a/transport.c
+++ b/transport.c
@@ -10,6 +10,7 @@
 #include "refs.h"
 #include "branch.h"
 #include "url.h"
+#include "submodule.h"
 
 /* rsync support */
 
@@ -1041,6 +1042,13 @@ int transport_push(struct transport *transport,
 			flags & TRANSPORT_PUSH_MIRROR,
 			flags & TRANSPORT_PUSH_FORCE);
 
+		if(!(flags & TRANSPORT_PUSH_FORCE) && !is_bare_repository()) {
+			struct ref *ref = remote_refs;
+			for (; ref; ref = ref->next)
+				if(!is_null_sha1(ref->new_sha1) && check_for_unpushed_submodule_commits(ref->new_sha1))
+					die("There's unpushed submodules, aborting. Use -f to force a push");
+		}
+
 		push_ret = transport->push_refs(transport, remote_refs, flags);
 		err = push_had_errors(remote_refs);
 		ret = push_ret | err;
-- 
1.7.6.rc3.2.g0185a

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-26 18:29 ` [RFC 2/2] Don't push a repository with unpushed submodules Fredrik Gustafsson
@ 2011-06-28 18:29   ` Junio C Hamano
  2011-06-28 19:30     ` Heiko Voigt
  2011-06-28 22:06   ` Marc Branchaud
  1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2011-06-28 18:29 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, gitster, hvoigt, jens.lehmann

Fredrik Gustafsson <iveqy@iveqy.com> writes:

> +static int is_submodule_commit_pushed(const char *path, const unsigned char sha1[20])
> +{
> +	int is_pushed = 0;
> +	if (!add_submodule_odb(path) && lookup_commit_reference(sha1)) {
> +		struct child_process cp;
> +		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
> +		struct strbuf buf = STRBUF_INIT;
> +
> +		argv[1] = sha1_to_hex(sha1);
> +		memset(&cp, 0, sizeof(cp));
> +		cp.argv = argv;
> +		cp.env = local_repo_env;
> +		cp.git_cmd = 1;
> +		cp.no_stdin = 1;
> +		cp.out = -1;
> +		cp.dir = path;
> +		if (!run_command(&cp) && !strbuf_read(&buf, cp.out, 41))
> +			is_pushed = 1;
> +		close(cp.out);
> +		strbuf_release(&buf);
> +	}

What if

 (1) you are binding somebody else's project as your own submodule, you do
     not make any local changes (you won't be pushing them out anyway),
     and you do not have remote tracking branches in that submodule
     project?

 (2) you have a project you can push to that is bound as a submodule, you
     have pushed the commit that is bound in the superproject's tree to
     that submodule project, but you do not have remote tracking branches
     in that submodule project?

 (3) you have a project you can push to that is bound as a submodule, you
     have multiple remotes defined (e.g. one for the public server you
     intend for this check to be in effect, the other is just for your
     private backup), and you have pushed the necessary commit to your
     backup but not to the public one?

The above check would fail in cases (1) and (2) even though it does not
have to. It would succeed in case (3), but people would not be able to use
the superproject as the necessary commit is not there but only on your
work and backup repositories.

What am I missing?

I am not sure if the check imposed on the client end is such a great
idea. I suspect that it would depend on the superproject which submodule
commit must exist "out there" for others to fetch. If you assume a closed
environment where all the superprojects and necessary submodule projects
are housed at a central location everybody pushes into and have tight
control over how project participants clone and check out the
superprojects and submodules, you do not have to worry about any of the
above, but then the server-side can perform the check when it accepts a
push (and you would already be doing other checks there in your hooks
anyway in the industrial setup, I would guess).

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 18:29   ` Junio C Hamano
@ 2011-06-28 19:30     ` Heiko Voigt
  2011-06-28 20:43       ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Heiko Voigt @ 2011-06-28 19:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

Hi,

On Tue, Jun 28, 2011 at 11:29:15AM -0700, Junio C Hamano wrote:
> Fredrik Gustafsson <iveqy@iveqy.com> writes:
> 
> > +static int is_submodule_commit_pushed(const char *path, const unsigned char sha1[20])
> > +{
> > +	int is_pushed = 0;
> > +	if (!add_submodule_odb(path) && lookup_commit_reference(sha1)) {
> > +		struct child_process cp;
> > +		const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
> > +		struct strbuf buf = STRBUF_INIT;
> > +
> > +		argv[1] = sha1_to_hex(sha1);
> > +		memset(&cp, 0, sizeof(cp));
> > +		cp.argv = argv;
> > +		cp.env = local_repo_env;
> > +		cp.git_cmd = 1;
> > +		cp.no_stdin = 1;
> > +		cp.out = -1;
> > +		cp.dir = path;
> > +		if (!run_command(&cp) && !strbuf_read(&buf, cp.out, 41))
> > +			is_pushed = 1;
> > +		close(cp.out);
> > +		strbuf_release(&buf);
> > +	}
> 
> What if
> 
>  (1) you are binding somebody else's project as your own submodule, you do
>      not make any local changes (you won't be pushing them out anyway),
>      and you do not have remote tracking branches in that submodule
>      project?

In this scenario the superproject can not be cloned that way that it
would contain the submodule right? I would consider this a rather exotic
way to work since pushing means to share your work somehow. This way you
would share your work in the superproject but not from the submodule?
How about not doing the is-pushed check when the submodule has no
remotes? If we assume that only people having remote tracking branches
want to share them via push this would allow your usecase.

>  (2) you have a project you can push to that is bound as a submodule, you
>      have pushed the commit that is bound in the superproject's tree to
>      that submodule project, but you do not have remote tracking branches
>      in that submodule project?

This could also be solved with the above proposal.

>  (3) you have a project you can push to that is bound as a submodule, you
>      have multiple remotes defined (e.g. one for the public server you
>      intend for this check to be in effect, the other is just for your
>      private backup), and you have pushed the necessary commit to your
>      backup but not to the public one?
> 
> The above check would fail in cases (1) and (2) even though it does not
> have to. It would succeed in case (3), but people would not be able to use
> the superproject as the necessary commit is not there but only on your
> work and backup repositories.
> 
> What am I missing?
> 
> I am not sure if the check imposed on the client end is such a great
> idea.

This check is solely meant as a convenience security measure. It should
and can not enforce a tight check whether a superproject (including its
submodules) can be cloned/checked out at all times. But it ensures that
a developer has pushed his submodule commits "somewhere" which is enough
in practice.

One typical scenario is that people are working together using shared
remotes. In this scenario this patch provides a consistency check which
catches typical mistakes.

If you fork a project you might change or add a new url for a submodule
locally since you cannot directly push to upstream. This is situation 3
in your above description. All people working with you know which url
you are using for the submodule. In this situation the check helps you
and can not be enforced on the remote side since only the client knows
which remotes the submodule has.

Maybe we should provide a configuration option for this check so that
people who know what they are doing could switch it off?

> I suspect that it would depend on the superproject which submodule
> commit must exist "out there" for others to fetch. If you assume a closed
> environment where all the superprojects and necessary submodule projects
> are housed at a central location everybody pushes into and have tight
> control over how project participants clone and check out the
> superprojects and submodules, you do not have to worry about any of the
> above, but then the server-side can perform the check when it accepts a
> push (and you would already be doing other checks there in your hooks
> anyway in the industrial setup, I would guess).

As mentioned above a check on the remote end is only applicable if you
have a certain defined remote for the submodule in a superproject. This
also has to be in an environment which has control over all
projects/submodules. The presented solution does not just cover that but
also the case where you fork and use different remotes than upstream.

Cheers Heiko

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 19:30     ` Heiko Voigt
@ 2011-06-28 20:43       ` Junio C Hamano
  2011-06-28 21:59         ` Fredrik Gustafsson
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Junio C Hamano @ 2011-06-28 20:43 UTC (permalink / raw)
  To: Heiko Voigt; +Cc: Fredrik Gustafsson, git, jens.lehmann

Heiko Voigt <hvoigt@hvoigt.net> writes:

>> What if
>> 
>>  (1) you are binding somebody else's project as your own submodule, you do
>>      not make any local changes (you won't be pushing them out anyway),
>>      and you do not have remote tracking branches in that submodule
>>      project?
>
> In this scenario the superproject can not be cloned that way that it
> would contain the submodule right? I would consider this a rather exotic
> way to work since pushing means to share your work somehow.

Sorry, I don't follow. Isn't this the classical example of an el-cheapo
router firmware project (i.e. superproject) binding unmodified Linux
kernel project as one of its submodules without you having any push
privilege to Linus's repository, which was one of the original examples
used in the very initial submodule discussion?

> How about not doing the is-pushed check when the submodule has no
> remotes? If we assume that only people having remote tracking branches
> want to share them via push this would allow your usecase.

Yes, that would reduce the false positive; same for (2).

> This check is solely meant as a convenience security measure. It should
> and can not enforce a tight check whether a superproject (including its
> submodules) can be cloned/checked out at all times. But it ensures that
> a developer has pushed his submodule commits "somewhere" which is enough
> in practice.

I am not entirely convinced but if this would catch more than 80% of
casual mistakes, it would be good enough.  I was hoping that somebody may
come up with an idea that would work even in case (3), though.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 20:43       ` Junio C Hamano
@ 2011-06-28 21:59         ` Fredrik Gustafsson
  2011-06-28 22:24         ` Jens Lehmann
  2011-07-04 20:05         ` Heiko Voigt
  2 siblings, 0 replies; 14+ messages in thread
From: Fredrik Gustafsson @ 2011-06-28 21:59 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Fredrik Gustafsson, git, jens.lehmann

On Tue, Jun 28, 2011 at 01:43:18PM -0700, Junio C Hamano wrote:
> > This check is solely meant as a convenience security measure. It should
> > and can not enforce a tight check whether a superproject (including its
> > submodules) can be cloned/checked out at all times. But it ensures that
> > a developer has pushed his submodule commits "somewhere" which is enough
> > in practice.
> 
> I am not entirely convinced but if this would catch more than 80% of
> casual mistakes, it would be good enough.  I was hoping that somebody may
> come up with an idea that would work even in case (3), though.
> 

There's ways to do a "better" check, but only(*) if the client communicates
with the server. This is expensive and doesn't make any sense to do for
the error we're trying to prevent here, forgetful developers that
forgotten to push a submodule.

A design goal for this check has been to make it just a client side
check.

I do not have a % value of how usual this fault is. I do know that
developers being introduced to submodules that I know of tends to
forget this (and so do I occasionally).

* According to what I found out. If there's a better solution I would of
  course be very happy.

-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iveqy@iveqy.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-26 18:29 ` [RFC 2/2] Don't push a repository with unpushed submodules Fredrik Gustafsson
  2011-06-28 18:29   ` Junio C Hamano
@ 2011-06-28 22:06   ` Marc Branchaud
  2011-06-28 22:32     ` Jens Lehmann
  2011-06-28 23:02     ` Fredrik Gustafsson
  1 sibling, 2 replies; 14+ messages in thread
From: Marc Branchaud @ 2011-06-28 22:06 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, gitster, hvoigt, jens.lehmann

On 11-06-26 02:29 PM, Fredrik Gustafsson wrote:
> When working with submodules it is easy to forget to push a submodule
> to the server but pushing a super-project that contains a commit for
> that submodule. The result is that the superproject points at a
> submodule commit that is not avaliable on the server.
> 
> Check that all submodule commits that are about to be pushed are present
> on a remote of the submodule and require forcing if that is not the
> case.

I have a few concerns about what this is trying to do.

First, I expect performance will be terrible in repositories with large
and/or many submodules.  I'd go so far as to say that it's pretty much a
show-stopper for our repository.

Second, there are many times where I'm working in a submodule on branch
"TopicA" but not in branch "TopicB".  If I've made submodule changes in
TopicA then try to push up TopicB, won't I have have to tell push to "-f"?
But that turns off other checks that I'd rather keep.

I'd feel a lot better about this patch if the check was *off* by default and
there was a config setting / command-line option to turn it on.

I also agree with Junio that this kind of verification makes more sense in a
hook on the server side.

		M.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 20:43       ` Junio C Hamano
  2011-06-28 21:59         ` Fredrik Gustafsson
@ 2011-06-28 22:24         ` Jens Lehmann
  2011-07-04 20:05         ` Heiko Voigt
  2 siblings, 0 replies; 14+ messages in thread
From: Jens Lehmann @ 2011-06-28 22:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Heiko Voigt, Fredrik Gustafsson, git

Am 28.06.2011 22:43, schrieb Junio C Hamano:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
>>> What if
>>>
>>>  (1) you are binding somebody else's project as your own submodule, you do
>>>      not make any local changes (you won't be pushing them out anyway),
>>>      and you do not have remote tracking branches in that submodule
>>>      project?
>>
>> In this scenario the superproject can not be cloned that way that it
>> would contain the submodule right? I would consider this a rather exotic
>> way to work since pushing means to share your work somehow.
> 
> Sorry, I don't follow. Isn't this the classical example of an el-cheapo
> router firmware project (i.e. superproject) binding unmodified Linux
> kernel project as one of its submodules without you having any push
> privilege to Linus's repository, which was one of the original examples
> used in the very initial submodule discussion?

Yes, but if you push a version to el-cheapo upstream containing a Linux
submodule commit not contained in Linus' repository (because you made some
changes yourself), that won't be cloneable from upstream el-cheapo by
anyone else. This is what this check makes you aware of, and the best
practice here IMO is: make your own fork of Linus' repo (on github or
someplace similar) and then you do have the push rights. (This is what we
do where I work for every repo we don't have push access to and it solves
this problem nicely. And in fact this is pretty much the same I do for a
stand alone repo - like Git - I don't have push access to but want to
publish my changes for: I make a fork that gives me push rights and allows
me to share my work)

>> This check is solely meant as a convenience security measure. It should
>> and can not enforce a tight check whether a superproject (including its
>> submodules) can be cloned/checked out at all times. But it ensures that
>> a developer has pushed his submodule commits "somewhere" which is enough
>> in practice.
> 
> I am not entirely convinced but if this would catch more than 80% of
> casual mistakes, it would be good enough.  I was hoping that somebody may
> come up with an idea that would work even in case (3), though.

My impression is that we would catch more than 80% (but I admit that I
might be influenced by the way we use submodules). Anyways, maybe the
solution for (3) is to only take the default remote into account and to
ignore the others? (Because that's the one most users will initialize from
the .gitmodules of the superproject)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 22:06   ` Marc Branchaud
@ 2011-06-28 22:32     ` Jens Lehmann
  2011-06-29 17:29       ` Marc Branchaud
  2011-06-28 23:02     ` Fredrik Gustafsson
  1 sibling, 1 reply; 14+ messages in thread
From: Jens Lehmann @ 2011-06-28 22:32 UTC (permalink / raw)
  To: Marc Branchaud; +Cc: Fredrik Gustafsson, git, gitster, hvoigt

Am 29.06.2011 00:06, schrieb Marc Branchaud:
> First, I expect performance will be terrible in repositories with large
> and/or many submodules.  I'd go so far as to say that it's pretty much a
> show-stopper for our repository.

Large submodules won't be the problem here, but many submodules and/or
many commits might cause some performance degradations here. But please
notice that there is no communication with the upstream of the submodules,
we only check the refs present locally. Do you still think doing some
"git rev-list" invocations in your submodules will be a problem? Then
please say so.

> Second, there are many times where I'm working in a submodule on branch
> "TopicA" but not in branch "TopicB".  If I've made submodule changes in
> TopicA then try to push up TopicB, won't I have have to tell push to "-f"?
> But that turns off other checks that I'd rather keep.

Nope, this patch only checks the refs to be pushed, not any others. So it
will only check that all submodule commits on branch "TopicB" are pushed.

> I'd feel a lot better about this patch if the check was *off* by default and
> there was a config setting / command-line option to turn it on.

I have no objections against making that configurable, although I tend
towards making this check default "on". But please feel free to test this
feature and tell us if it really hinders you in your work or does cause
performance degradation, we'll really appreciate the feedback!

> I also agree with Junio that this kind of verification makes more sense in a
> hook on the server side.

Hmm, but isn't that assuming that the server knows (= hosts?) both the
submodule and the superproject?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 22:06   ` Marc Branchaud
  2011-06-28 22:32     ` Jens Lehmann
@ 2011-06-28 23:02     ` Fredrik Gustafsson
  2011-06-29 17:34       ` Marc Branchaud
  1 sibling, 1 reply; 14+ messages in thread
From: Fredrik Gustafsson @ 2011-06-28 23:02 UTC (permalink / raw)
  To: Marc Branchaud; +Cc: Fredrik Gustafsson, git, gitster, hvoigt, jens.lehmann

On Tue, Jun 28, 2011 at 06:06:35PM -0400, Marc Branchaud wrote:
> On 11-06-26 02:29 PM, Fredrik Gustafsson wrote:
> > When working with submodules it is easy to forget to push a submodule
> > to the server but pushing a super-project that contains a commit for
> > that submodule. The result is that the superproject points at a
> > submodule commit that is not avaliable on the server.
> > 
> > Check that all submodule commits that are about to be pushed are present
> > on a remote of the submodule and require forcing if that is not the
> > case.
> 
> I have a few concerns about what this is trying to do.
> 
> First, I expect performance will be terrible in repositories with large
> and/or many submodules.  I'd go so far as to say that it's pretty much a
> show-stopper for our repository.

This is not acceptable (in my opinion). The point of making this (only) a
client side check was to not have a huge performance impact.

Would you care for testing this in your enviroment or give me enough
data to be able to set up a test enviroment size-wize like yours?

> Second, there are many times where I'm working in a submodule on branch
> "TopicA" but not in branch "TopicB".  If I've made submodule changes in
> TopicA then try to push up TopicB, won't I have have to tell push to "-f"?
> But that turns off other checks that I'd rather keep.

I don't quite follow you here. Anyway, only the commits of the
superproject that you are about to push is checked, and only the
submodules that are changed in any of those commits are examined.

So if you're working in TopicA in a submodule and tries to push TopicB
in a superproject that uses TopicC in the submodule, TopicA will not be
required to be pushed. (if so, is it a bug).

> I'd feel a lot better about this patch if the check was *off* by default and
> there was a config setting / command-line option to turn it on.
> 
> I also agree with Junio that this kind of verification makes more sense in a
> hook on the server side.

Serverside, we cannot guarantee that all submodules are reachable, they
might be on different servers, maybe not even connected to eachothers. Even
if they are connected this would requiring network traffic. Making this check 
an even bigger performance killer. This check is not supposed to guarantee a 
sane server-repo (that would be much harder) and therefore this check is 
"overkill" to have on the server-side. Client side we always have all
information needed for this.

Note the problem:
"Prevent the developer of pushing a superrepo that has submodule
(commits) only locally avaliable"

That's the problem we're trying to solve, NOT:

"Prevent the developer of pushing a superrepo that has submodule
(commits) not avaliable for an other developer"

The second problem is just too complex and too slow to solve in a generic
way.

-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iveqy@iveqy.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 22:32     ` Jens Lehmann
@ 2011-06-29 17:29       ` Marc Branchaud
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Branchaud @ 2011-06-29 17:29 UTC (permalink / raw)
  To: Jens Lehmann; +Cc: Fredrik Gustafsson, git, gitster, hvoigt

On 11-06-28 06:32 PM, Jens Lehmann wrote:
> Am 29.06.2011 00:06, schrieb Marc Branchaud:
>> First, I expect performance will be terrible in repositories with large
>> and/or many submodules.  I'd go so far as to say that it's pretty much a
>> show-stopper for our repository.
> 
> Large submodules won't be the problem here, but many submodules and/or
> many commits might cause some performance degradations here. But please
> notice that there is no communication with the upstream of the submodules,
> we only check the refs present locally. Do you still think doing some
> "git rev-list" invocations in your submodules will be a problem? Then
> please say so.

Yes, that's exactly what I'm saying.  I'm concerned about rev-list's
performance, particularly when my system's disk cache is empty (or already
full with something else).  When I saw the patch I tried a quick
	git rev-list X --not --remotes -n 1
in a Linux kernel submodule where X was a non-existent ref, and watched my
disk whirl for a few minutes.

Admittedly that's nothing like what the patch does -- mea culpa.  I see now
that the check looks for refs in the submodule's recent history, so the scan
is likely to be pretty short.

Also, as you say below and Fredrik said in his reply, the patch only checks
submodules affected by a super-repo ref that is being pushed.  So I agree
that the performance hit is likely to be minimal even with large submodules.

(Fredrik, FYI my super-repo itself is large enough to make "git status" take
up a good part of the disk cache.  This repo has several submodules,
including a few (different) Linux kernel repos.  I already get bogged down a
bit when I have to status the main repo and one of the Linux submodules.  So
anything that adds extra submodule inspection makes me nervous.)

>> Second, there are many times where I'm working in a submodule on branch
>> "TopicA" but not in branch "TopicB".  If I've made submodule changes in
>> TopicA then try to push up TopicB, won't I have have to tell push to "-f"?
>> But that turns off other checks that I'd rather keep.
> 
> Nope, this patch only checks the refs to be pushed, not any others. So it
> will only check that all submodule commits on branch "TopicB" are pushed.

Thanks for pointing that out!

>> I'd feel a lot better about this patch if the check was *off* by default and
>> there was a config setting / command-line option to turn it on.
> 
> I have no objections against making that configurable, although I tend
> towards making this check default "on". But please feel free to test this
> feature and tell us if it really hinders you in your work or does cause
> performance degradation, we'll really appreciate the feedback!

Well I'm less concerned about it being configurable now.  I did a few more
tests and the rev-list performance looked fine.

		M.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 23:02     ` Fredrik Gustafsson
@ 2011-06-29 17:34       ` Marc Branchaud
  0 siblings, 0 replies; 14+ messages in thread
From: Marc Branchaud @ 2011-06-29 17:34 UTC (permalink / raw)
  To: Fredrik Gustafsson; +Cc: git, gitster, hvoigt, jens.lehmann

On 11-06-28 07:02 PM, Fredrik Gustafsson wrote:
> 
>  [ ... ]
> 
> Serverside, we cannot guarantee that all submodules are reachable, they
> might be on different servers, maybe not even connected to eachothers. Even
> if they are connected this would requiring network traffic. Making this check 
> an even bigger performance killer. This check is not supposed to guarantee a 
> sane server-repo (that would be much harder) and therefore this check is 
> "overkill" to have on the server-side. Client side we always have all
> information needed for this.
> 
> Note the problem:
> "Prevent the developer of pushing a superrepo that has submodule
> (commits) only locally avaliable"
> 
> That's the problem we're trying to solve, NOT:
> 
> "Prevent the developer of pushing a superrepo that has submodule
> (commits) not avaliable for an other developer"
> 
> The second problem is just too complex and too slow to solve in a generic
> way.

Fair enough.

So my only remaining concern is that using "push -f" to override the check is
too much.  I'd rather see a different option control this.  Maybe
--ignore-submodules?  (I think it'd be fine if -f implied --ignore-submodules.)

		M.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC 2/2] Don't push a repository with unpushed submodules
  2011-06-28 20:43       ` Junio C Hamano
  2011-06-28 21:59         ` Fredrik Gustafsson
  2011-06-28 22:24         ` Jens Lehmann
@ 2011-07-04 20:05         ` Heiko Voigt
  2 siblings, 0 replies; 14+ messages in thread
From: Heiko Voigt @ 2011-07-04 20:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Fredrik Gustafsson, git, jens.lehmann

Hi,

On Tue, Jun 28, 2011 at 01:43:18PM -0700, Junio C Hamano wrote:
> Heiko Voigt <hvoigt@hvoigt.net> writes:
> 
> >> What if
> >> 
> >>  (1) you are binding somebody else's project as your own submodule, you do
> >>      not make any local changes (you won't be pushing them out anyway),
> >>      and you do not have remote tracking branches in that submodule
> >>      project?
> >
> > In this scenario the superproject can not be cloned that way that it
> > would contain the submodule right? I would consider this a rather exotic
> > way to work since pushing means to share your work somehow.
> 
> Sorry, I don't follow. Isn't this the classical example of an el-cheapo
> router firmware project (i.e. superproject) binding unmodified Linux
> kernel project as one of its submodules without you having any push
> privilege to Linus's repository, which was one of the original examples
> used in the very initial submodule discussion?

But in such an example the Linux submodule (if used with git submodule)
would have remote tracking branches even though they are not directly
pushable.

Cheers Heiko

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2011-07-04 20:05 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-06-26 18:29 [RFC 0/2] push checks for unpushed remotes in submodules Fredrik Gustafsson
2011-06-26 18:29 ` [RFC 1/2] test whether " Fredrik Gustafsson
2011-06-26 18:29 ` [RFC 2/2] Don't push a repository with unpushed submodules Fredrik Gustafsson
2011-06-28 18:29   ` Junio C Hamano
2011-06-28 19:30     ` Heiko Voigt
2011-06-28 20:43       ` Junio C Hamano
2011-06-28 21:59         ` Fredrik Gustafsson
2011-06-28 22:24         ` Jens Lehmann
2011-07-04 20:05         ` Heiko Voigt
2011-06-28 22:06   ` Marc Branchaud
2011-06-28 22:32     ` Jens Lehmann
2011-06-29 17:29       ` Marc Branchaud
2011-06-28 23:02     ` Fredrik Gustafsson
2011-06-29 17:34       ` Marc Branchaud

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).