Git development
 help / color / mirror / Atom feed
* [PATCH] Fix diffcore-break total breakage
From: Linus Torvalds @ 2007-10-20 19:31 UTC (permalink / raw)
  To: Git Mailing List, Junio C Hamano, Shawn O. Pearce



Ok, so on the kernel list, some people noticed that "git log --follow" 
doesn't work too well with some files in the x86 merge, because a lot of 
files got renamed in very special ways.

In particular, there was a pattern of doing single commits with renames 
that looked basically like

 - rename "filename.h" -> "filename_64.h"
 - create new "filename.c" that includes "filename_32.h" or 
   "filename_64.h" depending on whether we're 32-bit or 64-bit.

which was preparatory for smushing the two trees together.

Now, there's two issues here:

 - "filename.c" *remained*. Yes, it was a rename, but there was a new file 
   created with the old name in the same commit. This was important, 
   because we wanted each commit to compile properly, so that it was 
   bisectable, so splitting the rename into one commit and the "create 
   helper file" into another was *not* an option.

   So we need to break associations where the contents change too much. 
   Fine. We have the -B flag for that. When we break things up, then the 
   rename detection will be able to figure out whether there are better 
   alternatives.

 - "git log --follow" didn't with with -B.

Now, the second case was really simple: we use a different "diffopt" 
structure for the rename detection than the basic one (which we use for 
showing the diffs). So that second case is trivially fixed by a trivial 
one-liner that just copies the break_opt values from the "real" diffopts 
to the one used for rename following. So now "git log -B --follow" works 
fine:

	diff --git a/tree-diff.c b/tree-diff.c
	index 26bdbdd..7c261fd 100644
	--- a/tree-diff.c
	+++ b/tree-diff.c
	@@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
	 	diff_opts.detect_rename = DIFF_DETECT_RENAME;
	 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
	 	diff_opts.single_follow = opt->paths[0];
	+	diff_opts.break_opt = opt->break_opt;
	 	paths[0] = NULL;
	 	diff_tree_setup_paths(paths, &diff_opts);
	 	if (diff_setup_done(&diff_opts) < 0)

however, the end result does *not* work. Because our diffcore-break.c 
logic is totally bogus!

In particular:

 - it used to do

	if (base_size < MINIMUM_BREAK_SIZE)
		return 0; /* we do not break too small filepair */

   which basically says "don't bother to break small files". But that 
   "base_size" is the *smaller* of the two sizes, which means that if some 
   large file was rewritten into one that just includes another file, we 
   would look at the (small) result, and decide that it's smaller than the 
   break size, so it cannot be worth it to break it up! Even if the other 
   side was ten times bigger and looked *nothing* like the samell file!

   That's clearly bogus. I replaced "base_size" with "max_size", so that 
   we compare the *bigger* of the filepair with the break size.

 - It calculated a "merge_score", which was the score needed to merge it 
   back together if nothing else wanted it. But even if it was *so* 
   different that we would never want to merge it back, we wouldn't 
   consider it a break! That makes no sense. So I added

	if (*merge_score_p > break_score)
		return 1;

   to make it clear that if we wouldn't want to merge it at the end, it 
   was *definitely* a break.

 - It compared the whole "extent of damage", counting all inserts and 
   deletes, but it based this score on the "base_size", and generated the 
   damage score with

	delta_size = src_removed + literal_added;
	damage_score = delta_size * MAX_SCORE / base_size;

   but that makes no sense either, since quite often, this will result in 
   a number that is *bigger* than MAX_SCORE! Why? Because base_size is 
   (again) the smaller of the two files we compare, and when you start out 
   from a small file and add a lot (or start out from a large file and 
   remove a lot), the base_size is going to be much smaller than the 
   damage!

   Again, the fix was to replace "base_size" with "max_size", at which 
   point the damage actually becomes a sane percentage of the whole.

With these changes in place, not only does "git log -B --follow" work for 
the case that triggered this in the first place, ie now

	git log -B --follow arch/x86/kernel/vmlinux_64.lds.S

actually gives reasonable resulys. But I also wanted to verify it in 
general, by doing a full-history

	git log --stat -B -C

on my kernel tree with the old code and the new code. 

There's some tweaking to be done, but generally, the new code generates 
much better results wrt breaking up files (and then finding better rename 
candidates). Here's a few examples of the "--stat" output:

 - This:
	include/asm-x86/Kbuild        |    2 -
	include/asm-x86/debugreg.h    |   79 +++++++++++++++++++++++++++++++++++------
	include/asm-x86/debugreg_32.h |   64 ---------------------------------
	include/asm-x86/debugreg_64.h |   65 ---------------------------------
	4 files changed, 68 insertions(+), 142 deletions(-)

      Becomes:

	include/asm-x86/Kbuild                        |    2 -
	include/asm-x86/{debugreg_64.h => debugreg.h} |    9 +++-
	include/asm-x86/debugreg_32.h                 |   64 -------------------------
	3 files changed, 7 insertions(+), 68 deletions(-)

 - This:
	include/asm-x86/bug.h    |   41 +++++++++++++++++++++++++++++++++++++++--
	include/asm-x86/bug_32.h |   37 -------------------------------------
	include/asm-x86/bug_64.h |   34 ----------------------------------
	3 files changed, 39 insertions(+), 73 deletions(-)

      Becomes

	include/asm-x86/{bug_64.h => bug.h} |   20 +++++++++++++-----
	include/asm-x86/bug_32.h            |   37 -----------------------------------
	2 files changed, 14 insertions(+), 43 deletions(-)

Now, in some other cases, it does actually turn a rename into a real 
"delete+create" pair, and then the diff is usually bigger, so truth in 
advertizing: it doesn't always generate a nicer diff. But for what -B was 
meant for, I think this is a big improvement, and I suspect those cases 
where it generates a bigger diff are tweakable.

So I think this diff fixes a real bug, but we might still want to tweak 
the default values and perhaps the exact rules for when a break happens.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---

Hmm? At least the "should_break()" tests seem to make some amount of sense 
now, I think. 

		Linus

----
 diffcore-break.c |   11 +++++++----
 tree-diff.c      |    1 +
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/diffcore-break.c b/diffcore-break.c
index ae8a7d0..c71a226 100644
--- a/diffcore-break.c
+++ b/diffcore-break.c
@@ -45,8 +45,8 @@ static int should_break(struct diff_filespec *src,
 	 * The value we return is 1 if we want the pair to be broken,
 	 * or 0 if we do not.
 	 */
-	unsigned long delta_size, base_size, src_copied, literal_added,
-		src_removed;
+	unsigned long delta_size, base_size, max_size;
+	unsigned long src_copied, literal_added, src_removed;
 
 	*merge_score_p = 0; /* assume no deletion --- "do not break"
 			     * is the default.
@@ -63,7 +63,8 @@ static int should_break(struct diff_filespec *src,
 		return 0; /* error but caught downstream */
 
 	base_size = ((src->size < dst->size) ? src->size : dst->size);
-	if (base_size < MINIMUM_BREAK_SIZE)
+	max_size = ((src->size > dst->size) ? src->size : dst->size);
+	if (max_size < MINIMUM_BREAK_SIZE)
 		return 0; /* we do not break too small filepair */
 
 	if (diffcore_count_changes(src, dst,
@@ -89,12 +90,14 @@ static int should_break(struct diff_filespec *src,
 	 * less than the minimum, after rename/copy runs.
 	 */
 	*merge_score_p = (int)(src_removed * MAX_SCORE / src->size);
+	if (*merge_score_p > break_score)
+		return 1;
 
 	/* Extent of damage, which counts both inserts and
 	 * deletes.
 	 */
 	delta_size = src_removed + literal_added;
-	if (delta_size * MAX_SCORE / base_size < break_score)
+	if (delta_size * MAX_SCORE / max_size < break_score)
 		return 0;
 
 	/* If you removed a lot without adding new material, that is
diff --git a/tree-diff.c b/tree-diff.c
index 26bdbdd..7c261fd 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -319,6 +319,7 @@ static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, co
 	diff_opts.detect_rename = DIFF_DETECT_RENAME;
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
 	diff_opts.single_follow = opt->paths[0];
+	diff_opts.break_opt = opt->break_opt;
 	paths[0] = NULL;
 	diff_tree_setup_paths(paths, &diff_opts);
 	if (diff_setup_done(&diff_opts) < 0)

^ permalink raw reply related

* Re: [PATCH 09/14] Use the asyncronous function infrastructure in builtin-fetch-pack.c.
From: Shawn O. Pearce @ 2007-10-21  0:29 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git
In-Reply-To: <200710202022.33782.johannes.sixt@telecom.at>

Johannes Sixt <johannes.sixt@telecom.at> wrote:
> On Saturday 20 October 2007 04:53, Shawn O. Pearce wrote:
> >
> > If this is a threaded start_async() system this close is going
> > to impact the caller.
> 
> Yes, I noticed this, too. I think that a solution calls for a member .in of 
> struct async analogous to .in of struct child_process.

Probably.
 
> How do we continue from here? Could you park the series in pu so that I don't 
> have to resend if it turns out that the fix is just another followup patch 
> (which is how I'd prefer to solve the issue)? Then I tell you no or go after 
> I have it tested on mingw.git.

Yes, this series is already queued for pu.  I built the branch last
night but didn't push anything out.  I will be doing a push tonight
and this branch will be included in pu.

I think I would also rather receive a follow up patch than a
replacement/resend.

-- 
Shawn.

^ permalink raw reply

* [PATCH] Fix receive-pack error msg.
From: Joakim Tjernlund @ 2007-10-20 19:31 UTC (permalink / raw)
  To: git; +Cc: Joakim Tjernlund

receive-pack is only executed remotely so when
reporting errors, say so.

Signed-off-by: Joakim Tjernlund <Joakim.Tjernlund@transmode.se>
---
 receive-pack.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/receive-pack.c b/receive-pack.c
index d3c422b..1521d0b 100644
--- a/receive-pack.c
+++ b/receive-pack.c
@@ -166,7 +166,7 @@ static const char *update(struct command *cmd)
 	struct ref_lock *lock;
 
 	if (!prefixcmp(name, "refs/") && check_ref_format(name + 5)) {
-		error("refusing to create funny ref '%s' locally", name);
+		error("refusing to create funny ref '%s' remotely", name);
 		return "funny refname";
 	}
 
-- 
1.5.3.4

^ permalink raw reply related

* Re: [PATCH] On error, do not list all commands, but point to --help option.
From: Johannes Schindelin @ 2007-10-20 23:02 UTC (permalink / raw)
  To: Jari Aalto; +Cc: git
In-Reply-To: <odetifoh.fsf@blue.sea.net>

Hi,

On Sun, 21 Oct 2007, Jari Aalto wrote:

> * Sat 2007-10-20 Johannes Schindelin <Johannes.Schindelin@gmx.de> INBOX
>
> > On Sat, 20 Oct 2007, Jari Aalto wrote:
> >
> >> - commented out call to list_common_cmds_help()
> >
> > If you're really sure that this is desired, do not comment it out.  Delete 
> > it.
> 
> I'm sure.

Well, I'm almost sure of the opposite.  One of the big results of the Git 
Survey was that git is still not user-friendly enough.  Your patch would 
only make this issue worse.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] When exec'ing sub-commands,      fall back on execvp (thePATH)
From: Johannes Schindelin @ 2007-10-20 20:25 UTC (permalink / raw)
  To: Scott R Parish; +Cc: git
In-Reply-To: <1192867937.v2.fusewebmail-240137@f>

Hi,

[please do not top post. Just delete everything you do not reply to, and 
put your answers below the text you are replying to.  This spares others 
so much time.]

On Sat, 20 Oct 2007, Scott R Parish wrote:

> The theoretical drawback to this approach is that it could possibly 
> effect the order in which the paths are tried. For instance, if a user 
> did "export GIT_EXEC_PATH=", then the builtin_exec_path wouldn't be 
> tried before the PATH. (i doubt that it would be a problem, but thought 
> i should note it)

In that respect, my code does not change anything from your code.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] When exec'ing sub-commands, fall back on execvp (thePATH)
From: Scott Parish @ 2007-10-20 20:57 UTC (permalink / raw)
  To: Johannes Schindelin, git
In-Reply-To: <1192867937.v2.fusewebmail-240137@f>

Actually, i didn't test it right, execve() were using the files in
my cwd. In addition to you patch, you'd need to use execvp() instead
of execve().

sRp

On Sat, Oct 20, 2007 at 03:12:17AM -0500, Scott R Parish wrote:

> Yeah, that seems to work fine. The theoretical drawback to this approach
> is that it could possibly effect the order in which the paths are tried.
> For instance, if a user did "export GIT_EXEC_PATH=", then the
> builtin_exec_path wouldn't be tried before the PATH. (i doubt that it
> would be a problem, but thought i should note it)
> 
> sRp
> 
> 
> ----- Original Message -----
> Subject: Re: [PATCH] When exec'ing sub-commands, fall back on execvp
> (thePATH)
> Date: Sat, October 20, 2007 0:30
> From: "Johannes Schindelin" <Johannes.Schindelin@gmx.de>
> 
> > Hi,
> >
> > On Fri, 19 Oct 2007, Scott Parish wrote:
> >
> > > diff --git a/exec_cmd.c b/exec_cmd.c
> > > index 9b74ed2..674c9f3 100644
> > > --- a/exec_cmd.c
> > > +++ b/exec_cmd.c
> > > @@ -34,15 +34,15 @@ int execv_git_cmd(const char **argv)
> > >  {
> > >          char git_command[PATH_MAX + 1];
> > >          int i;
> > > +        int rc;
> > >          const char *paths[] = { current_exec_path,
> > >                                  getenv(EXEC_PATH_ENVIRONMENT),
> > >                                  builtin_exec_path };
> > > +        const char *tmp;
> > > +        size_t len;
> > >
> > >          for (i = 0; i < ARRAY_SIZE(paths); ++i) {
> > > -                size_t len;
> > > -                int rc;
> > >                  const char *exec_dir = paths[i];
> > > -                const char *tmp;
> > >
> > >                  if (!exec_dir || !*exec_dir) continue;
> > >
> > > @@ -106,8 +106,26 @@ int execv_git_cmd(const char **argv)
> > >
> > >                  argv[0] = tmp;
> > >          }
> > > -        return -1;
> > >
> > > +        rc = snprintf(git_command, sizeof(git_command), "git-%s",
> argv[0]);
> > > +        if (rc < 0 || rc >= sizeof(git_command) - len) {
> > > +                fprintf(stderr, "git: command name given is too
> long.\n");
> > > +                return -1;
> > > +        }
> > > +
> > > +        tmp = argv[0];
> > > +        argv[0] = git_command;
> > > +
> > > +        trace_argv_printf(argv, -1, "trace: exec:");
> > > +
> > > +        /* execve() can only ever return if it fails */
> > > +        execvp(git_command, (char **)argv);
> > > +
> > > +        trace_printf("trace: exec failed: %s\n", strerror(errno));
> > > +
> > > +        argv[0] = tmp;
> > > +
> > > +        return -1;
> > >  }
> >
> > I am not sure that this is elegant enough: Something like this (completely
> > untested) might be better:
> >
> > diff --git a/exec_cmd.c b/exec_cmd.c
> > index 9b74ed2..c928f37 100644
> > --- a/exec_cmd.c
> > +++ b/exec_cmd.c
> > @@ -36,7 +36,8 @@ int execv_git_cmd(const char **argv)
> >          int i;
> >          const char *paths[] = { current_exec_path,
> >                                  getenv(EXEC_PATH_ENVIRONMENT),
> > -                                builtin_exec_path };
> > +                                builtin_exec_path,
> > +                                "" };
> >
> >          for (i = 0; i < ARRAY_SIZE(paths); ++i) {
> >                  size_t len;
> > @@ -44,9 +45,12 @@ int execv_git_cmd(const char **argv)
> >                  const char *exec_dir = paths[i];
> >                  const char *tmp;
> >
> > -                if (!exec_dir || !*exec_dir) continue;
> > +                if (!exec_dir) continue;
> >
> > -                if (*exec_dir != '/') {
> > +                if (!*exec_dir)
> > +                        /* try PATH */
> > +                        *git_command = '\0';
> > +                else if (*exec_dir != '/') {
> >                          if (!getcwd(git_command, sizeof(git_command))) {
> >                                  fprintf(stderr, "git: cannot determine "
> >                                          "current directory: %s\n",
> > @@ -81,7 +85,7 @@ int execv_git_cmd(const char **argv)
> >
> >                  len = strlen(git_command);
> >                  rc = snprintf(git_command + len, sizeof(git_command) -
> len,
> > -                              "/git-%s", argv[0]);
> > +                              "%sgit-%s", *exec_dir ? "/" : "", argv[0]);
> >                  if (rc < 0 || rc >= sizeof(git_command) - len) {
> >                          fprintf(stderr,
> >                                  "git: command name given is too long.\n");
> >
> > Ciao,
> > Dscho
> >
> > -
> > To unsubscribe from this list: send the line "unsubscribe git" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >
> >
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Scott Parish
http://srparish.net/

^ permalink raw reply

* Re: [PATCH] Deduce exec_path also from calls to git with a relative path
From: Scott Parish @ 2007-10-20 23:04 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, spearce, gitster
In-Reply-To: <Pine.LNX.4.64.0710202225520.25221@racer.site>

On Sat, Oct 20, 2007 at 10:31:47PM +0100, Johannes Schindelin wrote:

> BTW I did not mean to discourage you...  Rather, I wanted to show you that 
> this list is a wonderful place to learn, as I did, do, and will do many 
> times here.  (Just to clarify, since somebody said that I am usually not 
> nice to newbies... cannot understand that at all ;-)

Nah, i'm actually rather encouraged that people have shown interest
in my patches and are so quick to find ways to improve on them!

Sorry about the top posting earlier; i've been away from active
open source participation for a while and have been forgetting my
etiquette.

sRp

-- 
Scott Parish
http://srparish.net/

^ permalink raw reply

* Re: [PATCH] On error, do not list all commands, but point to --help option.
From: Johannes Schindelin @ 2007-10-20 20:28 UTC (permalink / raw)
  To: Jari Aalto; +Cc: git
In-Reply-To: <bqaujirk.fsf@blue.sea.net>

Hi,

On Sat, 20 Oct 2007, Jari Aalto wrote:

> - commented out call to list_common_cmds_help()

If you're really sure that this is desired, do not comment it out.  Delete 
it.

But am not at all sure that this is the way to go.  Rather, I like it that 
the most common commands are listed.  It would be better to find out what 
commands are really the most helpful to users who are likely to benefit 
from the list, and to present them better (such as showing them in 
categories).

Ciao,
Dscho

^ permalink raw reply

* [PATCH] execv_git_cmd(): also try PATH if everything else fails.
From: Johannes Schindelin @ 2007-10-20 22:00 UTC (permalink / raw)
  To: Scott Parish; +Cc: git
In-Reply-To: <20071020205721.GA16291@srparish.net>


Earlier, we tried to find the git commands in several possible exec
dirs.  Now, if all of these failed, try to find the git command in
PATH.

This allows you to install the git programs somewhere else than
originally specified when building git, as long as you add that location
to the PATH.

Initial implementation by Scott R Parish.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---

	On Sat, 20 Oct 2007, Scott Parish wrote:

	> Actually, i didn't test it right, execve() were using the files 
	> in my cwd. In addition to you patch, you'd need to use execvp() 
	> instead of execve().

	Ah, right.  I missed that one ;-)

	How about this instead?

 exec_cmd.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/exec_cmd.c b/exec_cmd.c
index 9b74ed2..c928f37 100644
--- a/exec_cmd.c
+++ b/exec_cmd.c
@@ -36,7 +36,8 @@ int execv_git_cmd(const char **argv)
 	int i;
 	const char *paths[] = { current_exec_path,
 				getenv(EXEC_PATH_ENVIRONMENT),
-				builtin_exec_path };
+				builtin_exec_path,
+				"" };
 
 	for (i = 0; i < ARRAY_SIZE(paths); ++i) {
 		size_t len;
@@ -44,9 +45,12 @@ int execv_git_cmd(const char **argv)
 		const char *exec_dir = paths[i];
 		const char *tmp;
 
-		if (!exec_dir || !*exec_dir) continue;
+		if (!exec_dir) continue;
 
-		if (*exec_dir != '/') {
+		if (!*exec_dir)
+			/* try PATH */
+			*git_command = '\0';
+		else if (*exec_dir != '/') {
 			if (!getcwd(git_command, sizeof(git_command))) {
 				fprintf(stderr, "git: cannot determine "
 					"current directory: %s\n",
@@ -81,7 +85,7 @@ int execv_git_cmd(const char **argv)
 
 		len = strlen(git_command);
 		rc = snprintf(git_command + len, sizeof(git_command) - len,
-			      "/git-%s", argv[0]);
+			      "%sgit-%s", *exec_dir ? "/" : "", argv[0]);
 		if (rc < 0 || rc >= sizeof(git_command) - len) {
 			fprintf(stderr,
 				"git: command name given is too long.\n");
-- 
1.5.3.4.1287.g8b31e

^ permalink raw reply related

* Re: [PATCH] On error, do not list all commands, but point to --help option.
From: Shawn O. Pearce @ 2007-10-21  2:06 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jari Aalto, git
In-Reply-To: <Pine.LNX.4.64.0710210001390.25221@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > > On Sat, 20 Oct 2007, Jari Aalto wrote:
> > >
> > >> - commented out call to list_common_cmds_help()
> 
> Well, I'm almost sure of the opposite.  One of the big results of the Git 
> Survey was that git is still not user-friendly enough.  Your patch would 
> only make this issue worse.

Actually I think Jari's patch helps for the reason originally
stated in the message (less output when you make a small typo).
Though I agree that the commented out code should just be removed.

I actually had to do `git config alias.upsh push` just to keep
myself from screaming every time I made a small typo and Git gave
me a screenful of "helpful reminders".

Hmm.  Lets see.

  "cvs foo":
    Big spew of commands.  Like "git foo".

  "svn foo":
    Unknown command: 'foo'
    Type 'svn help' for usage.

Both are considered to be more newbie friendly then Git.  So clearly
SVN's output of almost nothing is friendly.  And so is CVS'
big spew of frequently used commands.  Either way is apparently
newbie friendly.

Though I find SVN's message a little insulting, asking me to type.
I know I have to type the command, thanks.

-- 
Shawn.

^ permalink raw reply

* Re: Git User's Survey 2007 unfinished summary continued
From: Jakub Narebski @ 2007-10-20 23:06 UTC (permalink / raw)
  To: Steffen Prohaska
  Cc: Andreas Ericsson, Federico Mena Quintero, Johannes Schindelin,
	git
In-Reply-To: <DE4FB702-24E8-421F-8447-04A5C7F7B5D2@zib.de>

On 10/20/07, Steffen Prohaska <prohaska@zib.de> wrote:

> Maybe we could group commands into more categories?
>
> plumbing: should be hidden from the 'normal' user. Porcelain
>    should be sufficient for every standard task.

The problem is division between what is porcelain and what is plumbing.
Some commands are right on border (git-fsck, git-update-index, git-rev-parse
comes to mind).

But it should be fairly easy to:
 1. put only porcelain in bash / zsh completion ('git <tab>' shows
only porcelain
 2. move plumbing out of PATH, but use exec-dir instead.

[...]
> mail porcelain: the list will probably hate me for this, but
>    I think all commands needed to create and send patches per
>    mail are not essential. I suspect that I'll _never_ ask
>    my colleagues at work to send me a patch by mail. They'll
>    always push it to a shared repo.

Usually mail porcelain is in separate binary package, git-mail for
RPMS packages for example. But iMVHO git-format-patch is as often used
as other commands, and is certainly  porcelain.

> import/export: Many commands are only used for importing
>    from or exporting to other version control systems. Examples
>    are git-cvs*, git-svn*. They are not needed once you switched
>    to git.

Those are also in separate packages.

> admin: Some commands are not used in a typical workflow. For
>    example git-filter-branch or git-fsck have a more admin
>    flavor.

These are a few commands only. I'm not sure about how to separate
those from ordinary commands.

[...]
> So here are a few questions:
>
> Could we find a small set of core porcelain commands that
> completely cover a typical workflow? The core section of the
> manual should only refer to those commands. Absolutely no
> plumbing should be needed to tweak things. In principle, a
> typical user should be able to work if _all other_ commands
> except for core porcelain are hidden from his PATH.

The problem here I suppose might lie with the same reason why
(almost?) all Office Lite systems failed: because even if 80% of people
use only 20% of functaionality, it is not the _same_ 20%.

-- 
Jakub Narebski

^ permalink raw reply

* Re: Git User's Survey 2007 unfinished summary continued
From: Johannes Schindelin @ 2007-10-20 23:33 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Steffen Prohaska, Andreas Ericsson, Federico Mena Quintero, git
In-Reply-To: <8fe92b430710201606i47e85b24k17abd819bf0d353b@mail.gmail.com>

Hi,

On Sun, 21 Oct 2007, Jakub Narebski wrote:

> On 10/20/07, Steffen Prohaska <prohaska@zib.de> wrote:
> 
> > Maybe we could group commands into more categories?
> >
> > plumbing: should be hidden from the 'normal' user. Porcelain
> >    should be sufficient for every standard task.
> 
> The problem is division between what is porcelain and what is plumbing. 
> Some commands are right on border (git-fsck, git-update-index, 
> git-rev-parse comes to mind).

Sorry, but my impression from the latest mails was that the commands are 
fine.  What is lacking is a nice, _small_ collection of recommended 
workflows.  And when we have agreed on such a set of workflows, we 
optimize the hell out of them.  Only this time it is not performance, but 
user-friendliness.

Hmm?

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] execv_git_cmd(): also try PATH if everything else fails.
From: Shawn O. Pearce @ 2007-10-21  2:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Scott Parish, git
In-Reply-To: <Pine.LNX.4.64.0710202258440.25221@racer.site>

Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> 	On Sat, 20 Oct 2007, Scott Parish wrote:
> 
> 	> Actually, i didn't test it right, execve() were using the files 
> 	> in my cwd. In addition to you patch, you'd need to use execvp() 
> 	> instead of execve().
> 
> 	Ah, right.  I missed that one ;-)
> 
> 	How about this instead?

Uhhh.  Its the same, isn't it?  Still using execve() which means
we will not look at PATH in the final attempt.
 
>  exec_cmd.c |   12 ++++++++----
>  1 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/exec_cmd.c b/exec_cmd.c
> index 9b74ed2..c928f37 100644
> --- a/exec_cmd.c
> +++ b/exec_cmd.c
> @@ -36,7 +36,8 @@ int execv_git_cmd(const char **argv)
>  	int i;
>  	const char *paths[] = { current_exec_path,
>  				getenv(EXEC_PATH_ENVIRONMENT),
> -				builtin_exec_path };
> +				builtin_exec_path,
> +				"" };
>  
>  	for (i = 0; i < ARRAY_SIZE(paths); ++i) {
>  		size_t len;
> @@ -44,9 +45,12 @@ int execv_git_cmd(const char **argv)
>  		const char *exec_dir = paths[i];
>  		const char *tmp;
>  
> -		if (!exec_dir || !*exec_dir) continue;
> +		if (!exec_dir) continue;
>  
> -		if (*exec_dir != '/') {
> +		if (!*exec_dir)
> +			/* try PATH */
> +			*git_command = '\0';
> +		else if (*exec_dir != '/') {
>  			if (!getcwd(git_command, sizeof(git_command))) {
>  				fprintf(stderr, "git: cannot determine "
>  					"current directory: %s\n",
> @@ -81,7 +85,7 @@ int execv_git_cmd(const char **argv)
>  
>  		len = strlen(git_command);
>  		rc = snprintf(git_command + len, sizeof(git_command) - len,
> -			      "/git-%s", argv[0]);
> +			      "%sgit-%s", *exec_dir ? "/" : "", argv[0]);
>  		if (rc < 0 || rc >= sizeof(git_command) - len) {
>  			fprintf(stderr,
>  				"git: command name given is too long.\n");
> -- 
> 1.5.3.4.1287.g8b31e

-- 
Shawn.

^ permalink raw reply

* Re: [BUG] git-mv submodule failure
From: Johannes Schindelin @ 2007-10-20 21:24 UTC (permalink / raw)
  To: Yin Ping; +Cc: git
In-Reply-To: <46dff0320710192301p3e1d88d5l3b662b72b051d920@mail.gmail.com>

Hi,

On Sat, 20 Oct 2007, Yin Ping wrote:

> project
>   .git
>   file1
>   submoudle
>      .git
>       file2
> 
> $ cd project
> $ git-mv submodule submodule1
> fatal: source directory is empty, source=submodule, destination=submodule1
> 
> However, the following is ok and rename is automatically detected
> $ cd project
> $ mv submodule submodule1
> $ git-add submodule1
> $ git-commit -a
> 
> which gives in vim:
> # Please enter the commit message for your changes.
> # (Comment lines starting with '#' will not be included)
> # On branch master
> # Changes to be committed:
> #   (use "git reset HEAD <file>..." to unstage)
> #
> #   renamed:    submodule -> submodule1
> #

But of course .gitmodules is unaffected.  But it should be changed, too.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH] Deduce exec_path also from calls to git with a relative path
From: Johannes Schindelin @ 2007-10-20 21:25 UTC (permalink / raw)
  To: David Brown; +Cc: Scott R Parish, git, spearce, gitster
In-Reply-To: <20071020122516.GA23190@old.davidb.org>

Hi,

On Sat, 20 Oct 2007, David Brown wrote:

> On Sat, Oct 20, 2007 at 08:21:34AM +0100, Johannes Schindelin wrote:
> 
> > For example, when you call "../../hello/world/git", it will not turn 
> > "../../hello/world" into an absolute path, and use that.
> 
> Did you mean "it will turn..."?

Yes, I meant that.  I was in a hurry, since a car was waiting outside the 
door, taking me to the highlands.

Ciao,
Dscho

^ permalink raw reply

* Re: [PATCH-resent] gitk: fix in procedure drawcommits
From: Paul Mackerras @ 2007-10-21  3:01 UTC (permalink / raw)
  To: Michele Ballabio; +Cc: git, Shawn O. Pearce, pdmef
In-Reply-To: <200710201802.48111.barra_cuda@katamail.com>

Michele Ballabio writes:

> This commit (and many others) has two parents, but the two parents
> have the same hash. So gitk tries to unset the same variable twice,
> hence the error. At this point, the fix for gitk should be either to
> check if the parents have the same hash when reading the commit or
> avoiding to unset two times the same variable.

Actually, there is a commit like that in the kernel tree, and with
this clue, I have managed to reproduce the problem on the kernel tree
with the command

	gitk v2.6.12-rc2..13e652800d1644dfedcd0d59ac95ef0beb7f3165

I have just pushed out a fix to my gitk.git tree.

Paul.

^ permalink raw reply

* Re: Announcement of Git wikibook
From: Steven Walter @ 2007-10-21  3:09 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Ciprian Dorin Craciun, Steffen Prohaska, Evan Carroll, git
In-Reply-To: <Pine.LNX.4.64.0710202232280.25221@racer.site>

[-- Attachment #1: Type: text/plain, Size: 1577 bytes --]

On Sat, Oct 20, 2007 at 10:34:34PM +0100, Johannes Schindelin wrote:
> I am torn.  On one side I like the Wiki approach.  On the other hand, the 
> Wiki will get less review by git oldtimers, whereas the patches to 
> user-manual are usually reviewed as thoroughly as the code patches.

No offense, but review by old timers can be both a blessing and a curse.
Well, it's not the "review" that is so much a problem as the "editorial
control."  In my opinion (and I believe this is what the original poster
was saying), the official Git User Manual focuses more on technical
issues and less on introducing git to a new user.

This makes perfect sense given that it's edited by oldtimers, who are
neither inclined nor particularly suited to explaining git to newbies;
they have simply forgotten what it was like for these concepts to be
foreign.  They eat SHA1 hashes for breakfast and dream about index
files.  And that's great :)

I don't think the wikibook should try to duplicate the Git User Manual.
That would be a wasted effort.  But there is a niche to be filled in git
documentation, particularly in regard to specific workflows and git best
practices.  With git, TMTOWTDI.  It's quite difficult for a newbie to
know which of those ways will come back and bite them in the ass down the
road.

Of course, it is a wikibook, so it will go where it goes.  I for one am
glad to see this project started.
-- 
-Steven Walter <stevenrwalter@gmail.com>
Freedom is the freedom to say that 2 + 2 = 4
B2F1 0ECC E605 7321 E818  7A65 FC81 9777 DC28 9E8F 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply

* [PATCH] On error, do not list all commands, but point to --help option
From: Jari Aalto @ 2007-10-20 22:41 UTC (permalink / raw)
  To: git
In-Reply-To: <bqaujirk.fsf@blue.sea.net>


- Remove out call to list_common_cmds_help()
- Send error message to stderr, not stdout.

Signed-off-by: Jari Aalto <jari.aalto@cante.net>
---
 help.c |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/help.c b/help.c
index 1cd33ec..814a8cd 100644
--- a/help.c
+++ b/help.c
@@ -185,8 +185,7 @@ static void show_man_page(const char *git_cmd)
 
 void help_unknown_cmd(const char *cmd)
 {
-	printf("git: '%s' is not a git-command\n\n", cmd);
-	list_common_cmds_help();
+	fprintf(stderr, "git: '%s' is not a git-command. See --help\n\n", cmd);
 	exit(1);
 }
 
-- 
1.5.3.2.81.g17ed

^ permalink raw reply related

* Re: [PATCH] On error, do not list all commands, but point to --help option.
From: Jeff King @ 2007-10-21  3:24 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Johannes Schindelin, Jari Aalto, git
In-Reply-To: <20071021020653.GA14735@spearce.org>

On Sat, Oct 20, 2007 at 10:06:53PM -0400, Shawn O. Pearce wrote:

> I actually had to do `git config alias.upsh push` just to keep
> myself from screaming every time I made a small typo and Git gave
> me a screenful of "helpful reminders".

Yeah, somebody should really work on bash completion...

-Peff

^ permalink raw reply

* Re: [PATCH] gitweb: Provide title attributes for abbreviated author names.
From: Shawn O. Pearce @ 2007-10-21  3:25 UTC (permalink / raw)
  To: David Symonds; +Cc: pasky, git
In-Reply-To: <1192581277533-git-send-email-dsymonds@gmail.com>

Nice, but...

David Symonds <dsymonds@gmail.com> wrote:
> +++ b/gitweb/gitweb.perl
> @@ -3461,9 +3461,15 @@ sub git_shortlog_body {
>  			print "<tr class=\"light\">\n";
>  		}
>  		$alternate ^= 1;
> +		my $author = chop_str($co{'author_name'}, 10);
> +		if ($author ne $co{'author_name'}) {
> +			$author = "<span title=\"$co{'author_name'}\">" . esc_html($author) . "</span>";

Doesn't this produce invalid HTML if $co{'author_name'} has a special
HTML character in it such as & or "?  Note that " is much more likely
as it is often used for nicknames.  The old code properly escaped
the author name, and indeed you are doing it for the abbreviated
version but not the full version.

This bug seemed to exist in almost all (if not all) of the hunks.

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] On error, do not list all commands, but point to --help option.
From: Shawn O. Pearce @ 2007-10-21  3:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Johannes Schindelin, Jari Aalto, git
In-Reply-To: <20071021032427.GB8545@coredump.intra.peff.net>

Jeff King <peff@peff.net> wrote:
> On Sat, Oct 20, 2007 at 10:06:53PM -0400, Shawn O. Pearce wrote:
> 
> > I actually had to do `git config alias.upsh push` just to keep
> > myself from screaming every time I made a small typo and Git gave
> > me a screenful of "helpful reminders".
> 
> Yeah, somebody should really work on bash completion...

  $ git pu<TAB><TAB>
  pull     push

By the time I type out "pus" and hit tab I've already typed out
the name "push ".  Except I frequently find myself getting the
u before the p, which can't complete.  Of course with the above
alias in place "git u<TAB>" completes out uniquely to "git push "
(between bash completion and the alias expansion).

But that alias isn't there for my bash tab completion.  Its there
exactly because otherwise "git upsh" gives me 31 lines of useless
(to me) output without it.

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] On error, do not list all commands, but point to --help option
From: Shawn O. Pearce @ 2007-10-21  3:33 UTC (permalink / raw)
  To: Jari Aalto; +Cc: git
In-Reply-To: <ir51if2y.fsf@blue.sea.net>

Jari Aalto <jari.aalto@cante.net> wrote:
> - Remove out call to list_common_cmds_help()

Even if the list is against this change (which I'm in favor of)... 

> - Send error message to stderr, not stdout.

I really think this really should be done.  CVS and SVN both print
to stderr in this case, as does any other program I can think of
that takes subcommands.  Its just the right thing to do.

> @@ -185,8 +185,7 @@ static void show_man_page(const char *git_cmd)
>  
>  void help_unknown_cmd(const char *cmd)
>  {
> -	printf("git: '%s' is not a git-command\n\n", cmd);
> -	list_common_cmds_help();
> +	fprintf(stderr, "git: '%s' is not a git-command. See --help\n\n", cmd);

Why are you still printing two LFs here?  We have no additional
text to display after this error message, we probably only need
the one LF.

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] gitweb: Provide title attributes for abbreviated author names.
From: David Symonds @ 2007-10-21  4:29 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: pasky, git
In-Reply-To: <20071021032533.GA30717@spearce.org>

On 21/10/2007, Shawn O. Pearce <spearce@spearce.org> wrote:
> Nice, but...
>
> David Symonds <dsymonds@gmail.com> wrote:
> > +++ b/gitweb/gitweb.perl
> > @@ -3461,9 +3461,15 @@ sub git_shortlog_body {
> >                       print "<tr class=\"light\">\n";
> >               }
> >               $alternate ^= 1;
> > +             my $author = chop_str($co{'author_name'}, 10);
> > +             if ($author ne $co{'author_name'}) {
> > +                     $author = "<span title=\"$co{'author_name'}\">" . esc_html($author) . "</span>";
>
> Doesn't this produce invalid HTML if $co{'author_name'} has a special
> HTML character in it such as & or "?  Note that " is much more likely
> as it is often used for nicknames.  The old code properly escaped
> the author name, and indeed you are doing it for the abbreviated
> version but not the full version.

Sure, I'll fix it up and resend. I might even refactor some code at
the same time.


Dave.

^ permalink raw reply

* [PATCH 1/2] Use PRIuMAX instead of 'unsigned long long' in show-index
From: Shawn O. Pearce @ 2007-10-21  5:25 UTC (permalink / raw)
  To: git

Elsewhere in Git we already use PRIuMAX and cast to uintmax_t when
we need to display a value that is 'very big' and we're not exactly
sure what the largest display size is for this platform.

This particular fix is needed so we can do the incredibly crazy
temporary hack of:

    diff --git a/cache.h b/cache.h
    index e0abcd6..6637fd8 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -6,6 +6,7 @@

     #include SHA1_HEADER
     #include <zlib.h>
    +#define long long long

     #if ZLIB_VERNUM < 0x1200
     #define deflateBound(c,s)  ((s) + (((s) + 7) >> 3) + (((s) + 63) >> 6) + 11)

allowing us to more easily look for locations where we are passing
a pointer to an 8 byte value to a function that expects a 4 byte
value.  This can occur on some platforms where sizeof(long) == 8
and sizeof(size_t) == 4.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 show-index.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/show-index.c b/show-index.c
index 57ed9e8..7253991 100644
--- a/show-index.c
+++ b/show-index.c
@@ -68,7 +68,7 @@ int main(int argc, char **argv)
 						     ntohl(off64[1]);
 				off64_nr++;
 			}
-			printf("%llu %s (%08x)\n", (unsigned long long) offset,
+			printf("%" PRIuMAX " %s (%08x)\n", (uintmax_t) offset,
 			       sha1_to_hex(entries[i].sha1),
 			       ntohl(entries[i].crc));
 		}
-- 
1.5.3.4.1270.g2fe543

^ permalink raw reply related

* [PATCH 2/2] Correct some sizeof(size_t) != sizeof(unsigned long) typing errors
From: Shawn O. Pearce @ 2007-10-21  5:25 UTC (permalink / raw)
  To: git

On at least one system I've used recently sizeof(size_t) == 4
and sizeof(unsigned long) == 8.  Trying to pass a pointer to an 8
byte value into strbuf_detach() causes problems as the function is
expecting an address for a 4 byte result location.  Writing only 4
bytes here will leave the other 4 bytes unitialized and may cause
problems when the caller evalutes the result.

I am introducing strbuf_detach_ul() as a variant that takes its
size as an unsigned long rather than as a size_t.  This approach is
shorter than fixing all of the callers to use their own temporary
size_t value for the call.

Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 builtin-apply.c   |    2 +-
 builtin-archive.c |    2 +-
 diff.c            |    4 ++--
 entry.c           |    2 +-
 strbuf.h          |    8 +++++++-
 test-delta.c      |    3 ++-
 6 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/builtin-apply.c b/builtin-apply.c
index 05c6bc3..022f916 100644
--- a/builtin-apply.c
+++ b/builtin-apply.c
@@ -1955,7 +1955,7 @@ static int apply_data(struct patch *patch, struct stat *st, struct cache_entry *
 
 	if (apply_fragments(&buf, patch) < 0)
 		return -1; /* note with --reject this succeeds. */
-	patch->result = strbuf_detach(&buf, &patch->resultsize);
+	patch->result = strbuf_detach_ul(&buf, &patch->resultsize);
 
 	if (0 < patch->is_delete && patch->resultsize)
 		return error("removal patch leaves file contents");
diff --git a/builtin-archive.c b/builtin-archive.c
index 04385de..46d5de0 100644
--- a/builtin-archive.c
+++ b/builtin-archive.c
@@ -153,7 +153,7 @@ void *sha1_file_to_archive(const char *path, const unsigned char *sha1,
 		strbuf_attach(&buf, buffer, *sizep, *sizep + 1);
 		convert_to_working_tree(path, buf.buf, buf.len, &buf);
 		convert_to_archive(path, buf.buf, buf.len, &buf, commit);
-		buffer = strbuf_detach(&buf, sizep);
+		buffer = strbuf_detach_ul(&buf, sizep);
 	}
 
 	return buffer;
diff --git a/diff.c b/diff.c
index 6648e01..6fd0c0a 100644
--- a/diff.c
+++ b/diff.c
@@ -1519,7 +1519,7 @@ static int populate_from_stdin(struct diff_filespec *s)
 				     strerror(errno));
 
 	s->should_munmap = 0;
-	s->data = strbuf_detach(&buf, &s->size);
+	s->data = strbuf_detach_ul(&buf, &s->size);
 	s->should_free = 1;
 	return 0;
 }
@@ -1611,7 +1611,7 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only)
 		if (convert_to_git(s->path, s->data, s->size, &buf)) {
 			munmap(s->data, s->size);
 			s->should_munmap = 0;
-			s->data = strbuf_detach(&buf, &s->size);
+			s->data = strbuf_detach_ul(&buf, &s->size);
 			s->should_free = 1;
 		}
 	}
diff --git a/entry.c b/entry.c
index 98f5f6d..d36a0bb 100644
--- a/entry.c
+++ b/entry.c
@@ -120,7 +120,7 @@ static int write_entry(struct cache_entry *ce, char *path, const struct checkout
 		strbuf_init(&buf, 0);
 		if (convert_to_working_tree(ce->name, new, size, &buf)) {
 			free(new);
-			new = strbuf_detach(&buf, &size);
+			new = strbuf_detach_ul(&buf, &size);
 		}
 
 		if (to_tempfile) {
diff --git a/strbuf.h b/strbuf.h
index 9b9e861..d6d6bd0 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -52,7 +52,13 @@ struct strbuf {
 /*----- strbuf life cycle -----*/
 extern void strbuf_init(struct strbuf *, size_t);
 extern void strbuf_release(struct strbuf *);
-extern char *strbuf_detach(struct strbuf *, size_t *);
+extern char *strbuf_detach(struct strbuf *, unsigned long *);
+static inline char *strbuf_detach_ul(struct strbuf *a, unsigned long *n) {
+	size_t len;
+	char *res = strbuf_detach(a, &len);
+	*n = len;
+	return res;
+}
 extern void strbuf_attach(struct strbuf *, void *, size_t, size_t);
 static inline void strbuf_swap(struct strbuf *a, struct strbuf *b) {
 	struct strbuf tmp = *a;
diff --git a/test-delta.c b/test-delta.c
index 3d885ff..018e7dc 100644
--- a/test-delta.c
+++ b/test-delta.c
@@ -20,7 +20,8 @@ int main(int argc, char *argv[])
 	int fd;
 	struct stat st;
 	void *from_buf, *data_buf, *out_buf;
-	unsigned long from_size, data_size, out_size;
+	unsigned long from_size, data_size;
+	size_t out_size;
 
 	if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
 		fprintf(stderr, "Usage: %s\n", usage_str);
-- 
1.5.3.4.1270.g2fe543

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox