Git development

Git development
 help / color / mirror / Atom feed

* Re: ERANGE strikes again on my Windows build; RFH
From: Johannes Sixt @ 2019-12-29 18:08 UTC (permalink / raw)
  To: Alban Gruin, Torsten Bögershausen; +Cc: Git Mailing List
In-Reply-To: <a32e075b-9e6c-2b6a-8619-0330501eee97@gmail.com>

Am 29.12.19 um 18:25 schrieb Alban Gruin:
> Le 29/12/2019 à 15:29, Torsten Bögershausen a écrit :
>> On Sat, Dec 28, 2019 at 04:41:42PM +0100, Johannes Sixt wrote:
>>> In sha1-file.c:read_object_file_extended() we have the following pattern:
>>>
>>> 	errno = 0;
>>> 	data = read_object(r, repl, type, size);
>>> 	if (data)
>>> 		return data;
>>>
>>> 	if (errno && errno != ENOENT)
>>> 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
>>>
>>> That is, it is expected that read_object() does not change the value of
>>> errno in the non-error case. I find it intriguing that we expect a quite
>>> large call graph that is behind read_object() to behave this way.
>>>
>>> What if a subordinate callee starts doing
>>>
>>> 	if (some_syscall(...) < 0) {
>>> 		if (errno == EEXIST) {
>>> 			/* never mind, that's OK */
>>> 			...
>>> 		}
>>> 	}
>>>
>>> Would it be required to reset errno to its previous value in this
>>> failure-is-not-an-error case?
>>>
>>> The problem in my Windows build is that one of these subordinate
>>> syscalls is vsnprintf() (as part of a strbuf_add variant, I presume),
>>> and it fails with ERANGE when the buffer is too short. Do I have to
>>> modify the vsnprintf emulation to restore errno?
>>
>> If you ask me: I think so, yes.
>> At least the documentation about vsnprintf does not mention that errno is touched at all.
>> That is the man pages for Linux and Mac OS, or see here:
>> https://linux.die.net/man/3/vsnprintf
>>
>> It would make sense to analyze the complete callstack, I think.
>> Is your problem reproducable ?
>>
>> Changing the function strbuf_vaddf() strbuf.c seems to be straight forward to me.
>>
> 
> According to the standard, vsnprintf() _can_ change errno[1] (and the
> BSDs do so[2][3][4].)  But apparently, not to ERANGE.

I am not worried about errno being set (or to what value) when there
actually is an error. I am asking what to do when there is actually *no*
error. In my vsnprintf emulation, the case where ERANGE happens is *not*
an error as far as the emulation is concerned.

What if in the huge call graph behind read_object() some function
changes errno to, say, EEXIST, EISDIR, or ENODIR and the condition under
which this happens is *not* an error in that context? Is the function
required to restore the original errno?

Consider the task to create file "foo/bar.c". We would have to
mkdir("foo"), but it is *not* an error when mkdir() fails with errno ==
EEXIST. Are we required to reset errno back to its old value?

(I know, read_object() is unlikely to allocate files, but I think I have
to explain in some way that the context may define that there is no
error -- even though a lower-level function failed and modified errno.)

-- Hannes

^ permalink raw reply

* Re: ERANGE strikes again on my Windows build; RFH
From: Alban Gruin @ 2019-12-29 17:25 UTC (permalink / raw)
  To: Torsten Bögershausen, Johannes Sixt; +Cc: Git Mailing List
In-Reply-To: <20191229142909.7bmjbrroboitvnzq@tb-raspi4>

Hi,

Le 29/12/2019 à 15:29, Torsten Bögershausen a écrit :
> On Sat, Dec 28, 2019 at 04:41:42PM +0100, Johannes Sixt wrote:
>> In sha1-file.c:read_object_file_extended() we have the following pattern:
>>
>> 	errno = 0;
>> 	data = read_object(r, repl, type, size);
>> 	if (data)
>> 		return data;
>>
>> 	if (errno && errno != ENOENT)
>> 		die_errno(_("failed to read object %s"), oid_to_hex(oid));
>>
>> That is, it is expected that read_object() does not change the value of
>> errno in the non-error case. I find it intriguing that we expect a quite
>> large call graph that is behind read_object() to behave this way.
>>
>> What if a subordinate callee starts doing
>>
>> 	if (some_syscall(...) < 0) {
>> 		if (errno == EEXIST) {
>> 			/* never mind, that's OK */
>> 			...
>> 		}
>> 	}
>>
>> Would it be required to reset errno to its previous value in this
>> failure-is-not-an-error case?
>>
>> The problem in my Windows build is that one of these subordinate
>> syscalls is vsnprintf() (as part of a strbuf_add variant, I presume),
>> and it fails with ERANGE when the buffer is too short. Do I have to
>> modify the vsnprintf emulation to restore errno?
> 
> If you ask me: I think so, yes.
> At least the documentation about vsnprintf does not mention that errno is touched at all.
> That is the man pages for Linux and Mac OS, or see here:
> https://linux.die.net/man/3/vsnprintf
> 
> It would make sense to analyze the complete callstack, I think.
> Is your problem reproducable ?
> 
> Changing the function strbuf_vaddf() strbuf.c seems to be straight forward to me.
> 

According to the standard, vsnprintf() _can_ change errno[1] (and the
BSDs do so[2][3][4].)  But apparently, not to ERANGE.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/vfprintf.html
[2]
https://www.freebsd.org/cgi/man.cgi?query=vsnprintf&manpath=FreeBSD+12.1-RELEASE
[3] https://www.freebsd.org/cgi/man.cgi?query=vsnprintf&manpath=NetBSD+8.1
[4] https://man.openbsd.org/vsnprintf

Cheers,
Alban


^ permalink raw reply

* Re: Comparing rebase --am with --interactive via p3400
From: Alban Gruin @ 2019-12-29 17:25 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Johannes Schindelin, Git Mailing List
In-Reply-To: <CABPp-BGCQR+MLVTDoaXDmPrE4SCu+dQ794X6Kvx1PpiQ=6D7KQ@mail.gmail.com>

Hi Elijah,

Le 27/12/2019 à 23:45, Elijah Newren a écrit :
> Hi Alban,
> 
> On Fri, Dec 27, 2019 at 1:11 PM Alban Gruin <alban.gruin@gmail.com> wrote:
>>
>> Hi Johannes & Elijah,
>>
>> Le 01/02/2019 à 07:04, Johannes Schindelin a écrit :
>>> Hi Elijah,
>>>
>>> as discussed at the Contributors' Summit, I ran p3400 as-is (i.e. with the
>>> --am backend) and then with --keep-empty to force the interactive backend
>>> to be used. Here are the best of 10, on my relatively powerful Windows 10
>>> laptop, with current `master`.
>>>
>>> With regular rebase --am:
>>>
>>> 3400.2: rebase on top of a lot of unrelated changes             5.32(0.06+0.15)
>>> 3400.4: rebase a lot of unrelated changes without split-index   33.08(0.04+0.18)
>>> 3400.6: rebase a lot of unrelated changes with split-index      30.29(0.03+0.18)
>>>
>>> with --keep-empty to force the interactive backend:
>>>
>>> 3400.2: rebase on top of a lot of unrelated changes             3.92(0.03+0.18)
>>> 3400.4: rebase a lot of unrelated changes without split-index   33.92(0.03+0.22)
>>> 3400.6: rebase a lot of unrelated changes with split-index      38.82(0.03+0.16)
>>>
>>> I then changed it to -m to test the current scripted version, trying to
>>> let it run overnight, but my laptop eventually went to sleep and the tests
>>> were not even done. I'll let them continue and report back.
>>>
>>> My conclusion after seeing these numbers is: the interactive rebase is
>>> really close to the performance of the --am backend. So to me, it makes a
>>> total lot of sense to switch --merge over to it, and to make --merge the
>>> default. We still should investigate why the split-index performance is so
>>> significantly worse, though.
>>>
>>> Ciao,
>>> Dscho
>>>
>>
>> I investigated a bit on this.  From a quick glance at a callgrind trace,
>> I can see that ce_write_entry() is called 20 601[1] times with `git am',
>> but 739 802 times with the sequencer when the split-index is enabled.
> 
> Sweet, thanks for digging in and analyzing this.
> 
>> For reference, here are the timings, measured on my Linux machine, on a
>> tmpfs, with git.git as the repo:
>>
>> `rebase --am':
>>> 3400.2: rebase on top of a lot of unrelated changes             0.29(0.24+0.03)
>>> 3400.4: rebase a lot of unrelated changes without split-index   6.77(6.51+0.22)
>>> 3400.6: rebase a lot of unrelated changes with split-index      4.43(4.29+0.13)
>> `rebase --quiet':
> 
> --quiet?  Isn't that flag supposed to work with both backends and not
> imply either one?  We previously used --keep-empty, though there's a
> chance that flag means we're not doing a fair comparison (since 'am'
> will drop empty commits and thus have less work to do).  Is there any
> chance you actually ran a different command, but when you went to
> summarize just typed the wrong flag name?  Anyway, the best would
> probably be to use --merge here (at the time Johannes and I were
> testing, that wouldn't have triggered the sequencer, but it does now),
> after first applying the en/rebase-backend series just to make sure
> we're doing an apples to apples comparison.  However, I suspect that
> empty commits probably weren't much of a factor and you did find some
> interesting things...
> 

Yes, I did use `--keep-empty' but misremembered it when writing this email…

>>> 3400.2: rebase on top of a lot of unrelated changes             0.24(0.21+0.02)
>>> 3400.4: rebase a lot of unrelated changes without split-index   5.60(5.32+0.27)
>>> 3400.6: rebase a lot of unrelated changes with split-index      5.67(5.40+0.26)
>>
>> This comes from two things:
>>
>> 1. There is not enough shared entries in the index with the sequencer.
>>
>> do_write_index() is called only by do_write_locked_index() with `--am',
>> but is also called by write_shared_index() with the sequencer once for
>> every other commit.  As the latter is only called by
>> write_locked_index(), which means that too_many_not_shared_entries()
>> returns true for the sequencer, but never for `--am'.
>>
>> Removing the call to discard_index() in do_pick_commit() (as in the
>> first attached patch) solve this particular issue, but this would
>> require a more thorough analysis to see if it is actually safe to do.
> 
> I'm actually surprised the sequencer would call discard_index(); I
> would have thought it would have relied on merge_recursive() to do the
> necessary index changes and updates other than writing the new index
> out.  But I'm not quite as familar with the sequencer so perhaps
> there's some reason I'm unaware of.  (Any chance this is a left-over
> from when sequencer invoked external scripts to do the work, and thus
> the index was updated in another processes' memory and on disk, and it
> had to discard and re-read to get its own process updated?)
> 

The sequencer re-reads the index after invoking an external command
(either `git checkout', `git merge' or an `exec' command from the todo
list), which makes sense.  But this one seems to come from 6eb1b437933
("cherry-pick/revert: make direct internal call to merge_tree()",
2008-09-02).  So, yes, quite old, and perhaps no longer justified.

I know I had to add another discard_cache() in rebase--interactive.c
because it broke something with the submodules, but it does not seems
all that useful now that rebase.c no longer has to fork to use the
sequencer.

Cheers,
Alban


^ permalink raw reply

* [PATCH v2 1/1] git-gui: allow opening currently selected file
From: Zoli Szabó via GitGitGadget @ 2019-12-29 19:32 UTC (permalink / raw)
  To: git; +Cc: Pratyush Yadav, Zoli Szabó
In-Reply-To: <pull.499.v2.git.1577647930.gitgitgadget@gmail.com>

From: =?UTF-8?q?Zoli=20Szab=C3=B3?= <zoli.szabo@gmail.com>

...in the default associated app (e.g. in a text editor / IDE).

Many times there's the need to quickly open a source file (the one you're
looking at in Git GUI) in the predefined text editor / IDE. Of course,
the file can be searched for in your preferred file manager or directly
in the text editor, but having the option to directly open the current
file from Git GUI would be just faster. This change enables just that by:
 - clicking the diff header path (which is now highlighted as a hyperlink)
 - or diff header path context menu -> Open;

Note: executable files will be run and not opened for editing.

Signed-off-by: Zoli Szabó <zoli.szabo@gmail.com>
---
 git-gui.sh | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)

diff --git a/git-gui.sh b/git-gui.sh
index c1be733e3e..8920e4ddb0 100755
--- a/git-gui.sh
+++ b/git-gui.sh
@@ -2248,9 +2248,8 @@ proc do_git_gui {} {
 	}
 }
 
-proc do_explore {} {
-	global _gitworktree
-	set explorer {}
+# Get the system-specific explorer app/command.
+proc get_explorer {} {
 	if {[is_Cygwin] || [is_Windows]} {
 		set explorer "explorer.exe"
 	} elseif {[is_MacOSX]} {
@@ -2259,9 +2258,23 @@ proc do_explore {} {
 		# freedesktop.org-conforming system is our best shot
 		set explorer "xdg-open"
 	}
+	return $explorer
+}
+
+proc do_explore {} {
+	global _gitworktree
+	set explorer [get_explorer]
 	eval exec $explorer [list [file nativename $_gitworktree]] &
 }
 
+# Open file relative to the working tree by the default associated app.
+proc do_file_open {file} {
+	global _gitworktree
+	set explorer [get_explorer]
+	set full_file_path [file join $_gitworktree $file]
+	eval exec $explorer [list [file nativename $full_file_path]] &
+}
+
 set is_quitting 0
 set ret_code    1
 
@@ -3513,9 +3526,11 @@ tlabel .vpane.lower.diff.header.file \
 	-justify left
 tlabel .vpane.lower.diff.header.path \
 	-background gold \
-	-foreground black \
+	-foreground blue \
 	-anchor w \
-	-justify left
+	-justify left \
+	-font [eval font create [font configure font_ui] -underline 1] \
+	-cursor hand2
 pack .vpane.lower.diff.header.status -side left
 pack .vpane.lower.diff.header.file -side left
 pack .vpane.lower.diff.header.path -fill x
@@ -3530,8 +3545,12 @@ $ctxm add command \
 			-type STRING \
 			-- $current_diff_path
 	}
+$ctxm add command \
+	-label [mc Open] \
+	-command {do_file_open $current_diff_path}
 lappend diff_actions [list $ctxm entryconf [$ctxm index last] -state]
 bind_button3 .vpane.lower.diff.header.path "tk_popup $ctxm %X %Y"
+bind .vpane.lower.diff.header.path <Button-1> {do_file_open $current_diff_path}
 
 # -- Diff Body
 #
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH v2 0/1] git-gui: allow opening currently selected file
From: Zoli Szabó via GitGitGadget @ 2019-12-29 19:32 UTC (permalink / raw)
  To: git; +Cc: Pratyush Yadav
In-Reply-To: <pull.499.git.1577386915.gitgitgadget@gmail.com>

...in the default associated app (e.g. in a text editor / IDE).

Many times there's the need to quickly open a source file (the one you're
looking at in Git GUI) in the predefined text editor / IDE. Of course, the
file can be searched for in your preferred file manager or directly in the
text editor, but having the option to directly open the current file from
Git GUI would be just faster. This change enables just that by:

 * clicking the diff header path (which is now highlighted as a hyperlink)
 * or diff header path context menu -> Open;

Note: executable files will be run and not opened for editing.

Signed-off-by: Zoli Szabó zoli.szabo@gmail.com [zoli.szabo@gmail.com]

Zoli Szabó (1):
  git-gui: allow opening currently selected file

 git-gui.sh | 29 ++++++++++++++++++++++++-----
 1 file changed, 24 insertions(+), 5 deletions(-)


base-commit: 23cbe427c44645a3ab0449919e55bade5eb264bc
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-499%2Fzoliszabo%2Fgit-gui%2Fopen-current-file-v2
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-499/zoliszabo/git-gui/open-current-file-v2
Pull-Request: https://github.com/gitgitgadget/git/pull/499

Range-diff vs v1:

 1:  fce80f1b95 ! 1:  a6fde256f8 git-gui: add possibility to open currently selected file
     @@ -1,6 +1,6 @@
      Author: Zoli Szabó <zoli.szabo@gmail.com>
      
     -    git-gui: add possibility to open currently selected file
     +    git-gui: allow opening currently selected file
      
          ...in the default associated app (e.g. in a text editor / IDE).
      
     @@ -9,11 +9,10 @@
          the file can be searched for in your preferred file manager or directly
          in the text editor, but having the option to directly open the current
          file from Git GUI would be just faster. This change enables just that by:
     -     - Diff header path context menu -> Open;
     -     - or double-clicking the diff header path.
     +     - clicking the diff header path (which is now highlighted as a hyperlink)
     +     - or diff header path context menu -> Open;
      
     -    One "downside" of the approach is that executable files will be run
     -    and not opened for editing.
     +    Note: executable files will be run and not opened for editing.
      
          Signed-off-by: Zoli Szabó <zoli.szabo@gmail.com>
      
     @@ -26,11 +25,12 @@
       
      -proc do_explore {} {
      -	global _gitworktree
     +-	set explorer {}
      +# Get the system-specific explorer app/command.
      +proc get_explorer {} {
     - 	set explorer {}
       	if {[is_Cygwin] || [is_Windows]} {
       		set explorer "explorer.exe"
     + 	} elseif {[is_MacOSX]} {
      @@
       		# freedesktop.org-conforming system is our best shot
       		set explorer "xdg-open"
     @@ -44,9 +44,7 @@
       	eval exec $explorer [list [file nativename $_gitworktree]] &
       }
       
     -+# Trigger opening a file (relative to the working tree) by the default
     -+# associated app of the OS (e.g. a text editor or IDE).
     -+# FIXME: What about executables (will be run, not opened for editing)?
     ++# Open file relative to the working tree by the default associated app.
      +proc do_file_open {file} {
      +	global _gitworktree
      +	set explorer [get_explorer]
     @@ -57,18 +55,30 @@
       set is_quitting 0
       set ret_code    1
       
     +@@
     + 	-justify left
     + tlabel .vpane.lower.diff.header.path \
     + 	-background gold \
     +-	-foreground black \
     ++	-foreground blue \
     + 	-anchor w \
     +-	-justify left
     ++	-justify left \
     ++	-font [eval font create [font configure font_ui] -underline 1] \
     ++	-cursor hand2
     + pack .vpane.lower.diff.header.status -side left
     + pack .vpane.lower.diff.header.file -side left
     + pack .vpane.lower.diff.header.path -fill x
      @@
       			-type STRING \
       			-- $current_diff_path
       	}
      +$ctxm add command \
      +	-label [mc Open] \
     -+	-command {
     -+		do_file_open $current_diff_path
     -+	}
     ++	-command {do_file_open $current_diff_path}
       lappend diff_actions [list $ctxm entryconf [$ctxm index last] -state]
       bind_button3 .vpane.lower.diff.header.path "tk_popup $ctxm %X %Y"
     -+bind .vpane.lower.diff.header.path <Double-1> {do_file_open $current_diff_path}
     ++bind .vpane.lower.diff.header.path <Button-1> {do_file_open $current_diff_path}
       
       # -- Diff Body
       #

-- 
gitgitgadget

^ permalink raw reply

* Git for Windows v2.25.0-rc0, was Re: [ANNOUNCE] Git v2.25.0-rc0
From: Johannes Schindelin @ 2019-12-29 19:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git-for-windows, git, git-packagers
In-Reply-To: <xmqqblrwm65l.fsf@gitster-ct.c.googlers.com>

Team,

On Wed, 25 Dec 2019, Junio C Hamano wrote:

> An early preview release Git v2.25.0-rc0 is now available for
> testing at the usual places.  It is comprised of 531 non-merge
> commits since v2.24.0, contributed by 61 people, 24 of which are
> new faces.

The corresponding Git for Windows version can now be found at

https://github.com/git-for-windows/git/releases/tag/v2.25.0-rc0.windows.1

The improvement most users are likely to notice is that the installer will
skip over most of the installer "wizard" pages when upgrading: if you saw
all of the options on the page before, it won't be shown unless you
unchecked the box on the bottom.

Ciao,
Johannes

^ permalink raw reply

* Re: Re: [PATCH 1/1] git-gui: add possibility to open currently selected file
From: Zoli Szabó @ 2019-12-29 20:15 UTC (permalink / raw)
  To: Junio C Hamano, Pratyush Yadav; +Cc: git
In-Reply-To: <xmqqy2uxiemt.fsf@gitster-ct.c.googlers.com>


On 2019.12.28 00:32, Junio C Hamano wrote:

> The phrasing on the title is a bit awkward.  "add possibility to do
> X" is better spelled "allow doing X", no?

Thank you, Junio, for the hint. Updated the commit message accordingly 
(PATCH v2).


^ permalink raw reply

* Re: [PATCH v4 10/15] bugreport: add config values from safelist
From: Johannes Schindelin @ 2019-12-29 20:17 UTC (permalink / raw)
  To: Emily Shaffer; +Cc: git
In-Reply-To: <20191213004312.169753-11-emilyshaffer@google.com>

Hi Emily,

On Thu, 12 Dec 2019, Emily Shaffer wrote:

> diff --git a/bugreport.c b/bugreport.c
> index 759cc0b0f8..1fca28f0b9 100644
> --- a/bugreport.c
> +++ b/bugreport.c
> @@ -6,6 +6,9 @@
>  #include "help.h"
>  #include <gnu/libc-version.h>
>  #include "run-command.h"
> +#include "config.h"
> +#include "bugreport-config-safelist.h"
> +#include "khash.h"

Seems that this patch makes things fail in the CI build
(https://dev.azure.com/gitgitgadget/git/_build/results?buildId=23830&view=results):

-- snipsnap --
bugreport.c:10:10: fatal error: 'bugreport-config-safelist.h' file not found
#include "bugreport-config-safelist.h"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Makefile:2395: recipe for target 'bugreport.o' failed
make: *** [bugreport.o] Error 1
make: *** Waiting for unfinished jobs....


^ permalink raw reply

* 2.24.1: git cherry-pick --edit fails
From: Steffen Nurpmeso @ 2019-12-29 20:29 UTC (permalink / raw)
  To: git; +Cc: Steffen Nurpmeso

Hello again.

  P.S.: i can confirm the surroundings are fixed in 2.25.0-rc0!

Looking into the archives it could be addressed for upcoming
release already, please ignore then.  But with 2.24.1, if i do

  git cherry-pick --allow-empty --edit SHA1 SHA1...

i end up with

  [topic/anonymous-1 746c421c] include/mx: new-style headers: add forgotten struct forwards
   Date: Fri Aug 23 16:49:53 2019 +0200
   15 files changed, 43 insertions(+), 2 deletions(-)
  On branch topic/anonymous-1
  Cherry-pick currently in progress.

  nothing to commit, working tree clean
  The previous cherry-pick is now empty, possibly due to conflict resolution.
  If you wish to commit it anyway, use:

      git commit --allow-empty

  and then use:

      git cherry-pick --continue

  to resume cherry-picking the remaining commits.
  If you wish to skip this commit, use:

      git cherry-pick --skip

I.e., i edit the commit message, the commit message has been
modified and the valid is ok, but the above appears.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)

^ permalink raw reply

* [RFC PATCH 0/1] *** Add branchname in commit header ***
From: Arnaud Bertrand @ 2019-12-29 22:26 UTC (permalink / raw)
  To: git; +Cc: gitster, Arnaud Bertrand

From: Arnaud Bertrand <xda@abalgo.com>

For tracability purpose it is often necessary to know which
commit is envolved in a branch
Keeping track of the branchname in the commit header
will make this traceability easy and will facilitate
the graphical toolis that represent the branches and
that have today to use complex algorithm to try to
determine the branch of a commit that was known at
the commit time.

no big change in the code, today rebase is not considered yet
I'm waiting feedback about that before touching
the rebase code.

Arnaud Bertrand (1):
  Add branchname in commit header

 Documentation/pretty-formats.txt |  1 +
 commit.c                         | 11 +++++++++++
 pretty.c                         | 15 +++++++++++++++
 3 files changed, 27 insertions(+)

-- 
2.25.0.rc0.7.g17b02bf28a

^ permalink raw reply

* [RFC PATCH 1/1] Add branchname in commit header
From: Arnaud Bertrand @ 2019-12-29 22:26 UTC (permalink / raw)
  To: git; +Cc: gitster, Arnaud Bertrand
In-Reply-To: <20191229222633.23815-1-arnaud.bertrand@abalgo.com>

From: Arnaud Bertrand <xda@abalgo.com>

Add the branchname in the commit header before the commit message
the following line is added:

branch <branchname>

where <branchname> comes from the function resolve_ref_unsafe("HEAD",...)
without the prefix refs/heads/
A placeholder is added to the pretty format "%Xb" to print the branch information,
X if for "extra-header" and can be use in the future for new features
b is of course for "branch"

the %Xb returns an empty string when branchname information is not found
---
 Documentation/pretty-formats.txt |  1 +
 commit.c                         | 11 +++++++++++
 pretty.c                         | 15 +++++++++++++++
 3 files changed, 27 insertions(+)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 1a7212ce5a..bd52908f53 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -241,6 +241,7 @@ endif::git-rev-list[]
 '%gE':: reflog identity email (respecting .mailmap, see
 	linkgit:git-shortlog[1] or linkgit:git-blame[1])
 '%gs':: reflog subject
+'%Xb':: branchname in which commit was done
 '%(trailers[:options])':: display the trailers of the body as
 			  interpreted by
 			  linkgit:git-interpret-trailers[1]. The
diff --git a/commit.c b/commit.c
index 434ec030d6..f64a0698be 100644
--- a/commit.c
+++ b/commit.c
@@ -1425,6 +1425,9 @@ int commit_tree_extended(const char *msg, size_t msg_len,
 	int result;
 	int encoding_is_utf8;
 	struct strbuf buffer;
+	const char *branch = "Unknown";
+	int flags;
+	const char *lbranch =resolve_ref_unsafe("HEAD",0,NULL,&flags);
 
 	assert_oid_type(tree, OBJ_TREE);
 
@@ -1453,6 +1456,14 @@ int commit_tree_extended(const char *msg, size_t msg_len,
 		author = git_author_info(IDENT_STRICT);
 	strbuf_addf(&buffer, "author %s\n", author);
 	strbuf_addf(&buffer, "committer %s\n", git_committer_info(IDENT_STRICT));
+	if (lbranch) {
+	   skip_prefix(lbranch,"refs/heads/",&branch);
+	   strbuf_addf(&buffer, "branch %s\n", branch);
+	}
+	else {
+	   strbuf_addf(&buffer, "branch Unknown\n");
+	}
+
 	if (!encoding_is_utf8)
 		strbuf_addf(&buffer, "encoding %s\n", git_commit_encoding);
 
diff --git a/pretty.c b/pretty.c
index 305e903192..5961c39398 100644
--- a/pretty.c
+++ b/pretty.c
@@ -804,6 +804,7 @@ struct format_commit_context {
 
 	/* The following ones are relative to the result struct strbuf. */
 	size_t wrap_start;
+	char *branch;
 };
 
 static void parse_commit_header(struct format_commit_context *context)
@@ -1367,6 +1368,20 @@ static size_t format_commit_one(struct strbuf *sb, /* in UTF-8 */
 		return 1;
 	}
 
+
+	/* Now add extra header info */
+	if (placeholder[0] == 'X') {
+		switch (placeholder[1]) {
+		case 'b': /* branch ... */
+			c->branch = get_header(msg,"branch");
+			if (c->branch)
+				strbuf_addstr(sb, c->branch);
+			free(c->branch);
+			return 2;
+		}
+	}
+
+
 	/* Now we need to parse the commit message. */
 	if (!c->commit_message_parsed)
 		parse_commit_message(c);
-- 
2.25.0.rc0.7.g17b02bf28a


^ permalink raw reply related

* Autotools version
From: Jeffrey Walton @ 2019-12-29 22:25 UTC (permalink / raw)
  To: Git List

Hi Everyone,

I'm trying to build 2.24.1 from sources. It is failing on an old
Fedora machine with the error:

    Autoconf 2.59 or higher is required

The irony of Autootools checking version numbers is not lost on me
considering I only run configure, make and make install.

Would it be possible to drop the Autotools version back to earlier
versions of Git?

Jeff

^ permalink raw reply

* Re: [PATCH 1/1] git-gui: add possibility to open currently selected file
From: Junio C Hamano @ 2019-12-29 23:14 UTC (permalink / raw)
  To: Zoli Szabó; +Cc: Pratyush Yadav, git
In-Reply-To: <ee041086-3579-190d-e607-1973bcc94c23@gmail.com>

Zoli Szabó <zoli.szabo@gmail.com> writes:

> On 2019.12.28 00:32, Junio C Hamano wrote:
>
>> The phrasing on the title is a bit awkward.  "add possibility to do
>> X" is better spelled "allow doing X", no?
>
> Thank you, Junio, for the hint. Updated the commit message accordingly
> (PATCH v2).

Also, do not start the body of the proposed log message with half
sentence.  The title is supposed to be a full sentence.



^ permalink raw reply

* Re: Feature request: add a metadata in the commit: the "commited in branch" information
From: Junio C Hamano @ 2019-12-29 23:17 UTC (permalink / raw)
  To: Arnaud Bertrand; +Cc: git
In-Reply-To: <CAEW0o+jV+r1UMZReRXa3g_fyqCYxHTVYVf6pWvjB7_isofbBaw@mail.gmail.com>

Arnaud Bertrand <xda@abalgo.com> writes:

> I understood that in git philosophy, once it is merged, a branch can
> disappear. But for a lot of companies, a SCM is also a guardian of the
> history.

A lot more important point than "once it is merged" is that the
branch identity is strictly local to your repository.  Contaminating
the object header, which is cast in stone and cannot be modified
after the fact, with such a piece of information will not mix well
with the rest of Git, so ...

^ permalink raw reply

* Re: Autotools version
From: Junio C Hamano @ 2019-12-29 23:30 UTC (permalink / raw)
  To: Jeffrey Walton; +Cc: Git List
In-Reply-To: <CAH8yC8kwsnnvmo2Oyx-NDN17KGWTgDB+_CwNqsqJQWWe47sf3Q@mail.gmail.com>

Jeffrey Walton <noloader@gmail.com> writes:

> Hi Everyone,
>
> I'm trying to build 2.24.1 from sources. It is failing on an old
> Fedora machine with the error:
>
>     Autoconf 2.59 or higher is required
>
> The irony of Autootools checking version numbers is not lost on me
> considering I only run configure, make and make install.
>
> Would it be possible to drop the Autotools version back to earlier
> versions of Git?

Hmph, because the primary way to use our build procedure is to tweak
with either command line or config.mak file without using configure
at all, and because I don't use ./configure at all, I didn't even
know that our release procedure was adding ./configure to the tarball.

I wonder if it would help for me to stop shipping the file so that
anybody who wants to use 'configure' can do so using their own
version of autoconf?  That way, there won't be an issue like this
that may come from autoconf & autotools versions, I guess.

I dunno.

^ permalink raw reply

* Re: Autotools version
From: Eric Sunshine @ 2019-12-29 23:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeffrey Walton, Git List
In-Reply-To: <xmqq4kxiiuba.fsf@gitster-ct.c.googlers.com>

On Sun, Dec 29, 2019 at 6:30 PM Junio C Hamano <gitster@pobox.com> wrote:
> Jeffrey Walton <noloader@gmail.com> writes:
> > I'm trying to build 2.24.1 from sources. It is failing on an old
> > Fedora machine with the error:
> >     Autoconf 2.59 or higher is required
> > Would it be possible to drop the Autotools version back to earlier
> > versions of Git?
>
> Hmph, because the primary way to use our build procedure is to tweak
> with either command line or config.mak file without using configure
> at all, and because I don't use ./configure at all, I didn't even
> know that our release procedure was adding ./configure to the tarball.
>
> I wonder if it would help for me to stop shipping the file so that
> anybody who wants to use 'configure' can do so using their own
> version of autoconf?  That way, there won't be an issue like this
> that may come from autoconf & autotools versions, I guess.

Dropping 'configure' from the tarball would be a step backward, and it
wouldn't help in Jeffrey's particular case. The whole point of
distributing 'configure' in the tarball is so that people can still
build Git itself (via the familiar "./configure && make && make
install") even when they don't have autotools installed or when they
simply can't install a sufficiently recent autotools version. Years
back, I was often in situations building software for old operating
systems for which it was extremely difficult or impossible to get
autotools installed, so having 'configure' packaged in the tarball was
a lifesaver (because, although autotools itself may require a
sufficiently recent development platform, the generated configure
script doesn't).

Also, simply removing the Autoconf 2.59 requirement likely wouldn't
solve Jeffrey's problem since (presumably) the person who set that
version prerequisite did so because Git's configure.ac actually
requires features of Autoconf 2.59 and wouldn't work correctly with an
earlier version. By the way, Autoconf 2.59 was released in 2003, two
years before Git itself, so by restricting ourselves to 2.59 features,
we're already being quite build-friendly to old platforms.

A few options for Jeffery:

If you're not trying to track the bleeding-edge, build Git from the
tarball Junio releases.

If you're trying to follow the bleeding-edge, then grab 'configure'
from the latest tarball release and use it when building git.git.
(This may not be perfect, but configure.ac changes so infrequently
that it may be "good enough".)

Install Autoconf 2.59 rather than relying on the version supplied by
that old Fedora. (This is sometimes a necessary "evil" when building
modern software on old platforms.)

You could also try locally removing the Autoconf 2.59 prerequisite
from configure.ac and see if you get a working 'configure' script out
of it -- but I would be hesitant to accept such a change back into
git.git unless a thorough audit shows that Git's configure.ac doesn't
require any 2.59 features, and even then I would be hesitant because
it would restrict us from taking advantage of 2.59 features (and my
recollection was that 2.59 had a plenty useful enhancements that
earlier versions lacked).

^ permalink raw reply

* Re: Feature request: add a metadata in the commit: the "commited in branch" information
From: Arnaud Bertrand @ 2019-12-29 23:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Arnaud Bertrand, git
In-Reply-To: <xmqqd0c6iuw0.fsf@gitster-ct.c.googlers.com>

Hi Junio,

It really depends how git is used. With big collaborative project
(like git or linux kernel), you are totally right.
for development limited to a company that has developments with team
of 10-20 developers and that uses
a correct SCM plan, the name of the branch is regulated and is
meaningful, mostly  linked to a bug tracking system
system. For audits and  traceability, the branch name is really
important... certainly more than the email of the developer ;-)
So the "contamination" is negligible compare to the bentefit in this context.
It will also helps the graphical tools to have a comprehensive
representation which can do git even better.

If you think it is a bad idea to have it by default, what about an
option to activate this functionality ? Today with the patch I've
done, it is not a problem if there is no branchname in the commit. The
only thing is the "%Xb" placeholder.

I would like to have your advice about the name because I have added
the "branch" metadata but, even it is really the name of the branch, I
think it too "hard". I preferred "BranchOfCommit" or something similar
that is more explicit... I think this one is too heavy. Do you have
other suggestions ?

Thanks for your feedback
.

Le lun. 30 déc. 2019 à 00:20, Junio C Hamano <gitster@pobox.com> a écrit :
>
> Arnaud Bertrand <xda@abalgo.com> writes:
>
> > I understood that in git philosophy, once it is merged, a branch can
> > disappear. But for a lot of companies, a SCM is also a guardian of the
> > history.
>
> A lot more important point than "once it is merged" is that the
> branch identity is strictly local to your repository.  Contaminating
> the object header, which is cast in stone and cannot be modified
> after the fact, with such a piece of information will not mix well
> with the rest of Git, so ...
>
>

^ permalink raw reply

* [L10N] Kickoff for Git 2.25.0 round #1
From: Jiang Xin @ 2019-12-30  1:47 UTC (permalink / raw)
  To: Git List, Alexander Shopov, Jordi Mas, Matthias Rüster,
	Jimmy Angelakos, Christopher Díaz, Jean-Noël Avila,
	Alessandro Menti, Gwan-gyeong Mun, Vasco Almeida,
	Dimitriy Ryazantcev, Peter Krefting,
	Trần Ngọc Quân, Jiang Xin, Yi-Jyun Pan

From: Jiang Xin <worldhello.net@gmail.com>

Hi,

Git v2.25.0-rc0 has been released, and it's time to start new round of git l10n.
This time there are 119 updated messages need to be translated since last update:

    l10n: git.pot: v2.25.0 round 1 (119 new, 13 removed)

    Generate po/git.pot from v2.25.0-rc0 for git v2.25.0 l10n round 1.

    Signed-off-by: Jiang Xin <worldhello.net@gmail.com>

You can get it from the usual place:

    https://github.com/git-l10n/git-po/

As how to update your XX.po and help to translate Git, please see
"Updating a XX.po file" and other sections in "po/README" file.

--
Jiang Xin

^ permalink raw reply

* Re: [RFC PATCH 0/1] *** Add branchname in commit header ***
From: brian m. carlson @ 2019-12-30  2:32 UTC (permalink / raw)
  To: Arnaud Bertrand; +Cc: git, gitster, Arnaud Bertrand
In-Reply-To: <20191229222633.23815-1-arnaud.bertrand@abalgo.com>

[-- Attachment #1: Type: text/plain, Size: 3146 bytes --]

On 2019-12-29 at 22:26:32, Arnaud Bertrand wrote:
> From: Arnaud Bertrand <xda@abalgo.com>
> 
> For tracability purpose it is often necessary to know which
> commit is envolved in a branch
> Keeping track of the branchname in the commit header
> will make this traceability easy and will facilitate
> the graphical toolis that represent the branches and
> that have today to use complex algorithm to try to
> determine the branch of a commit that was known at
> the commit time.
> 
> no big change in the code, today rebase is not considered yet
> I'm waiting feedback about that before touching
> the rebase code.

I encourage you to read back in the history of the list as to why we
haven't done this and why it's not likely to be accepted now, but let me
provide a few reasons of my own.

First, as any contributor to the mailing list can tell you, I am
absolutely terrible at naming things.  I frequently name my branches
something that makes sense to me at the time without regard to whether
that will make sense in the future.  I don't want to memorialize my
momentary thoughtlessness in the history of the repository forever.

Second, one workflow I commonly use is creating a branch with many
commits and then breaking them down into small series that are logical
and easy for review.  If I have a branch called "test-fixes-part7" with
50 commits and then I decide to split that into two branches,
test-fixes-part7 and test-fixes-part8, by copying the branch and using
git reset --hard to truncate the old one, I don't want the old branch
name in my new branch.  A lot of Git workflows assume you can reset and
rename branches this way and having the branch name in the commit header
breaks those workflows.

Third, people reuse branch names.  Right now, I have eight branches with
test fixes all starting with "test-fixes-part" because I'm working on
one major project with all of those test fixes.  However, if a developer
working on another major project also has a lot of changes to the test
suite, they may have lots of identically named branches, which would be
confusing, since our identically named test fix branches would relate to
different projects.  (See my first point.)

However, despite the fact that we aren't likely to add this in the
commit header, there are definitely ways to achieve this.

If you want to include the branch name in the commit, you can do so with
a trailer.  git interpret-trailers can then be used to manipulate and
extract these, and along with a hook, add them automatically if they're
missing.

If you're working on a more centralized project and you want to require
the branch name in your commit trailers, you can set your CI system to
fail or reject commits that don't contain them.  This is the approach
that systems like Gerrit use when the required trailers are missing and
it seems to work reasonably well.

Hopefully these suggestions are helpful in getting you the traceability
you desire without requiring fundamental changes to the way Git works.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply

* Re: Feature request: add a metadata in the commit: the "commited in branch" information
From: Theodore Y. Ts'o @ 2019-12-30  4:15 UTC (permalink / raw)
  To: Arnaud Bertrand; +Cc: Junio C Hamano, git
In-Reply-To: <CAEW0o+g7vXj841h+4nNK8iSoO758Uh9fLKMCN87RE2w2Nd=CRg@mail.gmail.com>

On Mon, Dec 30, 2019 at 12:53:56AM +0100, Arnaud Bertrand wrote:
> Hi Junio,
> 
> It really depends how git is used. With big collaborative project
> (like git or linux kernel), you are totally right.
> for development limited to a company that has developments with team
> of 10-20 developers and that uses
> a correct SCM plan, the name of the branch is regulated and is
> meaningful, mostly  linked to a bug tracking system
> system. For audits and  traceability, the branch name is really
> important... certainly more than the email of the developer ;-)
> So the "contamination" is negligible compare to the bentefit in this context.
> It will also helps the graphical tools to have a comprehensive
> represeintation which can do git even better.

Why does it need to be the branch name?  You can add your own extra
metadata to the git description.  So for example, I might have a git
commit that looks like this:

    ext4: avoid declaring fs inconsistent due to invalid file handles

    If we receive a file handle, either from NFS or open_by_handle_at(2),
    and it points at an inode which has not been initialized, and the file
    system has metadata checksums enabled, we shouldn't try to get the
    inode, discover the checksum is invalid, and then declare the file
    system as being inconsistent.

    ... <details of repro omitted to keep this email short>

    Google-Bug-Id: 120690101
    Upstream-5.0-SHA1: 8a363970d1dc38c4ec4ad575c862f776f468d057
    Tested: used the repro to verify that open_by_handle_at(2)
       will not declare the fs inconsistent
    Effort: storage/ext4       
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Change-Id: Iafb6da7c360a4c34b882f7fd6a91e3bb

The tie-in to the bug tracking system is done via "Google-Bug-Id:".
The Effort tag is used to identify which subteam should be responsble
for rebasing the commit to a newer upstream kernel.  (E.g., how to
account for all of the patches made on top of 4.14.x when you are
rebasing to the newer 4.19 long-term-stable kernel, to make sure all
not-yet-usptreamed commits are carried over during the rebase
process.)

The Upstream-X.Y-SHA1: tag indicates that this is an upstream commit
that was backported to the internal kernel.  If the commit isn't an
upstream backport, we have a policy (which is enforced via an
automated bot when the commit goes through Gerritt for review) that
there must be an "Upstream-Plan: " tag indicating how the committer
plans to get the change upstream.

The automated review bot also enforces that a Tested: tag exists,
describing how the developer tested the change, and Change-Id: is used
to link the commit to Gerrit, which is how we enforce that all commits
have to be reviewed by a second engineer before it is allowed into the
production kernel sources.  We also maintain all of the Gerrit
comments as history and so we can have accountability as to who
reviewed a commit before it was submitted into the release repository.

We also have automated bots which will run checkpatch and note the
warnings from checkpatch as Gerrit commits; and if the kernel doesn't
build on a variety of architetures and configurations (e.g., debug,
installer, etc.) a bot can also report this and add -1 Gerrit review.

See?  You can do an awful lot without regulating and recording the
branch name used by the engineer.  We have full audit and traceability
through the Gerrit reviews, and we can use the Google-Bug-Id to track
which release versions of which kernels have which bugs fixed.

The bottom line is each company is going to have a different workflow
for doing reviews, linkage to bug tracking systems, auditability, etc.
If everybody were to demand their unique scheme was to be supported
directly in Git, it would be a mess.  The scheme that I've described
above needs no special git features.  It just uses some git hooks as a
convenience to to developers to help them fill in these required
fields, using Gerrit for commit review, and some bots which submit
reviews to Gerrit.

Cheers,

						- Ted

^ permalink raw reply

* Re: [RFC PATCH 0/1] *** Add branchname in commit header ***
From: Arnaud Bertrand @ 2019-12-30 10:33 UTC (permalink / raw)
  To: brian m. carlson, Arnaud Bertrand, git, Junio C Hamano,
	Arnaud Bertrand
In-Reply-To: <20191230023250.GC6570@camp.crustytoothpaste.net>

Hello Brian,

Le lun. 30 déc. 2019 à 03:33, brian m. carlson
<sandals@crustytoothpaste.net> a écrit :
>
> On 2019-12-29 at 22:26:32, Arnaud Bertrand wrote:
> > From: Arnaud Bertrand <xda@abalgo.com>
> >
> > For tracability purpose it is often necessary to know which
> > commit is envolved in a branch
> > Keeping track of the branchname in the commit header
> > will make this traceability easy and will facilitate
> > the graphical toolis that represent the branches and
> > that have today to use complex algorithm to try to
> > determine the branch of a commit that was known at
> > the commit time.
> >
> > no big change in the code, today rebase is not considered yet
> > I'm waiting feedback about that before touching
> > the rebase code.
>
> I encourage you to read back in the history of the list as to why we
> haven't done this and why it's not likely to be accepted now, but let me
> provide a few reasons of my own.
>
> First, as any contributor to the mailing list can tell you, I am
> absolutely terrible at naming things.  I frequently name my branches
> something that makes sense to me at the time without regard to whether
> that will make sense in the future.  I don't want to memorialize my
> momentary thoughtlessness in the history of the repository forever.
>

I think you got the point! Git is written by software people for
software people and we know that we don't like to keep track of our
mistakes... Git allows you to to work like this because you can have
your own local branches with the name you want, just use a git merge
--no-ff at the end and only the commits that make senses will be in
the repository forever.
For CMM compliance, the branch type and the branch name must be
described in a SCM plan. In big open source project, I understand it
is not the so important because, at the end, we will only integrate
one patch on the master trunk, it is not like this for most of the
projects in companies.

> Second, one workflow I commonly use is creating a branch with many
> commits and then breaking them down into small series that are logical
> and easy for review.  If I have a branch called "test-fixes-part7" with
> 50 commits and then I decide to split that into two branches,
> test-fixes-part7 and test-fixes-part8, by copying the branch and using
> git reset --hard to truncate the old one, I don't want the old branch
> name in my new branch.  A lot of Git workflows assume you can reset and
> rename branches this way and having the branch name in the commit header
> breaks those workflows.

I understand but it will not break the workflow. .. the fact to have
the branchname as information in the commit header could be no more
than an "additional information". Exactly as you name and email is in
the commit header too It is simply an additional information for those
who want it, who need it. If you don't want to use is, that's right,
this information is even not visible in a normal git log command.It is
only visible with a dedicated placeholder.

>
> Third, people reuse branch names.  Right now, I have eight branches with
> test fixes all starting with "test-fixes-part" because I'm working on
> one major project with all of those test fixes.  However, if a developer
> working on another major project also has a lot of changes to the test
> suite, they may have lots of identically named branches, which would be
> confusing, since our identically named test fix branches would relate to
> different projects.  (See my first point.)

If it is different projects... it will not be confusing. And again,
you have 2 situations:
1. People working with SCM plan and hard branch naming convention. In
this case, they will never get this kind of problem, duplicated name
are impossible "by construction"
2. People that are working with temporary branch will not use the
branchname in the header and having duplicate branchname will no be
confusing.

>
> However, despite the fact that we aren't likely to add this in the
> commit header, there are definitely ways to achieve this.
>

I agree with the exception of the fact that it will depend on the
thoroughness of the developers themselves. And it will be more risky
to develop new features that has to be confident in the commit
message.

> If you want to include the branch name in the commit, you can do so with
> a trailer.  git interpret-trailers can then be used to manipulate and
> extract these, and along with a hook, add them automatically if they're
> missing.

Unfortunately, hooks are (can be) personal and there no way I know to
force a hook to be used. That exactly what I was using today but a lot
of developer that start on the project, clone it but does not use the
correct hook. So, finally, It is impossible to be confident in the
commit message content to get the name of the branch.

Let me explain our workflow (which is a really common one):
Everyday, for backup reasons, all developers have to push their work
on the central repository (at least their major development branch,
not the experimental). In this context, the rebase is not used (should
even be forbidden) but we are using the "merge back" instead (merge
master to localdev) very often to guarantee the coherence between the
current master state and the development..
At the end of each "small development" (bug fix or feature request),
the branch is merged with no fast-forward.
So we have this kind of graph:

*   d122671 (HEAD -> master) master: Merge branch 'dev_feature1'
|\
| * 7a6a93d (dev_feature1) dev_feauture1: some new changes
| *   9e498de Merge branch 'master' into dev_feature1
| |\
| |/
|/|
* | 2fca855 master: feature x merged
| * 3610279 dev_feauture1: some other changes
| * ae69fa8 dev_feauture1: some changes
|/
* 8bfee18 master:current status

As you can see, to explain the flow I had to add the branchname on
each commit ;-)

So, each new version in the master branch is the result of merge
(exactly what you are doing when you integrate patch) but with
difference that we want to keep track of feature history and we want
to be confident in the tool, not in the rigor of our developers. And
we don't want to see the development commits as history of the master
version:
To be more clear, the master previous version is 2fca, not 7a6a nor 9e49.
And it is important to have that view to answer a lot of question
(why, when, what, who, which effort, how long, how many, ... )

>
> If you're working on a more centralized project and you want to require
> the branch name in your commit trailers, you can set your CI system to
> fail or reject commits that don't contain them.  This is the approach
> that systems like Gerrit use when the required trailers are missing and
> it seems to work reasonably well.

I don't know gerrit but I think it is an something that use git as
kernel but I don't know if developers have directly access to the git
repository. or has to use the web interface.
of course, with a upper layer, everything is still possible.

>
> Hopefully these suggestions are helpful in getting you the traceability
> you desire without requiring fundamental changes to the way Git works.

Again, what I propose is certainly not a fundamental change but just
an additional metadata to help those who need it.
I even propose to configure a variable to activate it (even if I
prefer it is activated by default to be sure it is done by new users).
I will publish the patch with the variable but I've seen a small bug
to fix before ;-)

Thanks for your feedback,

Arnaud

> --
> brian m. carlson: Houston, Texas, US
> OpenPGP: https://keybase.io/bk2204

^ permalink raw reply

* Re: Feature request: add a metadata in the commit: the "commited in branch" information
From: Arnaud Bertrand @ 2019-12-30 11:59 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Arnaud Bertrand, Junio C Hamano, git
In-Reply-To: <20191230041517.GA84036@mit.edu>

Le lun. 30 déc. 2019 à 05:15, Theodore Y. Ts'o <tytso@mit.edu> a écrit :
>
> On Mon, Dec 30, 2019 at 12:53:56AM +0100, Arnaud Bertrand wrote:
> > Hi Junio,
> >
> > It really depends how git is used. With big collaborative project
> > (like git or linux kernel), you are totally right.
> > for development limited to a company that has developments with team
> > of 10-20 developers and that uses
> > a correct SCM plan, the name of the branch is regulated and is
> > meaningful, mostly  linked to a bug tracking system
> > system. For audits and  traceability, the branch name is really
> > important... certainly more than the email of the developer ;-)
> > So the "contamination" is negligible compare to the bentefit in this context.
> > It will also helps the graphical tools to have a comprehensive
> > represeintation which can do git even better.
>
> Why does it need to be the branch name?  You can add your own extra
> metadata to the git description.

That's typically my problem.  It is not possible "by default", I mean
- It is only possible if the developer configure something
- or if there is an upper layer that guarantee this
By default, there is no hook embedded with the clone. So, as far as I
know (and I hope I'm wrong!), you have to use upper layer tools or to
change environment variables to activate this feature. Furthermore, it
will never be used by third party tool to beautify the branch
representation.

I think that the major problem is that git calls "branches" what it is
not. Git branches are, in reality, "dynamic tags". In other words,
when you are on this tag and you perform a commit, this dynamic tag
moves with your commit. So it is not really a branch as clearcase,
mercurial, svn or cvs defined it.


> So for example, I might have a git
> commit that looks like this:
>
>     ext4: avoid declaring fs inconsistent due to invalid file handles
>
>     If we receive a file handle, either from NFS or open_by_handle_at(2),
>     and it points at an inode which has not been initialized, and the file
>     system has metadata checksums enabled, we shouldn't try to get the
>     inode, discover the checksum is invalid, and then declare the file
>     system as being inconsistent.
>
>     ... <details of repro omitted to keep this email short>
>
>     Google-Bug-Id: 120690101
>     Upstream-5.0-SHA1: 8a363970d1dc38c4ec4ad575c862f776f468d057
>     Tested: used the repro to verify that open_by_handle_at(2)
>        will not declare the fs inconsistent
>     Effort: storage/ext4
>     Signed-off-by: Theodore Ts'o <tytso@mit.edu>
>     Change-Id: Iafb6da7c360a4c34b882f7fd6a91e3bb
>
> The tie-in to the bug tracking system is done via "Google-Bug-Id:".
> The Effort tag is used to identify which subteam should be responsble
> for rebasing the commit to a newer upstream kernel.  (E.g., how to
> account for all of the patches made on top of 4.14.x when you are
> rebasing to the newer 4.19 long-term-stable kernel, to make sure all
> not-yet-usptreamed commits are carried over during the rebase
> process.)
>
> The Upstream-X.Y-SHA1: tag indicates that this is an upstream commit
> that was backported to the internal kernel.  If the commit isn't an
> upstream backport, we have a policy (which is enforced via an
> automated bot when the commit goes through Gerritt for review) that
> there must be an "Upstream-Plan: " tag indicating how the committer
> plans to get the change upstream.
>
> The automated review bot also enforces that a Tested: tag exists,
> describing how the developer tested the change, and Change-Id: is used
> to link the commit to Gerrit, which is how we enforce that all commits
> have to be reviewed by a second engineer before it is allowed into the
> production kernel sources.  We also maintain all of the Gerrit
> comments as history and so we can have accountability as to who
> reviewed a commit before it was submitted into the release repository.
>
> We also have automated bots which will run checkpatch and note the
> warnings from checkpatch as Gerrit commits; and if the kernel doesn't
> build on a variety of architetures and configurations (e.g., debug,
> installer, etc.) a bot can also report this and add -1 Gerrit review.
>
> See?  You can do an awful lot without regulating and recording the
> branch name used by the engineer.  We have full audit and traceability
> through the Gerrit reviews, and we can use the Google-Bug-Id to track
> which release versions of which kernels have which bugs fixed.
>

You have convinced me that Gerrit is a very nice tool that complete git  ;-)
However, one of the main point of git is that it is easy to setup
(once the tool is installed, it is one second to setup a new
repository and track files)
So, I certainly don't want to reduce this strong point of git!

> The bottom line is each company is going to have a different workflow
> for doing reviews, linkage to bug tracking systems, auditability, etc.
> If everybody were to demand their unique scheme was to be supported
> directly in Git, it would be a mess.

Include the name of the branch is not harmless or fanciful, it is
something important in most of the workflows.
For example:
If you check this article:
https://nvie.com/posts/a-successful-git-branching-model/
The branchname is fundamental and the pictures he shows in its article
will never be achieved without keeping track of the branchname.
Git lightened the notion of branch and therefore its use, which is a
good thing but on the other hand, why forget the branch history? And
certainly if it is only a light metadata in the commit ?
Don't you agree that the branchname where modifications were done
could give really precious information ?
Don't you agree that a representation like it is in the article above
is more clear than the standard git representation ?

I think that add the branchname as "weak" metadata, invisible except
with a dedicated request like specific placeholder in git log could be
a real big added value compared to the inconvenient of its absence.

Cheers,

Arnaud

> The scheme that I've described
> above needs no special git features.  It just uses some git hooks as a
> convenience to to developers to help them fill in these required
> fields, using Gerrit for commit review, and some bots which submit
> reviews to Gerrit.
>
> Cheers,
>
>                                                 - Ted

^ permalink raw reply

* Re: [PATCH v2 2/2] sparse-checkout: document interactions with submodules
From: Derrick Stolee @ 2019-12-30 13:11 UTC (permalink / raw)
  To: Eric Sunshine, Derrick Stolee via GitGitGadget
  Cc: Git List, SZEDER Gábor, Elijah Newren, Jon Simons,
	Derrick Stolee, Junio C Hamano
In-Reply-To: <CAPig+cTvFW_84TqKsOPjjBM37cX1OL9uX15APcEYpuKVjvM+dg@mail.gmail.com>

On 12/27/2019 3:20 PM, Eric Sunshine wrote:
> On Fri, Dec 27, 2019 at 1:48 PM Derrick Stolee via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
>> diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
>> @@ -340,4 +340,32 @@ test_expect_success 'cone mode: set with core.ignoreCase=true' '
>> +test_expect_success 'interaction with submodules' '
>> +       ...
>> +       cat >expect <<-EOF &&
>> +               a
>> +               folder1
>> +               modules
>> +       EOF
> 
> You would normally use \-EOF rather than -EOF to make it clear that no
> interpolation is needed/expected within the here-doc body. However,
> this script is already full of -EOF when \-EOF ought to be used, so
> being consistent with existing tests may override an objection.
> 
> Likewise, please note for future reference that the usual way
> here-docs are formatted in Git test scripts is to indent the body of
> the here-doc to the same level as the command which opens it. That is:

Thanks for pointing out the difference, except...
 
>     cat >expect <<\-EOF &&

This should be <<-\EOF

>     a
>     folder1
>     modules
>     EOF
> 
> But, again, this script is already full of these malformatted
> here-docs, so maintaining consistency with the existing test in the
> script is probably okay.

Hm. Having these lines have the same tabbing hurts my eyes (it is
harder to see where the contents end, much like if we didn't tab
inside a subshell or an if block).

This is also a place where we are inconsistent, and it's not just
my fault for writing the test script in my own style. Here are a
few scripts that tab the same way as here:

	t0008-ignore.sh
	t4124-apply-ws-rule.sh
	t9400-diff-highlight.sh

These are definitely the minority. I just mention them so anyone
who does a cleanup of this whitespace inconsistency takes the time
to look for all examples.

For now, I'll fix the here-doc interpolation issue for this test,
but keep the whitespace matching the rest of the test script. I'll
add this concern to my next series.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 1/3] commit-graph: examine changed-path objects in pack order
From: Derrick Stolee @ 2019-12-30 14:37 UTC (permalink / raw)
  To: Jeff King
  Cc: Garima Singh via GitGitGadget, git, szeder.dev, jonathantanmy,
	jeffhost, me, Junio C Hamano
In-Reply-To: <20191229061246.GB220034@coredump.intra.peff.net>

On 12/29/2019 1:12 AM, Jeff King wrote:
> On Fri, Dec 27, 2019 at 09:51:02AM -0500, Derrick Stolee wrote:
> 
>> On 12/22/2019 4:32 AM, Jeff King wrote:
>>> Looking at the diff of commit objects in pack order is much faster than
>>> in sha1 order, as it gives locality to the access of tree deltas
>>> (whereas sha1 order is effectively random). Unfortunately the
>>> commit-graph code sorts the commits (several times, sometimes as an oid
>>> and sometimes a pointer-to-commit), and we ultimately traverse in sha1
>>> order.
>>>
>>> Instead, let's remember the position at which we see each commit, and
>>> traverse in that order when looking at bloom filters. This drops my time
>>> for "git commit-graph write --changed-paths" in linux.git from ~4
>>> minutes to ~1.5 minutes.
>>
>> I'm doing my own perf tests on these patches, and my copy of linux.git
>> has four packs of varying sizes (corresponding with my rare fetches and
>> lack of repacks). My time goes from 3m50s to 3m00s. I was confused at
>> first, but then realized that I used the "--reachable" flag. In that
>> case, we never run set_commit_pos(), so all positions are equal and the
>> sort is not helpful.
>>
>> I thought that inserting some set_commit_pos() calls into close_reachable()
>> and add_missing_parents() would give some amount of time-order to the
>> commits as we compute the filters. However, the time did not change at
>> all.
>>
>> I've included the patch below for reference, anyway.
> 
> Yeah, I expected that would cover it, too. But instrumenting it to dump
> the position of each commit (see patch below), and then decorating "git
> log" output with the positions (see script below) shows that we're all
> over the map:
> 
>   *   3
>   |\  
>   | * 2791
>   | * 5476
>   | * 8520
>   | * 12040
>   | * 16036
>   * |   2790
>   |\ \  
>   | * | 5475
>   | * | 8519
>   | * | 12039
>   | * | 16035
>   | * | 20517
>   | * | 25527
>   | |/  
>   * |   5474
>   |\ \  
>   | * | 8518
>   | * | 12038
>   * | |   8517
>   [...]

This makes a lot of sense why the previous approach did not work. Thanks!

> I think the root issue is that we never do any date-sorting on the
> commits. So:
> 
>   - we hit each ref tip in lexical order; with tags, this is quite often
>     the opposite of reverse-chronological
> 
>   - we traverse breadth-first, but we don't order queue at all. So if we
>     see a merge X, then we'll next process X^1 and X^2, and then X^1^,
>     and then X^2^, and so forth. So we keep digging equally down
>     simultaneous branches, even if one branch is way shorter than the
>     other. Whereas a regular Git traversal will order the queue by
>     commit timestamp, so it tends to be roughly chronological (of course
>     a topo-sort would work too, but that's probably overkill).
> 
> I wonder if this would be simpler if "commit-graph --reachable" just
> used the regular revision machinery instead of doing its own custom
> traversal.

Instead, why not use our already-computed generation numbers? That seems
to improve the time a bit. (6m30s to 4m50s)

-->8--

From: Derrick Stolee <dstolee@microsoft.com>
Date: Fri, 27 Dec 2019 09:47:49 -0500
Subject: [PATCH] commit-graph: examine commits by generation number

When running 'git commit-graph write --changed-paths', we sort the
commits by pack-order to save time when computing the changed-paths
bloom filters. This does not help when finding the commits via the
--reachable flag.

If not using pack-order, then sort by generation number before
examining the diff. Commits with similar generation are more likely
to have many trees in common, making the diff faster.

On the Linux kernel repository, this change reduced the computation
time for 'git commit-graph write --reachable --changed-paths' from
6m30s to 4m50s.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 commit-graph.c | 33 ++++++++++++++++++++++++++++++---
 1 file changed, 30 insertions(+), 3 deletions(-)

diff --git a/commit-graph.c b/commit-graph.c
index bf6c663772..fe4ab545f2 100644
--- a/commit-graph.c
+++ b/commit-graph.c
@@ -72,6 +72,25 @@ static int commit_pos_cmp(const void *va, const void *vb)
 	       commit_pos_at(&commit_pos, b);
 }
 
+static int commit_gen_cmp(const void *va, const void *vb)
+{
+	const struct commit *a = *(const struct commit **)va;
+	const struct commit *b = *(const struct commit **)vb;
+
+	/* lower generation commits first */
+	if (a->generation < b->generation)
+		return -1;
+	else if (a->generation > b->generation)
+		return 1;
+
+	/* use date as a heuristic when generations are equal */
+	if (a->date < b->date)
+		return -1;
+	else if (a->date > b->date)
+		return 1;
+	return 0;
+}
+
 char *get_commit_graph_filename(const char *obj_dir)
 {
 	char *filename = xstrfmt("%s/info/commit-graph", obj_dir);
@@ -849,7 +868,8 @@ struct write_commit_graph_context {
 		 report_progress:1,
 		 split:1,
 		 check_oids:1,
-		 bloom:1;
+		 bloom:1,
+		 order_by_pack:1;
 
 	const struct split_commit_graph_opts *split_opts;
 	uint32_t total_bloom_filter_size;
@@ -1245,7 +1265,11 @@ static void compute_bloom_filters(struct write_commit_graph_context *ctx)
 
 	ALLOC_ARRAY(sorted_by_pos, ctx->commits.nr);
 	COPY_ARRAY(sorted_by_pos, ctx->commits.list, ctx->commits.nr);
-	QSORT(sorted_by_pos, ctx->commits.nr, commit_pos_cmp);
+
+	if (ctx->order_by_pack)
+		QSORT(sorted_by_pos, ctx->commits.nr, commit_pos_cmp);
+	else
+		QSORT(sorted_by_pos, ctx->commits.nr, commit_gen_cmp);
 
 	for (i = 0; i < ctx->commits.nr; i++) {
 		struct commit *c = sorted_by_pos[i];
@@ -1979,6 +2003,7 @@ int write_commit_graph(const char *obj_dir,
 	}
 
 	if (pack_indexes) {
+		ctx->order_by_pack = 1;
 		if ((res = fill_oids_from_packs(ctx, pack_indexes)))
 			goto cleanup;
 	}
@@ -1988,8 +2013,10 @@ int write_commit_graph(const char *obj_dir,
 			goto cleanup;
 	}
 
-	if (!pack_indexes && !commit_hex)
+	if (!pack_indexes && !commit_hex) {
+		ctx->order_by_pack = 1;
 		fill_oids_from_all_packs(ctx);
+	}
 
 	close_reachable(ctx);
 
-- 
2.25.0.rc0




^ permalink raw reply related

* Re: git filters don't get applied to dotfiles
From: Adrien LEMAIRE @ 2019-12-30 14:42 UTC (permalink / raw)
  To: Dennis Kaarsemaker; +Cc: git
In-Reply-To: <ea322b4c06dce0332ead3521e45514d10f2a76b8.camel@kaarsemaker.net>

Hi Dennis, and thanks for looking into this.

I cannot reproduce this issue anymore, and it works as expected:

$ GIT_TRACE=2 config add .mailrc
23:31:05.135580 git.c:439               trace: built-in: git add .mailrc
23:31:05.135902 run-command.c:663       trace: run_command: 'sed -e
'\''s/gmail.com:.*@smtp/gmail.com:PASSWORD@smtp/'\'''

$ config diff --cached
diff --git a/.mailrc b/.mailrc
new file mode 100644
index 0000000..2698128
--- /dev/null
+++ b/.mailrc
@@ -0,0 +1,4 @@
+account gmail {
+  set v15-compat
+  set mta=smtp://lemaire.adrien%40gmail.com:PASSWORD@smtp.gmail.com:587
smtp-use-starttls
+}

To answer your question, yes I first added the file without a filter.
But I'm pretty sure I did a `config restore --staged .mailrc` after
creating the filter (and I actually repeated the operation several
times before contacting you the other day), but I must have been wrong
about that.
I didn't know about the GIT_TRACE environment variable. Thank you for
teaching me something, and sorry about the false bug report.

Best regards
Adrien

On Mon, Dec 30, 2019 at 1:02 AM Dennis Kaarsemaker
<dennis@kaarsemaker.net> wrote:
>
> On Fri, 2019-12-27 at 16:51 +0900, Adrien LEMAIRE wrote:
> > I'd like to report a bug regarding git filters not being applied to
> > files beginning with a dot character "."
> > Using git version 2.24.1
> > Please let me know if there is a better way to report bugs. The github
> > page only mentions this email.
>
> <snip reproduction recipe>
>
> I was not able to reproduce this in the git test suite with a quick
> patch (see below). Your output does not show any git add command, is it
> possible that you added the changes before configuring the filter?
>
> If you set GIT_TRACE=2 in your environment before doing the git add of
> the .mailrc file, you should see it run the filter command. It should
> look something like:
>
> + git add test test.t test.i .mailrc
> trace: built-in: git add test test.t test.i .mailrc
> trace: run_command: ./rot13.sh
> trace: run_command: ./rot13.sh
>
> (which is a part of the output of GIT_TRACE=2 ./t0021-conversion.sh -x
> -v -i)
>
>
> diff --git t/t0021-conversion.sh t/t0021-conversion.sh
> index 6c6d77b51a..32c27d513b 100755
> --- t/t0021-conversion.sh
> +++ t/t0021-conversion.sh
> @@ -77,6 +77,7 @@ test_expect_success setup '
>
>         {
>             echo "*.t filter=rot13"
> +           echo ".mailrc filter=rot13"
>             echo "*.i ident"
>         } >.gitattributes &&
>
> @@ -88,9 +89,10 @@ test_expect_success setup '
>         cat test >test.t &&
>         cat test >test.o &&
>         cat test >test.i &&
> -       git add test test.t test.i &&
> +       cat test >.mailrc &&
> +       git add test test.t test.i .mailrc &&
>         rm -f test test.t test.i &&
> -       git checkout -- test test.t test.i &&
> +       git checkout -- test test.t test.i .mailrc &&
>
>         echo "content-test2" >test2.o &&
>         echo "content-test3 - filename with special characters" >"test3 '\''sq'\'',\$x=.o"
> @@ -102,6 +104,7 @@ test_expect_success check '
>
>         test_cmp test.o test &&
>         test_cmp test.o test.t &&
> +       test_cmp test.o .mailrc &&
>
>         # ident should be stripped in the repository
>         git diff --raw --exit-code :test :test.i &&
> @@ -110,9 +113,12 @@ test_expect_success check '
>         test "z$id" = "z$embedded" &&
>
>         git cat-file blob :test.t >test.r &&
> +       git cat-file blob :.mailrc >.mailrc.r &&
>
>         ./rot13.sh <test.o >test.t &&
> -       test_cmp test.r test.t
> +       ./rot13.sh <test.o >.mailrc &&
> +       test_cmp test.r test.t &&
> +       test_cmp .mailrc.r .mailrc
>  '
>
>  # If an expanded ident ever gets into the repository, we want to make sure that
>

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox