* [PATCH 1/7] diff: export diffstat interface
From: Daniel Ferreira via GitGitGadget @ 2018-12-20 12:09 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Daniel Ferreira
In-Reply-To: <pull.103.git.gitgitgadget@gmail.com>
From: Daniel Ferreira <bnmvco@gmail.com>
Make the diffstat interface (namely, the diffstat_t struct and
compute_diffstat) no longer be internal to diff.c and allow it to be used
by other parts of git.
This is helpful for code that may want to easily extract information
from files using the diff machinery, while flushing it differently from
how the show_* functions used by diff_flush() do it. One example is the
builtin implementation of git-add--interactive's status.
Signed-off-by: Daniel Ferreira <bnmvco@gmail.com>
Signed-off-by: Slavica Djukic <slawica92@hotmail.com>
---
diff.c | 36 ++++++++++++++----------------------
diff.h | 18 ++++++++++++++++++
2 files changed, 32 insertions(+), 22 deletions(-)
diff --git a/diff.c b/diff.c
index dc9965e836..46a7d8cf29 100644
--- a/diff.c
+++ b/diff.c
@@ -2421,22 +2421,6 @@ static void pprint_rename(struct strbuf *name, const char *a, const char *b)
}
}
-struct diffstat_t {
- int nr;
- int alloc;
- struct diffstat_file {
- char *from_name;
- char *name;
- char *print_name;
- const char *comments;
- unsigned is_unmerged:1;
- unsigned is_binary:1;
- unsigned is_renamed:1;
- unsigned is_interesting:1;
- uintmax_t added, deleted;
- } **files;
-};
-
static struct diffstat_file *diffstat_add(struct diffstat_t *diffstat,
const char *name_a,
const char *name_b)
@@ -5922,12 +5906,7 @@ void diff_flush(struct diff_options *options)
dirstat_by_line) {
struct diffstat_t diffstat;
- memset(&diffstat, 0, sizeof(struct diffstat_t));
- for (i = 0; i < q->nr; i++) {
- struct diff_filepair *p = q->queue[i];
- if (check_pair_status(p))
- diff_flush_stat(p, options, &diffstat);
- }
+ compute_diffstat(options, &diffstat);
if (output_format & DIFF_FORMAT_NUMSTAT)
show_numstat(&diffstat, options);
if (output_format & DIFF_FORMAT_DIFFSTAT)
@@ -6227,6 +6206,19 @@ static int is_submodule_ignored(const char *path, struct diff_options *options)
return ignored;
}
+void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat)
+{
+ int i;
+ struct diff_queue_struct *q = &diff_queued_diff;
+
+ memset(diffstat, 0, sizeof(struct diffstat_t));
+ for (i = 0; i < q->nr; i++) {
+ struct diff_filepair *p = q->queue[i];
+ if (check_pair_status(p))
+ diff_flush_stat(p, options, diffstat);
+ }
+}
+
void diff_addremove(struct diff_options *options,
int addremove, unsigned mode,
const struct object_id *oid,
diff --git a/diff.h b/diff.h
index ce5e8a8183..7809db3039 100644
--- a/diff.h
+++ b/diff.h
@@ -239,6 +239,22 @@ void diff_emit_submodule_error(struct diff_options *o, const char *err);
void diff_emit_submodule_pipethrough(struct diff_options *o,
const char *line, int len);
+struct diffstat_t {
+ int nr;
+ int alloc;
+ struct diffstat_file {
+ char *from_name;
+ char *name;
+ char *print_name;
+ const char *comments;
+ unsigned is_unmerged:1;
+ unsigned is_binary:1;
+ unsigned is_renamed:1;
+ unsigned is_interesting:1;
+ uintmax_t added, deleted;
+ } **files;
+};
+
enum color_diff {
DIFF_RESET = 0,
DIFF_CONTEXT = 1,
@@ -327,6 +343,8 @@ void diff_change(struct diff_options *,
struct diff_filepair *diff_unmerge(struct diff_options *, const char *path);
+void compute_diffstat(struct diff_options *options, struct diffstat_t *diffstat);
+
#define DIFF_SETUP_REVERSE 1
#define DIFF_SETUP_USE_SIZE_CACHE 4
--
gitgitgadget
^ permalink raw reply related
* Re: Periodic hang during git index-pack
From: Sitsofe Wheeler @ 2018-12-20 10:03 UTC (permalink / raw)
To: Jeff King; +Cc: git
In-Reply-To: <20181219232200.GB21283@sigill.intra.peff.net>
On Wed, 19 Dec 2018 at 23:22, Jeff King <peff@peff.net> wrote:
>
> On Wed, Dec 19, 2018 at 10:59:30PM +0000, Sitsofe Wheeler wrote:
>
> > While using trying to use git to clone a remote repository git
> > index-pack occasionally goes on to hang:
> > [...]
> > Looking at where it is stuck, git is doing read of a pipe:
> >
> > #0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0,
> > buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096)
> > at ../sysdeps/unix/sysv/linux/read.c:27
> > #1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40
> > <input_buffer>, __fd=0)
> > at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
>
> Index-pack is reading the pack on stdin, so it's expecting more bytes.
> Those bytes should be coming from the git-clone process, which is
> relaying the bytes from the other side.
>
> Check the backtrace of git-clone to see why it isn't feeding more data
> (but note that it will generally have two threads -- one processing the
> data from the remote, and one wait()ing for index-pack to finish).
>
> My guess, though, is that you'll find that git-clone is simply waiting
> on another pipe: the one from ssh.
Ok here are backtraces from another run (with SSH multiplexing disabled):
(gdb) thread apply all bt
Thread 2 (Thread 0x7faafbf1c700 (LWP 36586)):
#0 0x00007faafc805384 in __libc_read (fd=fd@entry=5,
buf=buf@entry=0x7faafbf0ddec, nbytes=nbytes@entry=5)
at ../sysdeps/unix/sysv/linux/read.c:27
#1 0x000055c8ca2f5b23 in read (__nbytes=5, __buf=0x7faafbf0ddec, __fd=5)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2 xread (fd=fd@entry=5, buf=buf@entry=0x7faafbf0ddec, len=len@entry=5)
at wrapper.c:260
#3 0x000055c8ca2f5cdb in read_in_full (fd=5, buf=buf@entry=0x7faafbf0bdf0,
count=5, count@entry=8193) at wrapper.c:318
#4 0x000055c8ca27222b in get_packet_data (fd=fd@entry=5,
src_buf=src_buf@entry=0x0, src_size=src_size@entry=0x0,
dst=dst@entry=0x7faafbf0bdf0, size=size@entry=8193,
options=options@entry=0) at pkt-line.c:289
#5 0x000055c8ca272ed8 in packet_read_with_status (fd=fd@entry=5,
src_buffer=src_buffer@entry=0x0, src_len=src_len@entry=0x0,
buffer=buffer@entry=0x7faafbf0bdf0
"\001\344\305\066JJj\341q@\243\225ժ\350\026M\fkM9:-ƀ\253\206\336\001\275\070\325\372\250\204\232aM\221\213(\320B%\a\275\233\261g\321A\245\n
\247\374\326\b'\v\252\277rA\211\312l\212j\352\177\260\317j\aT\252&t2\256\254\360\002\217V\024\061k\201ڲ;;\017`\361\020:*b\n5\222\036i\272\067}\360,\323\345Y\314ir\311\034b\232F\267\364\016]",
size=size@entry=65520,
pktlen=pktlen@entry=0x7faafbf0bd94, options=0) at pkt-line.c:344
#6 0x000055c8ca273078 in packet_read (fd=fd@entry=5,
src_buffer=src_buffer@entry=0x0, src_len=src_len@entry=0x0,
buffer=buffer@entry=0x7faafbf0bdf0
"\001\344\305\066JJj\341q@\243\225ժ\350\026M\fkM9:-ƀ\253\206\336\001\275\070\325\372\250\204\232aM\221\213(\320B%\a\275\233\261g\321A\245\n
\247\374\326\b'\v\252\277rA\211\312l\212j\352\177\260\317j\aT\252&t2\256\254\360\002\217V\024\061k\201ڲ;;\017`\361\020:*b\n5\222\036i\272\067}\360,\323\345Y\314ir\311\034b\232F\267\364\016]",
size=size@entry=65520,
options=options@entry=0) at pkt-line.c:364
#7 0x000055c8ca2cbf73 in recv_sideband (
me=me@entry=0x55c8ca3466ed "fetch-pack", in_stream=5, out=out@entry=6)
at sideband.c:143
#8 0x000055c8ca22d6cb in sideband_demux (in=<optimised out>, out=6,
data=<optimised out>) at fetch-pack.c:776
#9 0x000055c8ca2a8458 in run_thread (data=0x7ffda489ef90)
at run-command.c:1032
#10 0x00007faafc7fb6db in start_thread (arg=0x7faafbf1c700)
at pthread_create.c:463
#11 0x00007faafc31c88f in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Thread 1 (Thread 0x7faafce3bb80 (LWP 36584)):
#0 0x00007faafc805384 in __libc_read (fd=fd@entry=7,
buf=buf@entry=0x7ffda489ef10, nbytes=nbytes@entry=46)
at ../sysdeps/unix/sysv/linux/read.c:27
#1 0x000055c8ca2f5b23 in read (__nbytes=46, __buf=0x7ffda489ef10, __fd=7)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2 xread (fd=fd@entry=7, buf=buf@entry=0x7ffda489ef10, len=len@entry=46)
at wrapper.c:260
#3 0x000055c8ca2f5cdb in read_in_full (fd=7, buf=buf@entry=0x7ffda489ef10,
count=count@entry=46) at wrapper.c:318
#4 0x000055c8ca26a54f in index_pack_lockfile (ip_out=<optimised out>)
at pack-write.c:297
#5 0x000055c8ca22e18c in get_pack (args=args@entry=0x7ffda489f2e0,
xd=xd@entry=0x55c8cb4966f8,
pack_lockfile=pack_lockfile@entry=0x55c8cb496660) at fetch-pack.c:881
#6 0x000055c8ca22fbfb in do_fetch_pack (pack_lockfile=<optimised out>,
si=0x7ffda489f1f0, nr_sought=<optimised out>, sought=<optimised out>,
orig_ref=<optimised out>, fd=<optimised out>, args=0x7ffda489f2e0)
at fetch-pack.c:1019
#7 fetch_pack (args=args@entry=0x7ffda489f2e0, fd=<optimised out>,
conn=<optimised out>, ref=<optimised out>,
dest=dest@entry=0x55c8cb498820
"remotehost:diffusion/LIBEDIT/libedit",
sought=sought@entry=0x55c8cb498580,
nr_sought=<optimised out>, shallow=<optimised out>,
pack_lockfile=<optimised out>, version=<optimised out>)
at fetch-pack.c:1649
#8 0x000055c8ca2dddf8 in fetch_refs_via_pack (transport=0x55c8cb496620,
nr_heads=2, to_fetch=0x55c8cb498580) at transport.c:365
#9 0x000055c8ca2df246 in transport_fetch_refs (
transport=transport@entry=0x55c8cb496620, refs=refs@entry=0x55c8cb4959e0)
at transport.c:1295
#10 0x000055c8ca1529ab in cmd_clone (argc=<optimised out>,
argv=<optimised out>, prefix=<optimised out>) at builtin/clone.c:1212
#11 0x000055c8ca1374e1 in run_builtin (argv=<optimised out>,
argc=<optimised out>, p=<optimised out>) at git.c:421
#12 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647
#13 0x000055c8ca138515 in run_argv (argv=0x7ffda489fa00, argcp=0x7ffda489fa0c)
at git.c:701
#14 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798
#15 0x000055c8ca13718f in main (argc=4, argv=0x7ffda489fc78)
at common-main.c:45
(gdb) thread apply all bt
Thread 1 (Thread 0x7ff8a03dab80 (LWP 36587)):
#0 0x00007ff89fda434e in __libc_read (fd=fd@entry=0,
buf=buf@entry=0x5604bea43d40 <input_buffer>, nbytes=nbytes@entry=4096)
at ../sysdeps/unix/sysv/linux/read.c:27
#1 0x00005604be77bb23 in read (__nbytes=4096,
__buf=0x5604bea43d40 <input_buffer>, __fd=0)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2 xread (fd=0, buf=0x5604bea43d40 <input_buffer>, len=4096) at wrapper.c:260
#3 0x00005604be5fb069 in fill (min=min@entry=1) at builtin/index-pack.c:255
#4 0x00005604be5fb23a in unpack_entry_data (offset=528312,
size=size@entry=19448, type=<optimised out>, oid=oid@entry=0x5604c0605ea0)
at builtin/index-pack.c:445
#5 0x00005604be5fbbc7 in unpack_raw_entry (oid=0x5604c0605ea0,
ref_oid=0x7ffc74cd73f0, ofs_offset=0x5604c0603580, obj=0x5604c0605ea0)
at builtin/index-pack.c:526
#6 parse_pack_objects (hash=0x7ffc74cd74a0 "\001")
at builtin/index-pack.c:1113
#7 cmd_index_pack (argc=<optimised out>, argv=<optimised out>,
prefix=<optimised out>) at builtin/index-pack.c:1775
#8 0x00005604be5bd4e1 in run_builtin (argv=<optimised out>,
argc=<optimised out>, p=<optimised out>) at git.c:421
#9 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647
#10 0x00005604be5be515 in run_argv (argv=0x7ffc74cd7640, argcp=0x7ffc74cd764c)
at git.c:701
#11 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798
#12 0x00005604be5bd18f in main (argc=7, argv=0x7ffc74cd78b8)
at common-main.c:45
As before, any hints on how to debug this gratefully accepted!
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply
* Re: [PATCH 0/8] introduce no-overlay and cached mode in git checkout
From: Thomas Gummerer @ 2018-12-20 8:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Nguyễn Thái Ngọc Duy, Elijah Newren
In-Reply-To: <xmqqa7le2gc2.fsf@gitster-ct.c.googlers.com>
On 12/10, Junio C Hamano wrote:
> Thomas Gummerer <t.gummerer@gmail.com> writes:
>
> > Basically the idea is to also delete files when the match <pathspec>
> > in 'git checkout <tree-ish> -- <pathspec>' in the current tree, but
> > don't match <pathspec> in <tree-ish>.
>
> I cannot quite parse it, but perhaps.
>
> "git checkout --no-overlay <tree-ish> -- <pathspec>" can
> remove paths in the index and in the working tree that match
> <pathspec>, if they do not appear in <tree-ish>.
>
> If a new file D/F is in the index and in the working tree but not in
> HEAD, "git checkout HEAD D/" or "git checkout HEAD D/F" would not
> remove D/F from the index or the working tree.
>
> With the --no-overlay option, it would, and that is often closer to
> the wish of the user who wanted to say "restore the working tree
> state of D/ (or D/F) from the state recorded in HEAD".
Yeah thanks, that reads much better.
> > The final step in the series is to actually make use of this in 'git
> > stash', which simplifies the code there a bit. I am however happy to
> > hold off on this step until the stash-in-C series is merged, so we
> > don't delay that further.
>
> I think that is probably a good idea, for now.
>
> > In addition to the no-overlay mode, we also add a --cached mode, which
> > works only on the index, thus similar to 'git reset <tree-ish> -- <pathspec>'.
> >
> > Actually deprecating 'git reset <tree-ish> -- <pathspec>' should come
> > later,...
>
> Or we may not even need to deprecate it. IIRC, what "stash" wished
> to exist was "git reset --hard <tree-ish> -- <pathspec>", which, if
> the command followed "--cached/--index" convention, would have been
> called "git reset --index ...". Did we actually have the need for
> "--cached" mode?
I don't have a pressing need for "--cached". I mainly included it
because you described "git reset <tree-ish> -- <pathspec>" as bad UI
in the original thread [*1*], which after reading that message I agree
with. It also seemed to cause some confusion in [*2*]. Since it was
fairly easy to introduce a "--cached" mode it seemed like a potential
UI improvement in the long term to deprecate 'git reset <tree-ish> --
<pathspec>'.
That said, this series is probably not the right place to introduce
this feature, as it should mainly be focused on the no-overlay mode.
I'll drop the patch from v2.
We can revisit whether we want to introduce a "--cached" mode in 'git
checkout' at some point in the future.
> > probably not before Duy's restore-files command lands, as 'git
> > checkout --no-overlay <tree-ish> -- <pathspec>' is a bit cumbersome to
> > type compared to 'git reset <tree-ish> -- <pathspec>'.
>
> Yes, between "checkout --cached" and "checkout --no-overlay", the
> latter is much more important, as the latter is what a missing "git
> reset --hard <tree-ish> -- <pathspec>" would have been, but the
> former can be written with an existing command.
>
> > My hope is also that the no-overlay mode could become the new default
> > in the restore-files command Duy is currently working on.
>
> Yup, that is my hope, too ;-).
*1*: <xmqq4loqplou.fsf@gitster.mtv.corp.google.com>
*2*: <alpine.LFD.2.21.1812081103500.29142@localhost.localdomain>
^ permalink raw reply
* A bug in git-add with GIT_DIR?
From: Orgad Shaneh @ 2018-12-20 8:37 UTC (permalink / raw)
To: git
Hi,
I played around with t5403-post-checkout-hook, and noticed that its
state is not exactly what I'd expect it to be.
The test setup is:
echo Data for commit0. >a &&
echo Data for commit0. >b &&
git update-index --add a &&
git update-index --add b &&
tree0=$(git write-tree) &&
commit0=$(echo setup | git commit-tree $tree0) &&
git update-ref refs/heads/master $commit0 &&
git clone ./. clone1 &&
git clone ./. clone2 &&
GIT_DIR=clone2/.git git branch new2 &&
echo Data for commit1. >clone2/b &&
GIT_DIR=clone2/.git git add clone2/b &&
GIT_DIR=clone2/.git git commit -m new2
Now, the line before the last one executes git add clone2/b with GIT_DIR set.
I'd expect that to add b inside clone2, but instead it adds an
inexistent clone2/clone2/b, and if I stop at this line, then the
status shows:
On branch master
Your branch is up to date with 'origin/master'.
Changes to be committed:
(use "git reset HEAD <file>..." to unstage)
new file: clone2/b
Changes not staged for commit:
(use "git add/rm <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)
modified: b
deleted: clone2/b
Is this the intended behavior? It looks like that's not what the test
meant to do anyway...
And if I change it to (cd clone2 && git add b), then the commits look
reasonable, but step 6 fails.
- Orgad
^ permalink raw reply
* Re: [PATCH v3 1/3] ref-filter: add worktreepath atom
From: Nickolai Belakovski @ 2018-12-20 7:09 UTC (permalink / raw)
To: Jeff King
Cc: git, Rafael Ascensão, Junio C Hamano,
Ævar Arnfjörð Bjarmason
In-Reply-To: <20181218172236.GA28455@sigill.intra.peff.net>
On Tue, Dec 18, 2018 at 9:22 AM Jeff King <peff@peff.net> wrote:
>
> On Sun, Dec 16, 2018 at 01:57:57PM -0800, nbelakovski@gmail.com wrote:
>
> > From: Nickolai Belakovski <nbelakovski@gmail.com>
> >
> > Add an atom proving the path of the linked worktree where this ref is
> > checked out, if it is checked out in any linked worktrees, and empty
> > string otherwise.
>
> I stumbled over the word "proving" here. Maybe "showing" would be more
> clear?
Oops, providing
> > +worktreepath::
> > + The absolute path to the worktree in which the ref is checked
> > + out, if it is checked out in any linked worktree. ' ' otherwise.
> > +
>
> Also, why are we replacing it with a single space? Wouldn't the empty
> string be more customary (and work with the other "if empty, then do
> this" formatting options)?
I was just following what was done for HEAD, but overall I agree that
empty is preferable to single space, will change.
> Minor style nit: we put the "*" in a pointer declaration next to the
> variable name, without intervening whitespace. Like:
>
> static struct worktree **worktrees;
Gotcha, will do, thanks for pointing it out.
To sum up the hashmap comments:
-I hadn't thought to re-use the head_ref of worktree as the key.
That's clever. I like the readability of having separate entries for
key and value, but I can see the benefit of not having to do an extra
allocation. I can make up for the readability hit with a comment.
-Actually, for any valid use case there will only be one instance of
the map since the entries of used_atom are cached, but regardless it
makes sense to keep per-atom info in used_atom and global context
somewhere else, so I'll make that change to make it a static variable
outside of used_atom.
-Will change the lookup logic to remove the extra allocation. Since
I'm letting the hashmap use its internal comparison function on the
hash, I don't need to provide a comparison function.
> What's this extra strncmp about? If we're _not_ a worktreepath atom,
> we'd still do the lookup only to put nothing in the string?
Leftover from an earlier iteration where I was going to support
getting more info out of the worktree struct. I decided to limit scope
to just the info I really needed for the branch change. I left it like
this because I thought it would make the code more readable for
someone who wanted to come in and add that extra info, but I think
you're right that it ends up just reading kind of awkwardly.
>
> > @@ -2013,7 +2076,14 @@ void ref_array_clear(struct ref_array *array)
> > int i;
> >
> > for (i = 0; i < used_atom_cnt; i++)
> > + {
> > + if (!strncmp(used_atom[i].name, "worktreepath", strlen("worktreepath")))
> > + {
> > + hashmap_free(&(used_atom[i].u.reftoworktreeinfo_map), 1);
> > + free_worktrees(worktrees);
> > + }
>
> And if we move the mapping out to a static global, then this only has to
> be done once, not once per atom. In fact, I think this could double-free
> "worktrees" with your current patch if you have two "%(worktree)"
> placeholders, since "worktrees" already is a global.
Only if someone put a colon on one of the %(worktree) atoms, otherwise
they're all cached, but as you say moot point anyway if the map is
moved outside the used_atom structure.
>
> It's probably worth testing that the path we get is actually sane, too.
> I.e., expect something more like:
>
> cat >expect <<-\EOF
> master: $PWD
> master: $PWD/worktree
> side: not checked out
> EOF
> git for-each-ref \
> --format="%(refname:short): %(if)%(worktreepath)%(then)%(worktreepath)%(else)not checked %out%(end)
>
> (I wish there was a way to avoid that really long line, but I don't
> think there is).
>
Yea good call, can do.
Thanks for all the feedback, will try to turn these around quickly.
^ permalink raw reply
* Re: [PATCH 1/3] setup: drop return value from `read_repository_format()`
From: brian m. carlson @ 2018-12-20 3:45 UTC (permalink / raw)
To: Jeff King; +Cc: Martin Ågren, git
In-Reply-To: <20181220025211.GB24002@sigill.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 1805 bytes --]
On Wed, Dec 19, 2018 at 09:52:12PM -0500, Jeff King wrote:
> On Thu, Dec 20, 2018 at 12:17:53AM +0000, brian m. carlson wrote:
>
> > On Wed, Dec 19, 2018 at 10:27:35AM -0500, Jeff King wrote:
> > > I dunno. This is one of those dark corners of the code where we appear
> > > to do the wrong thing, but nobody seems to have noticed or cared much,
> > > and changing it runs the risk of breaking some obscure cases. I'm not
> > > sure if we should bite the bullet and try to address that, or just back
> > > away slowly and pretend we never looked at it. ;)
> >
> > I will point out that with the SHA-256 work, reading the config file
> > becomes essential for SHA-256 repositories, because we need to know the
> > object format. Removing the config file leads to things blowing up in a
> > bad way (what specific bad way I don't remember).
> >
> > That may influence the direction we want to take in this work, or not.
>
> Wouldn't we just treat that the same way we do now? I.e., assume the
> default of sha1, just like we assume repositoryformatversion==0?
Yeah, we'll default to SHA-1, but the repository will be broken. HEAD
can't be read. Trying to run git status dies with "fatal: Unknown index
entry format". And so on. We've written data with 64-character object
IDs, which can't be read by Git in SHA-1 mode.
My point is essentially that in an SHA-256 repository, the config file
isn't optional anymore. We probably need to consider that and error out
in more situations (e.g. unreadable file or I/O error) instead of
silently falling back to the defaults, since failing loudly in a visible
way is better than having the user try to figure out why the index is
suddenly "corrupt".
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]
^ permalink raw reply
* Re: [PATCH 1/3] setup: drop return value from `read_repository_format()`
From: Jeff King @ 2018-12-20 2:52 UTC (permalink / raw)
To: brian m. carlson, Martin Ågren, git
In-Reply-To: <20181220001752.GA35965@genre.crustytoothpaste.net>
On Thu, Dec 20, 2018 at 12:17:53AM +0000, brian m. carlson wrote:
> On Wed, Dec 19, 2018 at 10:27:35AM -0500, Jeff King wrote:
> > I dunno. This is one of those dark corners of the code where we appear
> > to do the wrong thing, but nobody seems to have noticed or cared much,
> > and changing it runs the risk of breaking some obscure cases. I'm not
> > sure if we should bite the bullet and try to address that, or just back
> > away slowly and pretend we never looked at it. ;)
>
> I will point out that with the SHA-256 work, reading the config file
> becomes essential for SHA-256 repositories, because we need to know the
> object format. Removing the config file leads to things blowing up in a
> bad way (what specific bad way I don't remember).
>
> That may influence the direction we want to take in this work, or not.
Wouldn't we just treat that the same way we do now? I.e., assume the
default of sha1, just like we assume repositoryformatversion==0?
-Peff
^ permalink raw reply
* Re: Referring to commits in commit messages
From: Jeff King @ 2018-12-20 2:51 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Ævar Arnfjörð Bjarmason, git, Han-Wen Nienhuys,
Johannes Schindelin
In-Reply-To: <20181219232948.GD228469@google.com>
On Wed, Dec 19, 2018 at 03:29:48PM -0800, Jonathan Nieder wrote:
> > I'm also not sure it really matters all that much either way. If you buy
> > my argument that this is just about placing the general era of the
> > commit in the mind of the reader, then "just before v2.11" or "just
> > after v2.11" are about the same.
>
> If it's that unreliable, I'd rather just have the hash, to be honest.
Well, that was sort of my point. :) I think the hash is the most
interesting part, and everything else is gravy for the reader to save
them time digging into the commit.
> > I think that's a good idea if something is in fact being fixed. But
> > there are many other reasons to refer to another commit in prose (or
> > even outside of a commit message entirely).
>
> Sure, but in those cases do we need the ability to query on them?
I'm not sure what you mean. We were talking about how to reference
commits in prose. I think a "Fixes" trailer eliminates the need to do so
(or least makes it redundant) in _some_ cases, but the other cases are
still of interest.
> To me it seems similar to having a policy on how to reference people
> in commit messages (e.g. "always include their email address"), so
> that I can grep for a contributor to see how they were involved in a
> patch. If it's not structured data, then at some point I stop
> worrying so much about machine parsability.
Sure. All I'm really saying is "always include the hash".
-Peff
^ permalink raw reply
* [PATCH] diff: add support for reading files literally with --no-index
From: brian m. carlson @ 2018-12-20 0:26 UTC (permalink / raw)
To: git; +Cc: Jeff King, Duy Nguyen
In some shells, such as bash and zsh, it's possible to use a command
substitution to provide the output of a command as a file argument to
another process, like so:
diff -u <(printf "a\nb\n") <(printf "a\nc\n")
However, this syntax does not produce useful results with git diff
--no-index.
On macOS, the arguments to the command are named pipes under /dev/fd,
and git diff doesn't know how to handle a named pipe. On Linux, the
arguments are symlinks to pipes, so git diff "helpfully" diffs these
symlinks, comparing their targets like "pipe:[1234]" and "pipe:[5678]".
Because this behavior is not very helpful, and because git diff has many
features that people would like to use even on non-Git files, add an
option to git diff --no-index to read files literally, dereferencing
symlinks and reading them as a normal file.
Note that this behavior requires that the files be read entirely into
memory, just as we do when reading from standard input.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
This is a long-standing annoyance of mine, and it also makes some use
cases possible (for example, diffing filtered and non-filtered objects).
We don't include a test for the pipe scenario because I couldn't get
that case to work in portable shell (although of course it works in
bash). I have, however, tested it on both macOS and Linux. No clue how
this works on Windows.
Documentation/git-diff.txt | 5 +++++
diff-no-index.c | 34 +++++++++++++++++++++++++++-------
diff.c | 24 +++++++++++++-----------
diff.h | 1 +
diffcore.h | 1 +
t/t4053-diff-no-index.sh | 28 ++++++++++++++++++++++++++++
6 files changed, 75 insertions(+), 18 deletions(-)
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 030f162f30..4c4695c88d 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -111,6 +111,11 @@ include::diff-options.txt[]
"Unmerged". Can be used only when comparing the working tree
with the index.
+--literally::
+ Read the specified files literally, as `diff` would,
+ dereferencing any symlinks and reading data from pipes.
+ This option only works with `--no-index`.
+
<path>...::
The <paths> parameters, when given, are used to limit
the diff to the named paths (you can give directory
diff --git a/diff-no-index.c b/diff-no-index.c
index 9414e922d1..2707206aee 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -75,7 +75,25 @@ static int populate_from_stdin(struct diff_filespec *s)
return 0;
}
-static struct diff_filespec *noindex_filespec(const char *name, int mode)
+static int populate_literally(struct diff_filespec *s)
+{
+ struct strbuf buf = STRBUF_INIT;
+ size_t size = 0;
+ int fd = xopen(s->path, O_RDONLY);
+
+ if (strbuf_read(&buf, fd, 0) < 0)
+ return error_errno("error while reading from '%s'", s->path);
+
+ s->should_munmap = 0;
+ s->data = strbuf_detach(&buf, &size);
+ s->size = size;
+ s->should_free = 1;
+ s->read_literally = 1;
+ return 0;
+}
+
+static struct diff_filespec *noindex_filespec(const char *name, int mode,
+ struct diff_options *o)
{
struct diff_filespec *s;
@@ -85,6 +103,8 @@ static struct diff_filespec *noindex_filespec(const char *name, int mode)
fill_filespec(s, &null_oid, 0, mode);
if (name == file_from_standard_input)
populate_from_stdin(s);
+ else if (o->flags.read_literally)
+ populate_literally(s);
return s;
}
@@ -101,14 +121,14 @@ static int queue_diff(struct diff_options *o,
if (S_ISDIR(mode1)) {
/* 2 is file that is created */
- d1 = noindex_filespec(NULL, 0);
- d2 = noindex_filespec(name2, mode2);
+ d1 = noindex_filespec(NULL, 0, o);
+ d2 = noindex_filespec(name2, mode2, o);
name2 = NULL;
mode2 = 0;
} else {
/* 1 is file that is deleted */
- d1 = noindex_filespec(name1, mode1);
- d2 = noindex_filespec(NULL, 0);
+ d1 = noindex_filespec(name1, mode1, o);
+ d2 = noindex_filespec(NULL, 0, o);
name1 = NULL;
mode1 = 0;
}
@@ -189,8 +209,8 @@ static int queue_diff(struct diff_options *o,
SWAP(name1, name2);
}
- d1 = noindex_filespec(name1, mode1);
- d2 = noindex_filespec(name2, mode2);
+ d1 = noindex_filespec(name1, mode1, o);
+ d2 = noindex_filespec(name2, mode2, o);
diff_queue(&diff_queued_diff, d1, d2);
return 0;
}
diff --git a/diff.c b/diff.c
index dc9965e836..740d0087b9 100644
--- a/diff.c
+++ b/diff.c
@@ -4282,18 +4282,18 @@ static void run_diff_cmd(const char *pgm,
fprintf(o->file, "* Unmerged path %s\n", name);
}
-static void diff_fill_oid_info(struct diff_filespec *one, struct index_state *istate)
+static void diff_fill_oid_info(struct diff_filespec *one, struct diff_options *o)
{
if (DIFF_FILE_VALID(one)) {
if (!one->oid_valid) {
struct stat st;
- if (one->is_stdin) {
+ if (one->is_stdin || one->read_literally) {
oidclr(&one->oid);
return;
}
if (lstat(one->path, &st) < 0)
die_errno("stat '%s'", one->path);
- if (index_path(istate, &one->oid, one->path, &st, 0))
+ if (index_path(o->repo->index, &one->oid, one->path, &st, 0))
die("cannot hash %s", one->path);
}
}
@@ -4341,8 +4341,8 @@ static void run_diff(struct diff_filepair *p, struct diff_options *o)
return;
}
- diff_fill_oid_info(one, o->repo->index);
- diff_fill_oid_info(two, o->repo->index);
+ diff_fill_oid_info(one, o);
+ diff_fill_oid_info(two, o);
if (!pgm &&
DIFF_FILE_VALID(one) && DIFF_FILE_VALID(two) &&
@@ -4389,8 +4389,8 @@ static void run_diffstat(struct diff_filepair *p, struct diff_options *o,
if (o->prefix_length)
strip_prefix(o->prefix_length, &name, &other);
- diff_fill_oid_info(p->one, o->repo->index);
- diff_fill_oid_info(p->two, o->repo->index);
+ diff_fill_oid_info(p->one, o);
+ diff_fill_oid_info(p->two, o);
builtin_diffstat(name, other, p->one, p->two,
diffstat, o, p);
@@ -4414,8 +4414,8 @@ static void run_checkdiff(struct diff_filepair *p, struct diff_options *o)
if (o->prefix_length)
strip_prefix(o->prefix_length, &name, &other);
- diff_fill_oid_info(p->one, o->repo->index);
- diff_fill_oid_info(p->two, o->repo->index);
+ diff_fill_oid_info(p->one, o);
+ diff_fill_oid_info(p->two, o);
builtin_checkdiff(name, other, attr_path, p->one, p->two, o);
}
@@ -5159,6 +5159,8 @@ int diff_opt_parse(struct diff_options *options,
options->flags.funccontext = 1;
else if (!strcmp(arg, "--no-function-context"))
options->flags.funccontext = 0;
+ else if (!strcmp(arg, "--literally"))
+ options->flags.read_literally = 1;
else if ((argcount = parse_long_opt("output", av, &optarg))) {
char *path = prefix_filename(prefix, optarg);
options->file = xfopen(path, "w");
@@ -5720,8 +5722,8 @@ static int diff_get_patch_id(struct diff_options *options, struct object_id *oid
if (DIFF_PAIR_UNMERGED(p))
continue;
- diff_fill_oid_info(p->one, options->repo->index);
- diff_fill_oid_info(p->two, options->repo->index);
+ diff_fill_oid_info(p->one, options);
+ diff_fill_oid_info(p->two, options);
len1 = remove_space(p->one->path, strlen(p->one->path));
len2 = remove_space(p->two->path, strlen(p->two->path));
diff --git a/diff.h b/diff.h
index ce5e8a8183..7dedd3bcd1 100644
--- a/diff.h
+++ b/diff.h
@@ -97,6 +97,7 @@ struct diff_flags {
unsigned stat_with_summary:1;
unsigned suppress_diff_headers:1;
unsigned dual_color_diffed_diffs:1;
+ unsigned read_literally:1;
};
static inline void diff_flags_or(struct diff_flags *a,
diff --git a/diffcore.h b/diffcore.h
index b651061c0e..363869447a 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -48,6 +48,7 @@ struct diff_filespec {
#define DIRTY_SUBMODULE_UNTRACKED 1
#define DIRTY_SUBMODULE_MODIFIED 2
unsigned is_stdin : 1;
+ unsigned read_literally : 1;
unsigned has_more_entries : 1; /* only appear in combined diff */
/* data should be considered "binary"; -1 means "don't know yet" */
signed int is_binary : 2;
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 6e0dd6f9e5..53e6bcdc19 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -137,4 +137,32 @@ test_expect_success 'diff --no-index from repo subdir with absolute paths' '
test_cmp expect actual
'
+test_expect_success 'diff --no-index --literally' '
+ echo "diff --git a/../../non/git/a b/../../non/git/b" >expect &&
+ test_expect_code 1 \
+ git -C repo/sub \
+ diff --literally ../../non/git/a ../../non/git/b >actual &&
+ head -n 1 <actual >actual.head &&
+ test_cmp expect actual.head
+'
+
+test_expect_success SYMLINKS 'diff --no-index --literally with symlinks' '
+ test_write_lines a b c >f1 &&
+ test_write_lines a d c >f2 &&
+ ln -s f1 s1 &&
+ ln -s f2 s2 &&
+ cat >expect <<-\EOF &&
+ diff --git a/s1 b/s2
+ --- a/s1
+ +++ b/s2
+ @@ -1,3 +1,3 @@
+ a
+ -b
+ +d
+ c
+ EOF
+ test_expect_code 1 git diff --no-index --literally s1 s2 >actual &&
+ test_cmp expect actual
+'
+
test_done
^ permalink raw reply related
* Re: [PATCH 2/3] setup: do not use invalid `repository_format`
From: brian m. carlson @ 2018-12-20 0:21 UTC (permalink / raw)
To: Jeff King; +Cc: Martin Ågren, git
In-Reply-To: <20181219153841.GB14802@sigill.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 609 bytes --]
On Wed, Dec 19, 2018 at 10:38:41AM -0500, Jeff King wrote:
> Hmm. It looks like we never set repo_fmt.hash_algo to anything besides
> GIT_HASH_SHA1 anyway. I guess the existing field is really just there in
> preparation for us eventually respecting extensions.hashAlgorithm (or
> whatever it's called).
Yeah, it is.
I haven't tested, but since we just read the value of
extensions.objectFormat, this patch shouldn't have any effect on the
SHA-256 code. The default remains SHA-1 if a value isn't specified
somehow.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]
^ permalink raw reply
* Re: [PATCH 1/3] setup: drop return value from `read_repository_format()`
From: brian m. carlson @ 2018-12-20 0:17 UTC (permalink / raw)
To: Jeff King; +Cc: Martin Ågren, git
In-Reply-To: <20181219152735.GA14802@sigill.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 831 bytes --]
On Wed, Dec 19, 2018 at 10:27:35AM -0500, Jeff King wrote:
> I dunno. This is one of those dark corners of the code where we appear
> to do the wrong thing, but nobody seems to have noticed or cared much,
> and changing it runs the risk of breaking some obscure cases. I'm not
> sure if we should bite the bullet and try to address that, or just back
> away slowly and pretend we never looked at it. ;)
I will point out that with the SHA-256 work, reading the config file
becomes essential for SHA-256 repositories, because we need to know the
object format. Removing the config file leads to things blowing up in a
bad way (what specific bad way I don't remember).
That may influence the direction we want to take in this work, or not.
--
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]
^ permalink raw reply
* Re: Referring to commits in commit messages
From: Ævar Arnfjörð Bjarmason @ 2018-12-20 0:18 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Duy Nguyen, Git Mailing List, Han-Wen Nienhuys,
Johannes Schindelin, Jacob Keller, Andrew Morton, Jeff Kirsher,
Linus Torvalds, Jeff King
In-Reply-To: <20181219221401.GC228469@google.com>
On Wed, Dec 19 2018, Jonathan Nieder wrote:
> Hi,
>
> Duy Nguyen wrote:
>> On Wed, Dec 19, 2018 at 6:04 PM Ævar Arnfjörð Bjarmason
>
>>> E.g. when composing
>>> https://public-inbox.org/git/878t0lfwrj.fsf@evledraar.gmail.com/ I
>>> remembered PERLLIB_EXTRA went back & forth between
>>> working/breaking/working with your/my/your patch, so:
>>>
>>> git log --grep=0386dd37b1
>>>
>>> Just found the chain up to my breaking change, but not your 7a7bfc7adc,
>>> which refers to that commit as v1.9-rc0~88^2.
> [...]
>> To follow email model, this sounds like a good trailer for related
>> commits, like In-Reply-To for email. We could even have special
>> trailer "Fixes" to indicate what commit is the problem that this
>> commit fixes.
>
> In Linux kernel land, Documentation/process/submitting-patches.rst
> contains the following:
>
> -- >8 --
> If your patch fixes a bug in a specific commit, e.g. you found an issue using
> ``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
> the SHA-1 ID, and the one line summary. For example::
>
> Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")
>
> The following ``git config`` settings can be used to add a pretty format for
> outputting the above style in the ``git log`` or ``git show`` commands::
>
> [core]
> abbrev = 12
> [pretty]
> fixes = Fixes: %h (\"%s\")
> -- 8< --
>
> I like it because (1) the semantics are clear, (2) it's very concrete
> (e.g. "first 12 characters", (3) it goes in a trailer, where other
> bits intended for machine consumption already go.
>
> Should we adopt it? In other words, how about something like the
> following?
>
> If it seems like a good idea, I can add a commit message.
>
> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
>
> diff --git i/Documentation/SubmittingPatches w/Documentation/SubmittingPatches
> index ec8b205145..36ce1ac5d8 100644
> --- i/Documentation/SubmittingPatches
> +++ w/Documentation/SubmittingPatches
> @@ -367,6 +367,20 @@ If you like, you can put extra tags at the end:
> You can also create your own tag or use one that's in common usage
> such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
>
> +If your patch fixes a bug in a specific commit, e.g. you found an issue using
> +``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
> +the SHA-1 ID, and the one line summary. For example::
> +
> + Fixes: 539047c19e ("revert: introduce --abort to cancel a failed cherry-pick")
> +
> +The following ``git config`` settings can be used to add a pretty format for
> +outputting the above style in the ``git log`` or ``git show`` commands::
> +
> + [core]
> + abbrev = 12
> + [pretty]
> + fixes = Fixes: %h (\"%s\")
> +
> == Subsystems with dedicated maintainers
>
> Some parts of the system have dedicated maintainers with their own
The core.abbrev=12 part of this I don't think would be a good idea, and
have submitted a patch to the kernel to remove it:
https://lore.kernel.org/lkml/20181220000112.24891-1-avarab@gmail.com/
If we find ourselves wanting to tweak core.abbrev for git.git, we should
really take a step back and see if we can just fix the
find_unique_abbrev_r() behavior so neither us nor anyone else should
need to fiddle with it.
As noted on LKML I have upcoming patches to support core.abbrev relative
values, e.g. "+2" will give you really future-proof SHAs. That should be
Good Enough(TM) for most.
The only real improvement over the approximate_object_count() method I
can think of is something where in gc / index-pack (for clone) we write
out some statistics that allow us to later cheaply estimate the
long-term growth curve of the repository, and e.g. say that a short
SHA-1 should always be good for at least N years before it becomes
ambiguous.
OTOH we could also just say that if you're a repo with >= 11 characters
for abbreviation we might as well consider you in big boy territory, and
just snap it to say 16 and be done with it. We'll have problems with 32
bit ints somewhere in git overflowing before we'd roll over to "17".
^ permalink raw reply
* Re: [PATCH v3 1/4] pack-protocol.txt: accept error packets in any context
From: Josh Steadmon @ 2018-12-19 23:30 UTC (permalink / raw)
To: Jeff King; +Cc: Masaya Suzuki, git, Junio C Hamano
In-Reply-To: <20181217213310.GA14251@sigill.intra.peff.net>
On 2018.12.17 16:33, Jeff King wrote:
> On Thu, Dec 13, 2018 at 02:18:26PM -0800, Josh Steadmon wrote:
>
> > On 2018.12.12 17:17, Masaya Suzuki wrote:
> > > On Wed, Dec 12, 2018 at 3:02 AM Jeff King <peff@peff.net> wrote:
> > > > This ERR handling has been moved to a very low level. What happens if
> > > > we're passing arbitrary data via the packet_read() code? Could we
> > > > erroneously trigger an error if a packfile happens to have the bytes
> > > > "ERR " at a packet boundary?
> > > >
> > > > For packfiles via upload-pack, I _think_ we're OK, because we only
> > > > packetize it when a sideband is in use. In which case this would never
> > > > match, because we'd have "\1" in the first byte slot.
> > > >
> > > > But are there are other cases we need to worry about? Just
> > > > brainstorming, I can think of:
> > > >
> > > > 1. We also pass packetized packfiles between git-remote-https and
> > > > the stateless-rpc mode of fetch-pack/send-pack. And I don't think
> > > > we use sidebands there.
> > > >
> > > > 2. The packet code is used for long-lived clean/smudge filters these
> > > > days, which also pass arbitrary data.
> > > >
> > > > So I think it's probably not a good idea to unconditionally have callers
> > > > of packet_read_with_status() handle this. We'd need a flag like
> > > > PACKET_READ_RESPECT_ERR, and to trigger it from the appropriate callers.
> > >
> > > This is outside of the Git pack protocol so having a separate parsing
> > > mode makes sense to me.
> >
> > This sounds like it could be a significant refactoring. Should we go
> > back to V2 of this series, and then work on the new parsing mode
> > separately?
>
> Which one is v2? :)
>
> Just the remote-curl cleanups from me, and then your "die on server-side
> errors" patch?
Yes, that one :)
^ permalink raw reply
* Re: Referring to commits in commit messages
From: Jonathan Nieder @ 2018-12-19 23:29 UTC (permalink / raw)
To: Jeff King
Cc: Ævar Arnfjörð Bjarmason, git, Han-Wen Nienhuys,
Johannes Schindelin
In-Reply-To: <20181219224810.GA20888@sigill.intra.peff.net>
Jeff King wrote:
> On Wed, Dec 19, 2018 at 10:39:27AM -0800, Jonathan Nieder wrote:
>> Is there some rule about how long the hex string has to be for this to
>> work?
>
> In both cases, it has to be 7 characters.
Thanks.
[...]
>> The issue with this is that it is ambiguous about what the tag name is
>> referring to: does that mean that "git describe" and "git version"
>> tell me that v2.11.0 is the nearest *previous* release to that commit
>> or that "git name-rev" tells me that v2.11.0 is a nearby *subsequent*
>> release that contains it?
>
> Sure, it's ambiguous if you've never seen it. But if it becomes a
> convention in the project, then I don't think that's an obstacle.
I'm speaking from experience: this is hard for newcomers to grasp.
> I'm also not sure it really matters all that much either way. If you buy
> my argument that this is just about placing the general era of the
> commit in the mind of the reader, then "just before v2.11" or "just
> after v2.11" are about the same.
If it's that unreliable, I'd rather just have the hash, to be honest.
Ideally the full 40 characters, since that would make git name-rev
--stdin work. :)
[...]
>> I think a more promising approach is the Fixes trailer Duy mentioned,
>> which has been working well for the Linux kernel project. I'll follow
>> up in a reply to his message.
>
> I think that's a good idea if something is in fact being fixed. But
> there are many other reasons to refer to another commit in prose (or
> even outside of a commit message entirely).
Sure, but in those cases do we need the ability to query on them?
To me it seems similar to having a policy on how to reference people
in commit messages (e.g. "always include their email address"), so
that I can grep for a contributor to see how they were involved in a
patch. If it's not structured data, then at some point I stop
worrying so much about machine parsability.
Thanks,
Jonathan
^ permalink raw reply
* Re: Periodic hang during git index-pack
From: Jeff King @ 2018-12-19 23:22 UTC (permalink / raw)
To: Sitsofe Wheeler; +Cc: git
In-Reply-To: <CALjAwxiB1uDfg4iPFjh2dNibEZa5mJ0RwhCzt0R2b87NTVqWfA@mail.gmail.com>
On Wed, Dec 19, 2018 at 10:59:30PM +0000, Sitsofe Wheeler wrote:
> While using trying to use git to clone a remote repository git
> index-pack occasionally goes on to hang:
> [...]
> Looking at where it is stuck, git is doing read of a pipe:
>
> #0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0,
> buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096)
> at ../sysdeps/unix/sysv/linux/read.c:27
> #1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40
> <input_buffer>, __fd=0)
> at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
Index-pack is reading the pack on stdin, so it's expecting more bytes.
Those bytes should be coming from the git-clone process, which is
relaying the bytes from the other side.
Check the backtrace of git-clone to see why it isn't feeding more data
(but note that it will generally have two threads -- one processing the
data from the remote, and one wait()ing for index-pack to finish).
My guess, though, is that you'll find that git-clone is simply waiting
on another pipe: the one from ssh.
-Peff
^ permalink raw reply
* [RFC PATCH 1/1] filter-options: Expand abbreviated numbers
From: Josh Steadmon @ 2018-12-19 23:21 UTC (permalink / raw)
To: git
In-Reply-To: <cover.1545261186.git.steadmon@google.com>
When communicating with a remote server or a subprocess, use expanded
numbers rather than abbreviated numbers in the object filter spec (e.g.
"limit:blob=1k" becomes "limit:blob=1024").
Update the protocol docs to note that clients should always perform this
expansion, to allow for more compatibility between server
implementations.
Signed-off-by: Josh Steadmon <steadmon@google.com>
---
Documentation/technical/protocol-v2.txt | 5 ++++-
builtin/clone.c | 6 +++++-
builtin/fetch.c | 7 ++++++-
fetch-pack.c | 15 ++++++++++++---
list-objects-filter-options.c | 20 ++++++++++++++++++--
list-objects-filter-options.h | 17 +++++++++++++++--
t/t6112-rev-list-filters-objects.sh | 17 +++++++++++++++++
transport-helper.c | 13 +++++++++----
upload-pack.c | 7 +++++--
9 files changed, 91 insertions(+), 16 deletions(-)
diff --git a/Documentation/technical/protocol-v2.txt b/Documentation/technical/protocol-v2.txt
index 09e4e0273f..d001372404 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -296,7 +296,10 @@ included in the client's request:
Request that various objects from the packfile be omitted
using one of several filtering techniques. These are intended
for use with partial clone and partial fetch operations. See
- `rev-list` for possible "filter-spec" values.
+ `rev-list` for possible "filter-spec" values. Clients MUST
+ translate abbreviated numbers (e.g. "1k") into fully-expanded
+ numbers (e.g. "1024") on the client side, so that the server
+ does not need to implement unit parsing.
If the 'ref-in-want' feature is advertised, the following argument can
be included in the client's request as well as the potential addition of
diff --git a/builtin/clone.c b/builtin/clone.c
index 15b142d646..8e05e8ad6c 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1130,9 +1130,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
option_upload_pack);
if (filter_options.choice) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
+ expand_list_objects_filter_spec(&filter_options,
+ &expanded_filter_spec);
transport_set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
- filter_options.filter_spec);
+ expanded_filter_spec.buf);
transport_set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+ strbuf_release(&expanded_filter_spec);
}
if (transport->smart_options && !deepen && !filter_options.choice)
diff --git a/builtin/fetch.c b/builtin/fetch.c
index e0140327aa..8b8bb64921 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1172,6 +1172,7 @@ static void add_negotiation_tips(struct git_transport_options *smart_options)
static struct transport *prepare_transport(struct remote *remote, int deepen)
{
struct transport *transport;
+
transport = transport_get(remote, NULL);
transport_set_verbosity(transport, verbosity, progress);
transport->family = family;
@@ -1191,9 +1192,13 @@ static struct transport *prepare_transport(struct remote *remote, int deepen)
if (update_shallow)
set_option(transport, TRANS_OPT_UPDATE_SHALLOW, "yes");
if (filter_options.choice) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
+ expand_list_objects_filter_spec(&filter_options,
+ &expanded_filter_spec);
set_option(transport, TRANS_OPT_LIST_OBJECTS_FILTER,
- filter_options.filter_spec);
+ expanded_filter_spec.buf);
set_option(transport, TRANS_OPT_FROM_PROMISOR, "1");
+ strbuf_release(&expanded_filter_spec);
}
if (negotiation_tip.nr) {
if (transport->smart_options)
diff --git a/fetch-pack.c b/fetch-pack.c
index 9691046e64..485632fabe 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -329,9 +329,14 @@ static int find_common(struct fetch_negotiator *negotiator,
packet_buf_write(&req_buf, "deepen-not %s", s->string);
}
}
- if (server_supports_filtering && args->filter_options.choice)
+ if (server_supports_filtering && args->filter_options.choice) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
+ expand_list_objects_filter_spec(&args->filter_options,
+ &expanded_filter_spec);
packet_buf_write(&req_buf, "filter %s",
- args->filter_options.filter_spec);
+ expanded_filter_spec.buf);
+ strbuf_release(&expanded_filter_spec);
+ }
packet_buf_flush(&req_buf);
state_len = req_buf.len;
@@ -1155,9 +1160,13 @@ static int send_fetch_request(struct fetch_negotiator *negotiator, int fd_out,
/* Add filter */
if (server_supports_feature("fetch", "filter", 0) &&
args->filter_options.choice) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
print_verbose(args, _("Server supports filter"));
+ expand_list_objects_filter_spec(&args->filter_options,
+ &expanded_filter_spec);
packet_buf_write(&req_buf, "filter %s",
- args->filter_options.filter_spec);
+ expanded_filter_spec.buf);
+ strbuf_release(&expanded_filter_spec);
} else if (args->filter_options.choice) {
warning("filtering not recognized by server, ignoring");
}
diff --git a/list-objects-filter-options.c b/list-objects-filter-options.c
index 5285e7674d..9efb3e9902 100644
--- a/list-objects-filter-options.c
+++ b/list-objects-filter-options.c
@@ -18,8 +18,9 @@
* See Documentation/rev-list-options.txt for allowed values for <arg>.
*
* Capture the given arg as the "filter_spec". This can be forwarded to
- * subordinate commands when necessary. We also "intern" the arg for
- * the convenience of the current command.
+ * subordinate commands when necessary (although it's better to pass it through
+ * expand_list_objects_filter_spec() first). We also "intern" the arg for the
+ * convenience of the current command.
*/
static int gently_parse_list_objects_filter(
struct list_objects_filter_options *filter_options,
@@ -111,6 +112,21 @@ int opt_parse_list_objects_filter(const struct option *opt,
return parse_list_objects_filter(filter_options, arg);
}
+void expand_list_objects_filter_spec(
+ const struct list_objects_filter_options *filter,
+ struct strbuf *expanded_spec)
+{
+ strbuf_init(expanded_spec, strlen(filter->filter_spec));
+ if (filter->choice == LOFC_BLOB_LIMIT)
+ strbuf_addf(expanded_spec, "blob:limit=%lu",
+ filter->blob_limit_value);
+ else if (filter->choice == LOFC_TREE_DEPTH)
+ strbuf_addf(expanded_spec, "tree:%lu",
+ filter->tree_exclude_depth);
+ else
+ strbuf_addstr(expanded_spec, filter->filter_spec);
+}
+
void list_objects_filter_release(
struct list_objects_filter_options *filter_options)
{
diff --git a/list-objects-filter-options.h b/list-objects-filter-options.h
index 477cd97029..e3adc78ebf 100644
--- a/list-objects-filter-options.h
+++ b/list-objects-filter-options.h
@@ -2,6 +2,7 @@
#define LIST_OBJECTS_FILTER_OPTIONS_H
#include "parse-options.h"
+#include "strbuf.h"
/*
* The list of defined filters for list-objects.
@@ -20,8 +21,9 @@ struct list_objects_filter_options {
/*
* 'filter_spec' is the raw argument value given on the command line
* or protocol request. (The part after the "--keyword=".) For
- * commands that launch filtering sub-processes, this value should be
- * passed to them as received by the current process.
+ * commands that launch filtering sub-processes, or for communication
+ * over the network, don't use this value; use the result of
+ * expand_list_objects_filter_spec() instead.
*/
char *filter_spec;
@@ -62,6 +64,17 @@ int opt_parse_list_objects_filter(const struct option *opt,
N_("object filtering"), 0, \
opt_parse_list_objects_filter }
+/*
+ * Translates abbreviated numbers in the filter's filter_spec into their
+ * fully-expanded forms (e.g., "limit:blob=1k" becomes "limit:blob=1024").
+ *
+ * This form should be used instead of the raw filter_spec field when
+ * communicating with a remote process or subprocess.
+ */
+void expand_list_objects_filter_spec(
+ const struct list_objects_filter_options *filter,
+ struct strbuf *expanded_spec);
+
void list_objects_filter_release(
struct list_objects_filter_options *filter_options);
diff --git a/t/t6112-rev-list-filters-objects.sh b/t/t6112-rev-list-filters-objects.sh
index 18b0b14d5a..f96f551fb5 100755
--- a/t/t6112-rev-list-filters-objects.sh
+++ b/t/t6112-rev-list-filters-objects.sh
@@ -437,4 +437,21 @@ test_expect_success 'rev-list W/ missing=allow-any' '
git -C r1 rev-list --quiet --missing=allow-any --objects HEAD
'
+# Test expansion of filter specs.
+
+test_expect_success 'expand blob limit in protocol' '
+ git -C r2 config --local uploadpack.allowfilter 1 &&
+ GIT_TRACE_PACKET="$(pwd)/trace" git -c protocol.version=2 clone \
+ --filter=blob:limit=1k "file://$(pwd)/r2" limit &&
+ ! grep "blob:limit=1k" trace &&
+ grep "blob:limit=1024" trace
+'
+
+test_expect_success 'expand tree depth limit in protocol' '
+ GIT_TRACE_PACKET="$(pwd)/tree_trace" git -c protocol.version=2 clone \
+ --filter=tree:0k "file://$(pwd)/r2" tree &&
+ ! grep "tree:0k" tree_trace &&
+ grep "tree:0" tree_trace
+'
+
test_done
diff --git a/transport-helper.c b/transport-helper.c
index bf225c698f..01404bfac5 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -679,10 +679,15 @@ static int fetch(struct transport *transport,
if (data->transport_options.update_shallow)
set_helper_option(transport, "update-shallow", "true");
- if (data->transport_options.filter_options.choice)
- set_helper_option(
- transport, "filter",
- data->transport_options.filter_options.filter_spec);
+ if (data->transport_options.filter_options.choice) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
+ expand_list_objects_filter_spec(
+ &data->transport_options.filter_options,
+ &expanded_filter_spec);
+ set_helper_option(transport, "filter",
+ expanded_filter_spec.buf);
+ strbuf_release(&expanded_filter_spec);
+ }
if (data->transport_options.negotiation_tips)
warning("Ignoring --negotiation-tip because the protocol does not support it.");
diff --git a/upload-pack.c b/upload-pack.c
index 5e81f1ff24..1c6d73e5a2 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -140,14 +140,17 @@ static void create_pack_file(const struct object_array *have_obj,
if (use_include_tag)
argv_array_push(&pack_objects.args, "--include-tag");
if (filter_options.filter_spec) {
+ struct strbuf expanded_filter_spec = STRBUF_INIT;
+ expand_list_objects_filter_spec(&filter_options,
+ &expanded_filter_spec);
if (pack_objects.use_shell) {
struct strbuf buf = STRBUF_INIT;
- sq_quote_buf(&buf, filter_options.filter_spec);
+ sq_quote_buf(&buf, expanded_filter_spec.buf);
argv_array_pushf(&pack_objects.args, "--filter=%s", buf.buf);
strbuf_release(&buf);
} else {
argv_array_pushf(&pack_objects.args, "--filter=%s",
- filter_options.filter_spec);
+ expanded_filter_spec.buf);
}
}
--
2.20.0.405.gbc1bbc6f85-goog
^ permalink raw reply related
* [RFC PATCH 0/1] Expand abbreviated filters
From: Josh Steadmon @ 2018-12-19 23:21 UTC (permalink / raw)
To: git
NOTE: this patch is based on top of md/list-objects-filter-by-depth
Currently, git clients pass filter specs verbatim over the network and
to subprocesses. We support various numeric abbreviations for parameters
on these limits (via git_parse_ulong()), but other implementations may
not support the same abbreviations (or may support them differently;
e.g., should "1k" == 1000 or 1024?). It would be better to only pass
fully-expanded numbers in this case, and keep the expansion logic
completely on the client side.
This patch updates the protocol-v2 doc to specify that clients MUST
expand abbreviations in filter specifications before passing them to
other processes. It adds a new function
"expand_list_objects_filter_spec()" in list-objects-filter-options.c
that implements the expansion logic, and updates users of the
filter_spec field to instead expand the spec first.
Josh Steadmon (1):
filter-options: Expand abbreviated numbers
Documentation/technical/protocol-v2.txt | 5 ++++-
builtin/clone.c | 6 +++++-
builtin/fetch.c | 7 ++++++-
fetch-pack.c | 15 ++++++++++++---
list-objects-filter-options.c | 20 ++++++++++++++++++--
list-objects-filter-options.h | 17 +++++++++++++++--
t/t6112-rev-list-filters-objects.sh | 17 +++++++++++++++++
transport-helper.c | 13 +++++++++----
upload-pack.c | 7 +++++--
9 files changed, 91 insertions(+), 16 deletions(-)
--
2.20.0.405.gbc1bbc6f85-goog
^ permalink raw reply
* Re: [PATCH 2/3] setup: do not use invalid `repository_format`
From: Jeff King @ 2018-12-19 23:17 UTC (permalink / raw)
To: Martin Ågren; +Cc: Git Mailing List, brian m . carlson
In-Reply-To: <CAN0heSpc_sap1cZktteRn3hyeuRx2w86Hd2kqxu4XFgHs75_Kw@mail.gmail.com>
On Wed, Dec 19, 2018 at 10:46:52PM +0100, Martin Ågren wrote:
> > 2. Arguably we should not even look at extensions.* unless we see a
> > version >= 1. But we do process them as we parse the config file.
> > This is mostly an oversight, I think. We have to handle them as we
> > see them, because they may come out of order with respect to the
> > repositoryformatversion field. But we could put them into a
> > string_list, and then only process them after we've decided which
> > version we have.
>
> I hadn't thought too much about this. I guess that for some simpler
> extensions--versions dependencies it would be feasible to first parse
> everything, then, depending on the version we've identified, forget
> about any "irrelevant" extensions. Again, nothing I've thought much
> about, and seems to be safely out of scope for this patch.
The decision is actually pretty straight-forward: if version < 1, ignore
extensions, otherwise respect them (and complain about any we don't know
about).
So I think we could just do in verify_repository_format() something
like:
if (version < 1) {
/* "undo" any extensions we might have parsed */
data->precious_objects = 0;
FREE_AND_NULL(data->partial_clone);
data->worktree_config = 0;
data->hash_algo = GIT_HASH_SHA1;
} else {
/* complain about unknown extension; we already do this! */
}
It's a little ugly to have to know about all the extensions here, but we
already initialize them in read_repository_format(). We could probably
factor that out into a shared function.
-Peff
^ permalink raw reply
* Periodic hang during git index-pack
From: Sitsofe Wheeler @ 2018-12-19 22:59 UTC (permalink / raw)
To: git
Hi,
While using trying to use git to clone a remote repository git
index-pack occasionally goes on to hang:
export GIT_TRACE=1
git clone -n remotehost:diffusion/LIBEDIT/libedit
22:43:48.948378 git.c:418 trace: built-in: git clone -n
remotehost:diffusion/LIBEDIT/libedit
Cloning into 'libedit'...
22:43:48.951196 run-command.c:643 trace: run_command: unset
GIT_DIR; ssh remotehost 'git-upload-pack
'\''diffusion/LIBEDIT/libedit'\'''
22:43:49.130517 run-command.c:643 trace: run_command: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13914 on machine'
--check-self-contained-and-connected
remote: Enumerating objects: 178, done.
22:43:49.132535 git.c:418 trace: built-in: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13914 on machine'
--check-self-contained-and-connected
remote: Counting objects: 100% (178/178), done.
remote: Compressing objects: 100% (81/81), done.
remote: Total 178 (delta 95), reused 178 (delta 95)
Receiving objects: 100% (178/178), 539.36 KiB | 16.85 MiB/s, done.
Resolving deltas: 100% (95/95), done.
22:43:49.209193 run-command.c:643 trace: run_command: git
rev-list --objects --stdin --not --all --quiet '--progress=Checking
connectivity'
22:43:49.211276 git.c:418 trace: built-in: git rev-list
--objects --stdin --not --all --quiet '--progress=Checking
connectivity'
Wed 19 Dec 22:43:49 GMT 2018
Wed 19 Dec 22:43:49 GMT 2018
22:43:49.220996 git.c:418 trace: built-in: git clone -n
remotehost:diffusion/LIBEDIT/libedit
Cloning into 'libedit'...
22:43:49.223462 run-command.c:643 trace: run_command: unset
GIT_DIR; ssh remotehost 'git-upload-pack
'\''diffusion/LIBEDIT/libedit'\'''
22:43:49.491004 run-command.c:643 trace: run_command: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13925 on machine'
--check-self-contained-and-connected
remote: Enumerating objects: 178, done.
22:43:49.492988 git.c:418 trace: built-in: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13925 on machine'
--check-self-contained-and-connected
remote: Counting objects: 100% (178/178), done.
remote: Compressing objects: 100% (81/81), done.
remote: Total 178 (delta 95), reused 178 (delta 95)
Receiving objects: 100% (178/178), 539.36 KiB | 17.40 MiB/s, done.
Resolving deltas: 100% (95/95), done.
22:43:49.569331 run-command.c:643 trace: run_command: git
rev-list --objects --stdin --not --all --quiet '--progress=Checking
connectivity'
22:43:49.571431 git.c:418 trace: built-in: git rev-list
--objects --stdin --not --all --quiet '--progress=Checking
connectivity'
Wed 19 Dec 22:43:49 GMT 2018
Wed 19 Dec 22:43:49 GMT 2018
22:43:49.581282 git.c:418 trace: built-in: git clone -n
remotehost:diffusion/LIBEDIT/libedit
Cloning into 'libedit'...
22:43:49.584019 run-command.c:643 trace: run_command: unset
GIT_DIR; ssh remotehost 'git-upload-pack
'\''diffusion/LIBEDIT/libedit'\'''
22:43:49.818490 run-command.c:643 trace: run_command: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13936 on machine'
--check-self-contained-and-connected
remote: Enumerating objects: 178, done.
22:43:49.820477 git.c:418 trace: built-in: git
index-pack --stdin -v --fix-thin '--keep=fetch-pack 13936 on machine'
--check-self-contained-and-connected
remote: Counting objects: 100% (178/178), done.
remote: Compressing objects: 100% (81/81), done.
Receiving objects: 64% (114/178)
Most of the time cloning complets fine but every now and then it will
hang indefinitely in the receiving objects stage. This happens with
git version 2.17.1 from Ubuntu 18.04 and a hand compiled git version
2.20.1. The remote end is running through Phabricator sshd scripts
(both 2.17.1 and 2.20.1 have been tried on the remote end). ssh
multiplexing is on:
ControlMaster auto
ControlPersist 120
ControlPath /tmp/ssh-%r@%h:%p
ServerAliveInterval 60
Looking at where it is stuck, git is doing read of a pipe:
#0 0x00007fd1b845034e in __libc_read (fd=fd@entry=0,
buf=buf@entry=0x55ab81e19d40 <input_buffer>, nbytes=nbytes@entry=4096)
at ../sysdeps/unix/sysv/linux/read.c:27
#1 0x000055ab81b51b23 in read (__nbytes=4096, __buf=0x55ab81e19d40
<input_buffer>, __fd=0)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2 xread (fd=0, buf=0x55ab81e19d40 <input_buffer>, len=4096) at wrapper.c:260
#3 0x000055ab819d1069 in fill (min=min@entry=1) at builtin/index-pack.c:255
#4 0x000055ab819d123a in unpack_entry_data (offset=268408,
size=size@entry=14675, type=<optimised out>,
oid=oid@entry=0x55ab837cff40)
at builtin/index-pack.c:445
#5 0x000055ab819d1bc7 in unpack_raw_entry (oid=0x55ab837cff40,
ref_oid=0x7fffee137440, ofs_offset=0x55ab837ce350, obj=0x55ab837cff40)
at builtin/index-pack.c:526
#6 parse_pack_objects (hash=0x7fffee1374f0 "\001") at builtin/index-pack.c:1113
#7 cmd_index_pack (argc=<optimised out>, argv=<optimised out>,
prefix=<optimised out>) at builtin/index-pack.c:1775
#8 0x000055ab819934e1 in run_builtin (argv=<optimised out>,
argc=<optimised out>, p=<optimised out>) at git.c:421
#9 handle_builtin (argc=<optimised out>, argv=<optimised out>) at git.c:647
#10 0x000055ab81994515 in run_argv (argv=0x7fffee137690,
argcp=0x7fffee13769c) at git.c:701
#11 cmd_main (argc=<optimised out>, argv=<optimised out>) at git.c:798
#12 0x000055ab8199318f in main (argc=7, argv=0x7fffee137908) at common-main.c:45
Does anyone know what might be wrong and/or how to debug this?
--
Sitsofe | http://sucs.org/~sits/
^ permalink raw reply
* Re: Referring to commits in commit messages
From: Jeff King @ 2018-12-19 22:48 UTC (permalink / raw)
To: Jonathan Nieder
Cc: Ævar Arnfjörð Bjarmason, git, Han-Wen Nienhuys,
Johannes Schindelin
In-Reply-To: <20181219183927.GA228469@google.com>
On Wed, Dec 19, 2018 at 10:39:27AM -0800, Jonathan Nieder wrote:
> Jeff King wrote:
>
> > - web interfaces like GitHub won't linkify this type of reference
> > (whereas they will for something that looks like a hex object id)
> >
> > - my terminal makes it easy to select hex strings, but doesn't
> > understand this git-describe output :)
> >
> > These tools _could_ be taught a regex like /v(\d+.)(-rc\d+)?([~^]+d)*/.
> > But matching hex strings is a lot simpler, and works across many
> > projects.
>
> Is there some rule about how long the hex string has to be for this to
> work?
In both cases, it has to be 7 characters. In my experience, it doesn't
produce a lot of false positives (in the case of GitHub, I believe it
actually confirms that it's a real commit; in my terminal, it highlights
anything plausible).
> > In commit 1234abcd (the subject line, 2016-01-01, v2.11.0), we did
> > blah blah blah
>
> The issue with this is that it is ambiguous about what the tag name is
> referring to: does that mean that "git describe" and "git version"
> tell me that v2.11.0 is the nearest *previous* release to that commit
> or that "git name-rev" tells me that v2.11.0 is a nearby *subsequent*
> release that contains it?
Sure, it's ambiguous if you've never seen it. But if it becomes a
convention in the project, then I don't think that's an obstacle.
I'm also not sure it really matters all that much either way. If you buy
my argument that this is just about placing the general era of the
commit in the mind of the reader, then "just before v2.11" or "just
after v2.11" are about the same.
> Of course the latter is the only answer that's useful, but in practice
> the former is what people tend to do when they are trying to follow a
> convention like this. So we'd need some automatic linting to make it
> useful.
I thought we were just talking about an informational message in one
human's writing, that would be read and interpreted by another human
(the commit id is the thing that remains machine-readable). Automatic
linting seems a bit overboard...
> I think a more promising approach is the Fixes trailer Duy mentioned,
> which has been working well for the Linux kernel project. I'll follow
> up in a reply to his message.
I think that's a good idea if something is in fact being fixed. But
there are many other reasons to refer to another commit in prose (or
even outside of a commit message entirely).
-Peff
^ permalink raw reply
* Re: Referring to commits in commit messages
From: Jonathan Nieder @ 2018-12-19 22:14 UTC (permalink / raw)
To: Duy Nguyen
Cc: Ævar Arnfjörð Bjarmason, Git Mailing List,
Han-Wen Nienhuys, Johannes Schindelin
In-Reply-To: <CACsJy8DdgjjQLEn=O7ePBo7ndLuv22RGQA3nM1Lyizz=59Pj9Q@mail.gmail.com>
Hi,
Duy Nguyen wrote:
> On Wed, Dec 19, 2018 at 6:04 PM Ævar Arnfjörð Bjarmason
>> E.g. when composing
>> https://public-inbox.org/git/878t0lfwrj.fsf@evledraar.gmail.com/ I
>> remembered PERLLIB_EXTRA went back & forth between
>> working/breaking/working with your/my/your patch, so:
>>
>> git log --grep=0386dd37b1
>>
>> Just found the chain up to my breaking change, but not your 7a7bfc7adc,
>> which refers to that commit as v1.9-rc0~88^2.
[...]
> To follow email model, this sounds like a good trailer for related
> commits, like In-Reply-To for email. We could even have special
> trailer "Fixes" to indicate what commit is the problem that this
> commit fixes.
In Linux kernel land, Documentation/process/submitting-patches.rst
contains the following:
-- >8 --
If your patch fixes a bug in a specific commit, e.g. you found an issue using
``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
the SHA-1 ID, and the one line summary. For example::
Fixes: e21d2170f366 ("video: remove unnecessary platform_set_drvdata()")
The following ``git config`` settings can be used to add a pretty format for
outputting the above style in the ``git log`` or ``git show`` commands::
[core]
abbrev = 12
[pretty]
fixes = Fixes: %h (\"%s\")
-- 8< --
I like it because (1) the semantics are clear, (2) it's very concrete
(e.g. "first 12 characters", (3) it goes in a trailer, where other
bits intended for machine consumption already go.
Should we adopt it? In other words, how about something like the
following?
If it seems like a good idea, I can add a commit message.
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
diff --git i/Documentation/SubmittingPatches w/Documentation/SubmittingPatches
index ec8b205145..36ce1ac5d8 100644
--- i/Documentation/SubmittingPatches
+++ w/Documentation/SubmittingPatches
@@ -367,6 +367,20 @@ If you like, you can put extra tags at the end:
You can also create your own tag or use one that's in common usage
such as "Thanks-to:", "Based-on-patch-by:", or "Mentored-by:".
+If your patch fixes a bug in a specific commit, e.g. you found an issue using
+``git bisect``, please use the 'Fixes:' tag with the first 12 characters of
+the SHA-1 ID, and the one line summary. For example::
+
+ Fixes: 539047c19e ("revert: introduce --abort to cancel a failed cherry-pick")
+
+The following ``git config`` settings can be used to add a pretty format for
+outputting the above style in the ``git log`` or ``git show`` commands::
+
+ [core]
+ abbrev = 12
+ [pretty]
+ fixes = Fixes: %h (\"%s\")
+
== Subsystems with dedicated maintainers
Some parts of the system have dedicated maintainers with their own
^ permalink raw reply related
* Re: [PATCH] stripspace: allow -s/-c outside git repository
From: Martin Ågren @ 2018-12-19 21:52 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Jonathan Nieder, Git Mailing List, Han-Wen Nienhuys
In-Reply-To: <nycvar.QRO.7.76.6.1812181258550.43@tvgsbejvaqbjf.bet>
On Tue, 18 Dec 2018 at 13:00, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> > Makes me wonder if `setup_git_directory_gently()` should BUG if it
> > receives NULL. That would require some reshuffling in setup.c, since
> > `setup_git_directory()` calls out to it with NULL... Just thinking out
> > loud. In any case, and more importantly, this is the last user providing
> > NULL except for the one in `setup_git_directory()`.
>
> We could rename `setup_git_directory_gently()` to
> `setup_git_directory_gently_2()`, make that `static` and then call it from
> `setup_git_directory_gently()` and `setup_git_directory()`.
Thanks for the hint. I'm currently recuperating from having been lost in
a nearby corner of setup.c, so I´ll leave this tightening as a
low-hanging fruit for "someone else."
Martin
^ permalink raw reply
* Re: [PATCH 3/3] setup: add `clear_repository_format()`
From: Martin Ågren @ 2018-12-19 21:49 UTC (permalink / raw)
To: Jeff King; +Cc: Git Mailing List, brian m . carlson
In-Reply-To: <20181219154853.GC14802@sigill.intra.peff.net>
On Wed, 19 Dec 2018 at 16:48, Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 18, 2018 at 08:25:28AM +0100, Martin Ågren wrote:
>
> > +void clear_repository_format(struct repository_format *format)
> > +{
> > + string_list_clear(&format->unknown_extensions, 0);
> > + free(format->work_tree);
> > + free(format->partial_clone);
> > + memset(format, 0, sizeof(*format));
> > }
>
> For the callers that actually pick the values out, I think it might be a
> little less error-prone if they actually copied the strings and then
> called clear_repository_format(). That avoids leaks of values that they
> didn't know or care about (and the cost of an extra strdup for
> repository setup is not a big deal).
>
> Something like this on top of your patch, I guess (with the idea being
> that functions which return an error would clear the format, but a
> "successful" one would get returned back up the stack to
> setup_git_directory_gently(), which then clears it before returning.
Thanks for the suggestion. I'll ponder 1) how to go about this
robustifying, 2) how to present the result as part of a v2 series.
To Junio on the sidelines in a cast (hope you're feeling better!):
you can expect a v2 of this series.
Martin
^ permalink raw reply
* Re: [PATCH 2/3] setup: do not use invalid `repository_format`
From: Martin Ågren @ 2018-12-19 21:46 UTC (permalink / raw)
To: Jeff King; +Cc: Git Mailing List, brian m . carlson
In-Reply-To: <20181219153841.GB14802@sigill.intra.peff.net>
On Wed, 19 Dec 2018 at 16:38, Jeff King <peff@peff.net> wrote:
>
> On Tue, Dec 18, 2018 at 08:25:27AM +0100, Martin Ågren wrote:
>
> > Check that `version` is non-negative before using `hash_algo`.
> Hmm. It looks like we never set repo_fmt.hash_algo to anything besides
> GIT_HASH_SHA1 anyway. I guess the existing field is really just there in
> preparation for us eventually respecting extensions.hashAlgorithm (or
> whatever it's called).
That was my understanding as well. Maybe I should have spelled it out.
I think of the diff of this patch as "let's check `foo->valid` before we
`use(foo->bar)`", which should only be able to regress in case foo isn't
valid. And ...
> Given what I said in my previous email about repos with a missing
> "version" field, I wondered if this patch would be breaking config like:
>
> [core]
> # no repositoryformatversion!
> [extensions]
> hashAlgorithm = sha256
>
> But I'd argue that:
>
> 1. That's pretty dumb config that we shouldn't need to support. Even
> if we care about handling the missing version for historical repos,
> they wouldn't be talking sha256.
... this matches my thinking.
> 2. Arguably we should not even look at extensions.* unless we see a
> version >= 1. But we do process them as we parse the config file.
> This is mostly an oversight, I think. We have to handle them as we
> see them, because they may come out of order with respect to the
> repositoryformatversion field. But we could put them into a
> string_list, and then only process them after we've decided which
> version we have.
I hadn't thought too much about this. I guess that for some simpler
extensions--versions dependencies it would be feasible to first parse
everything, then, depending on the version we've identified, forget
about any "irrelevant" extensions. Again, nothing I've thought much
about, and seems to be safely out of scope for this patch.
> So I think your patch is doing the right thing, and won't hurt any real
> cases. But (of course) there are more opportunities to clean things up.
^ permalink raw reply
* Re: [PATCH 0/1] add author and committer configuration settings
From: Jonathan Nieder @ 2018-12-19 21:46 UTC (permalink / raw)
To: William Hubbs; +Cc: git, chutzpah
In-Reply-To: <20181219183939.16358-1-williamh@gentoo.org>
Hi,
William Hubbs wrote:
> this is my first patch for git, so please be gentle. ;-)
Thanks for contributing!
> I am in a situation where I need to use different authorship information
> for some repositories I commit to.
>
> Git already supports setting different authorship and committer
> information with environment variables, but this setting is global so if
> you want to change it per repository you need to use a separate tool.
>
> This patch adds support to git config for author.email, author.name,
> committer.email and committer.name settings so this information
> can be set per repository. It applies to current master.
The above information (except for "It applies to current master") is
very useful to have when looking back at the change in history. When
sending the next version of this patch in response to others'
comments, can you replace the commit message with something like it?
In other words, it is very useful for the commit message to include
this kind of information about the "why" behind a change, beyond the
"what" that the patch itself already provides.
> Thanks much for your reviews, and I would like to publically thank dscho
> from #git-devel for assisting me in preparing this patch.
>
> Also, since I use a screen reader, it would be very helpful if you put
> your comments in your replies as close to the affected code as possible,
> preferably directly below it.
Fortunately, this is already common practice here, but the reminder is
very welcome.
By the way, if you have other feedback about Git accessibility through
a screen reader (both the project and the tool), I would be very
interested to hear. Presumably in a new thread. :)
Thanks and hope that helps,
Jonathan
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox