Git development
 help / color / mirror / Atom feed
* Re: [PATCH 2/9] commit-graph: write changed paths bloom filters
From: Jakub Narebski @ 2020-01-06 18:44 UTC (permalink / raw)
  To: Garima Singh via GitGitGadget
  Cc: git, Derrick Stolee, SZEDER Gábor, Jonathan Tan,
	Jeff Hostetler, Taylor Blau, Jeff King, Junio C Hamano,
	Garima Singh
In-Reply-To: <e52c7ad37a306891487bd79a09b040bfb657d723.1576879520.git.gitgitgadget@gmail.com>

"Garima Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Garima Singh <garima.singh@microsoft.com>
>
> The changed path bloom filters help determine which paths changed between a
> commit and its first parent. We already have the "--changed-paths" option
> for the "git commit-graph write" subcommand, now actually compute them under
> that option. The COMMIT_GRAPH_WRITE_BLOOM_FILTERS flag enables this
> computation.
>
> RFC Notes: Here are some details about the implementation and I would love
> to know your thoughts and suggestions for improvements here.
>
> For details on what bloom filters are and how they work, please refer to
> Dr. Derrick Stolee's blog post [1].
> [1] https://devblogs.microsoft.com/devops/super-charging-the-git-commit-graph-iv-bloom-filters/
>
> 1. The implementation sticks to the recommended values of 7 and 10 for the
>    number of hashes and the size of each entry, as described in the blog.

Please provide references to original work for this.  Derrick Stolee
blog post references the following work:

  Flavio Bonomi, Michael Mitzenmacher, Rina Panigrahy, Sushil Singh, George Varghese
  "An Improved Construction for Counting Bloom Filters"
  http://theory.stanford.edu/~rinap/papers/esa2006b.pdf
  https://doi.org/10.1007/11841036_61

However, we do not use Counting Bloom Filters, but ordinary Bloom
Filters, if used in untypical way: instead of testing many elements
(keys) against single filter, we test single element (path) against
mainy filters.

Also, I'm not sure that values 10 bits per entry and 7 hash functions
are recommended; the work states:

  "For example, when n/m = 10 and k = 7 the false positive probability
  is just over 0.008."

Given false positive probablity we can calculate best choice for n/m and
k.

On the other hand in https://arxiv.org/abs/1912.08258 we have

  "For efficient memory usage, a Bloom filter with a false-positive
   probability ϵ should use about − log_2(ϵ) hash functions
   [Broder2004]. At a false-positive probability of 1%, seven hash
   functions are thus required."

So k=7 being optimal is somewhat confirmed.

>    The implementation while not completely open to it at the moment, is flexible
>    enough to allow for tweaking these settings in the future.

All right.

>    Note: The performance gains we have observed so far with these values is
>    significant enough to not that we did not need to tweak these settings.
                           ^^^
s/not/note/

Did you try to tweak settings, i.e. numbers of bits per entry, number
of hash functions (which is derivative of the former - at least the
optimal number), the size of the block, the cutoff threshold value?
It is not needed to be in this patch series - fine tuning is probably
better left for later.

>    The cover letter of this series has the details and the commit where we have
>    git log use bloom filters.

The second part of this sentence, from "and the commit..." is a bit
unclear.  Did you mean here that the future / subsequent commit in this
patch series that makes Git actually use Bloom filters in `git log --
<path>` will have more details in its commit message?

> 2. As described in the blog and the linked technical paper therin, we do not need
                                                             ^^^^^^
s/therin/therein/

>    7 independent hashing functions. We use the Murmur3 hashing scheme - seed it
>    twice and then combine those to procure an arbitrary number of hash values.

The "linked technical paper" in the blog post (which I would prefer to
have linked directly to in the commit message) is

  Peter C. Dillinger and Panagiotis Manolios
  "Bloom Filters in Probabilistic Verification"
  http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf
  https://doi.org/10.1007/978-3-540-30494-4_26

Sidenote: it looks like it is a reference from Wikipedia on Bloom filters.
This is according to authors the original paper with the _double hashing_
technique.

They also examine in much detail the optimal number of hash functions.

> 3. The filters are sized according to the number of changes in the each commit,
>    with minimum size of one 64 bit word.

Do I understand it correctly that the size of filter is 10*(number of
changed files) bits, rounded up to nearest multiple of 64?

How do you count renames and copies?  As two changes?

Do I understand it correctly that commit with no changes in it (which
can rarely happen) would have 64-bits i.e. 8-bytes Bloom filter of all
zeros: 0x0000000000000000?

How merges are handled?  Does the filter uses all changed files, or just
changes compared to first parent?

>
> [Call for advice] We currently cap writing bloom filters for commits with
> atmost 512 changed files. In the current implementation, we compute the diff,
> and then just throw it away once we see it has more than 512 changes.
> Any suggestiongs on how to reduce the work we are doing in this case are more
> than welcome.

This got solved in "[PATCH] diff: halt tree-diff early after max_changes"
https://public-inbox.org/git/e9a4e4ff-5466-dc39-c3f5-c9a8b8f2f11d@gmail.com/

> [Call for advice] Would the git community like this commit to be split up into
> more granular commits? This commit could possibly be split out further with the
> bloom.c code in its own commit, to be used by the commit-graph in a subsequent
> commit. While I prefer it being contained in one commit this way, I am open to
> suggestions.

I think it might be a good idea to split this commit into purely Bloom
filter implementation (bloom.c) AND unit tests for Bloom filter itself
(which would probably involve some new test-tool).

I have not read further messages in the series [edit: they don't], so I
don't know if such tests already exist or not.  One could test for
negative match, maybe also (for specific choice of hash function) for
positive and maybe even false positive match, for filter size depending
on the number of changes, for changes cap (maybe), maybe also for
no-changes scenario.


As for splitting the main part of the series, I would envision it in the
following way (which is of course only one possibility):

1. Implementation of generic-ish Bloom filter (with elements being
   strings / paths, and optimized to test single key against many
   filters, each taking small-ish space, variable size filter, limit on
   maximum number of elements).

   Technical documentation in comments in bloom.h (description of API)
   and bloom.c (details of the algorithm, with references).

   TODO: test-tool and unit tests.

2. Using per-commit Bloom filter(s) to store changeset information
   i.e. changed paths.  This would implement in-memory storage (on slab)
   and creating Bloom filter out of commit and repository information.

   Perhaps this should also get its own unit tests (that Bloom filter
   catches changed files, and excluding false positivess catches
   unchanged files).

3. Storing per-commit Bloom filters in the commit-graph file:

   a.) writing Bloom filters data to commit-graph file, which means
       designing the chunk(s) format,
   b.) verifying Bloom filter chunks, at least sanity-checks
   c.) reading Bloom filters from commit-graph file into memory

   Perhaps also some integration tests that the information is stored
   and retrieved correctly, and that verifying finds bugs in
   intentionally corrupted Bloom filter chunks.

4. Using Bloom filters to speed up `git log -- <path>` (and similar
   commands).

   It would be nice to have some functional tests, and maybe some
   performance tests, if possible.


> [Call for advice] Would a technical document explaining the exact details of
> the bloom filter implemenation and the hashing calculations be helpful? I will
> be adding details into Documentation/technical/commit-graph-format.txt, but the
> bloom filter code is an independent subsystem and could be used outside of the
> commit-graph feature. Is it worth a separate document, or should we apply "You
> Ain't Gonna Need It" principles?

As nowadays technical reference documentation is being moved from
Documentation/technical/api-*.txt to appropriate header files, maybe the
documentation of Bloom filter API (and some technical documentation and
references) be put in bloom.h?  See for example comments in strbuf.h.

> [Call for advice] I plan to add unit tests for bloom.c, specifically to ensure
> that the hash algorithm and bloom key calculations are stable across versions.

Ah, so the unit tests for bloom.c does not exist, yet...

> Signed-off-by: Garima Singh <garima.singh@microsoft.com>
> Helped-by: Derrick Stolee <dstolee@microsoft.com>
> ---
>  Makefile       |   1 +
>  bloom.c        | 201 +++++++++++++++++++++++++++++++++++++++++++++++++
>  bloom.h        |  46 +++++++++++
>  commit-graph.c |  32 +++++++-
>  4 files changed, 279 insertions(+), 1 deletion(-)
>  create mode 100644 bloom.c
>  create mode 100644 bloom.h
>
> diff --git a/Makefile b/Makefile
> index 42a061d3fb..9d5e26f5d6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -838,6 +838,7 @@ LIB_OBJS += base85.o
>  LIB_OBJS += bisect.o
>  LIB_OBJS += blame.o
>  LIB_OBJS += blob.o
> +LIB_OBJS += bloom.o
>  LIB_OBJS += branch.o
>  LIB_OBJS += bulk-checkin.o
>  LIB_OBJS += bundle.o

I'll put bloom.h first, to make it easier to review.

> diff --git a/bloom.h b/bloom.h
> new file mode 100644
> index 0000000000..ba8ae70b67
> --- /dev/null
> +++ b/bloom.h
> @@ -0,0 +1,46 @@
> +#ifndef BLOOM_H
> +#define BLOOM_H
> +
> +struct commit;
> +struct repository;

This would probably be missing if this patch was split in two:
introducing Bloom filter and saving Bloom filter in the repository
metadata (in commit-graphh file).

> +

O.K., the names of fields are descriptive enough so that this struct
doesn't need detailed description in comment (like the next one).

> +struct bloom_filter_settings {
> +	uint32_t hash_version;

Do we need full half-word for hash version?

> +	uint32_t num_hashes;

Do we need full 32-bits for number of hashes?  The "Bloom Filters in
Probabilistic Verification" paper mentioned in Stolee blog states that
no one should need number of hashes greater than k=32 - the accuracy is
so high that it doesn't matter that it is not optimal.

  "Notice one last thing about Bloom filters in verification, if $m$ is
   several gigabytes or less and $m/n$ calls for more than about 32 index
   functions, the accuracy is going to be so high that there is not much
   reason to use more than 32—for the next several years at least. In
   response to this, 3SPIN currently limits the user to $k = 32$. The
   point of this observation is that we do not have to worry about the
   runtime cost of $k$ being on the order of 64 or 100, because those
   choices do not really buy us anything over 32."

Here 'm' is the number of bits in Bloom filter, and m/n is number of
bits per element added to filter.

> +	uint32_t bits_per_entry;

All right, we wouldn't really want large Bloom filters, as we use one
filter per commit to match againts one key, not single Bloom filter to
match againts many keys.

> +};
> +
> +#define DEFAULT_BLOOM_FILTER_SETTINGS { 1, 7, 10 }
> +
> +/*
> + * A bloom_filter struct represents a data segment to
> + * use when testing hash values. The 'len' member
> + * dictates how many uint64_t entries are stored in
> + * 'data'.
> + */
> +struct bloom_filter {
> +	uint64_t *data;
> +	int len;
> +};

O.K., so it is single variable-sized (in 64-bit increments) Bloom filter
bit vector (bitmap).

> +
> +/*
> + * A bloom_key represents the k hash values for a
> + * given hash input. These can be precomputed and
> + * stored in a bloom_key for re-use when testing
> + * against a bloom_filter.
> + */
> +struct bloom_key {
> +	uint32_t *hashes;
> +};

That is smart.  I wonder however if it wouldn't be a good idea to
'typedef' a hash function return type.

I repeat myself: in Git case we have one key that we try to match
against many Bloom filters which are never updated, while in an ordinary
case many keys are matched against single Bloom filter - in many cases
updated (with keys inserted to Bloom filter).

I wonder if somebody from academia have examined such situation.
I couldn't find a good search query.


Sidenote: perhaps Xor or Xor+ filters from Graf & Lemire (2019)
https://arxiv.org/abs/1912.08258 would be better solution - they also
assume unchanging filter.  Though they are a very fresh proposal;
also construction time might be important for Git.
https://github.com/FastFilter/xor_singleheader

> +
> +void load_bloom_filters(void);
> +
> +struct bloom_filter *get_bloom_filter(struct repository *r,
> +				      struct commit *c);

Those two functions really need API documentation on how they are used,
if they are to be used in any other role, especially what is their
calling convention?  Why load_bloom_filters() doesn't take any
parameters?

Anyway, if this patch would be split into pure Bloom filter
implementation and Git use^W store of Bloom filters, then this would be
left for the latter patch.

> +
> +void fill_bloom_key(const char *data,
> +		    int len,
> +		    struct bloom_key *key,
> +		    struct bloom_filter_settings *settings);

It is a bit strange that two basic Bloom filter operations, namely
adding element to Bloom filter (and constructing Bloom filter), and
testing whether element is in Bloom filter are not part of a public
API...

This function should probably be documented, in particular the fact that
*key is an in/out parameter.  This could also be a good place to
document the mechanism itself (i.e. our implementation of Bloom filter,
with references), though it might be better to keep the details of how
it works in the bloom.c - close to the actual source (while keeping
description of API in bloom.h comments).

> +
> +#endif
> diff --git a/bloom.c b/bloom.c
> new file mode 100644
> index 0000000000..08328cc381
> --- /dev/null
> +++ b/bloom.c
> @@ -0,0 +1,201 @@
> +#include "git-compat-util.h"
> +#include "bloom.h"
> +#include "commit-graph.h"
> +#include "object-store.h"
> +#include "diff.h"
> +#include "diffcore.h"
> +#include "revision.h"
> +#include "hashmap.h"
> +
> +#define BITS_PER_BLOCK 64
> +
> +define_commit_slab(bloom_filter_slab, struct bloom_filter);
> +
> +struct bloom_filter_slab bloom_filters;

All right, so the Bloom filter data would be on slab.  This should
probably be mentioned in the commit message, like in
https://lore.kernel.org/git/61559c5b-546e-d61b-d2e1-68de692f5972@gmail.com/

Sidenote: If I remember correctly one of the unmet prerequisites for
switching to generation numbers v2 (corrected commit date with monotonic
offsets) was moving 'generation' field out of 'struct commit' and on to
slab (possibly also 'graph_pos'), and certainly having 'corrected_date'
on slab (Inside-Out Object style).  Which probably could be done with
Coccinelle script...

> +
> +struct pathmap_hash_entry {
> +    struct hashmap_entry entry;
> +    const char path[FLEX_ARRAY];
> +};

Hmmm... I wonder why use hashmap and not string_list.  This is for
adding path with leading directories to the Bloom filter, isn't it?

> +
> +static uint32_t rotate_right(uint32_t value, int32_t count)
> +{
> +	uint32_t mask = 8 * sizeof(uint32_t) - 1;
> +	count &= mask;
> +	return ((value >> count) | (value << ((-count) & mask)));
> +}

Does it actually work with count being negative?  Shouldn't 'count' be
of unsigned type, and if int32_t is needed, perhaps add an assertion (if
needed)?  I think it does not.

It looks like it is John Regehr [2] safe and compiler-friendly
implementation, with explicit 8 in place of CHAR_BIT from <limits.h>,
which should compile to "rotate" assembly instruction... it looks like
it is the case, see https://godbolt.org/z/5JP1Jb (at least for C++
compiler).

[2]: https://en.wikipedia.org/wiki/Circular_shift


I wonder if this should, in the future, be a part of 'compat/', maybe
even using compiler intrinsics for "rotate right" if available (see
https://stackoverflow.com/a/776523/46058).  But that might be outside of
the scope of this patch (perhaps outside of choosing function name).

> +

It would be nice to have reference to the source of algorithm, or to the
code that was borrowed for this in the header comment for the following
function.

I will be comparing the algorithm itself in Wikipedia
https://en.wikipedia.org/wiki/MurmurHash#Algorithm
and its implementation in C in qLibc library (BSD licensed)
https://github.com/wolkykim/qlibc/blob/03a8ce035391adf88d6d755f9a26967c16a1a567/src/utilities/qhash.c#L258

> +static uint32_t seed_murmur3(uint32_t seed, const char *data, int len)
> +{
> +	const uint32_t c1 = 0xcc9e2d51;
> +	const uint32_t c2 = 0x1b873593;
> +	const int32_t r1 = 15;
> +	const int32_t r2 = 13;

Those two: r1 and r1, probably should be both uint32_t type.

> +	const uint32_t m = 5;
> +	const uint32_t n = 0xe6546b64;
> +	int i;
> +	uint32_t k1 = 0;
> +	const char *tail;

*tail should probably be 'uint8_t', not 'char', isn't it?

> +
> +	int len4 = len / sizeof(uint32_t);

All length variables and parameters, i.e. `len`, `len4`, `i`, could
possibly be `size_t` and not `int` type.

> +
> +	const uint32_t *blocks = (const uint32_t*)data;
> +

Some implementations copy `seed` (or assume seed=0) to the local
variable named `h` or `hash`.

> +	uint32_t k;
> +	for (i = 0; i < len4; i++)
> +	{
> +		k = blocks[i];
> +		k *= c1;
> +		k = rotate_right(k, r1);

Shouldn't it be *rotate_left* (ROL), not rotate_right (ROR)???
This affects all cases / uses.

> +		k *= c2;
> +
> +		seed ^= k;
> +		seed = rotate_right(seed, r2) * m + n;
> +	}
> +
> +	tail = (data + len4 * sizeof(uint32_t));
> +

We could have reused variable `k`, like the implementation in qLibc
does, instead of introducing new `k1` variable, but this way it is more
clean.  Or name it `remainingBytes` instead of `k1`

> +	switch (len & (sizeof(uint32_t) - 1))
> +	{
> +	case 3:
> +		k1 ^= ((uint32_t)tail[2]) << 16;
> +		/*-fallthrough*/
> +	case 2:
> +		k1 ^= ((uint32_t)tail[1]) << 8;
> +		/*-fallthrough*/
> +	case 1:
> +		k1 ^= ((uint32_t)tail[0]) << 0;
> +		k1 *= c1;
> +		k1 = rotate_right(k1, r1);
> +		k1 *= c2;
> +		seed ^= k1;
> +		break;
> +	}
> +
> +	seed ^= (uint32_t)len;
> +	seed ^= (seed >> 16);
> +	seed *= 0x85ebca6b;
> +	seed ^= (seed >> 13);
> +	seed *= 0xc2b2ae35;
> +	seed ^= (seed >> 16);
> +
> +	return seed;
> +}
> +

It would be nice to have header comment describing what this function is
intended to actually do.

> +static inline uint64_t get_bitmask(uint32_t pos)
> +{
> +	return ((uint64_t)1) << (pos & (BITS_PER_BLOCK - 1));
> +}

Sidenote: I wonder if ewah/bitmap.c implements something similar.
Certainly possible consolidation, if any possible exists, should be left
for the future.

> +
> +void fill_bloom_key(const char *data,
> +		    int len,
> +		    struct bloom_key *key,
> +		    struct bloom_filter_settings *settings)
> +{
> +	int i;
> +	uint32_t seed0 = 0x293ae76f;
> +	uint32_t seed1 = 0x7e646e2c;

Where did those constants came from?  It would be nice to have a
reference either in header comment (in bloom.h or bloom.c), or in a
commit message, or both.

Note that above *constants* are each used only once.

> +
> +	uint32_t hash0 = seed_murmur3(seed0, data, len);
> +	uint32_t hash1 = seed_murmur3(seed1, data, len);

Those are constant values, so perhaps they should be `const uint32_t`.

> +
> +	key->hashes = (uint32_t *)xcalloc(settings->num_hashes, sizeof(uint32_t));
> +	for (i = 0; i < settings->num_hashes; i++)
> +		key->hashes[i] = hash0 + i * hash1;

It looks like this code implements the double hashing technique given in
Eq. (4) in http://www.ccs.neu.edu/home/pete/pub/bloom-filters-verification.pdf
that is "Bloom Filters in Probabilistic Verification".

Note that Dillinger and Manolios in this paper propose also _enhanced_
double hashing algorithm (Algorithm 2 on page 11), which has closed form
given by Eq. (6) - with better science-theoretical properties at
similar cost.


It might be a good idea to explicitly state in the header comment that
all arithmetic is performed with unsigned 32-bit integers, which means
that operations are performed modulo 2^32.  Or it might not be needed.

> +}
> +
> +static void add_key_to_filter(struct bloom_key *key,
> +			      struct bloom_filter *filter,
> +			      struct bloom_filter_settings *settings)
> +{
> +	int i;
> +	uint64_t mod = filter->len * BITS_PER_BLOCK;
> +
> +	for (i = 0; i < settings->num_hashes; i++) {
> +		uint64_t hash_mod = key->hashes[i] % mod;
> +		uint64_t block_pos = hash_mod / BITS_PER_BLOCK;

All right.  Because Bloom filters for different commits (and the same
key) may have different lengths, we can perform modulo operation only
here.  `hash_mod` is i-th hash modulo size of filter, and `block_pos` is
the block the '1' bit would go into.

> +
> +		filter->data[block_pos] |= get_bitmask(hash_mod);

I'm not quite convinced that get_bitmask() is a good name: this function
returns bitmap with hash_mod's bit set to 1.  On the other hand it
doesn't matter, because it is static (file-local) helper function.

Never mind then.

> +	}
> +}
> +
> +void load_bloom_filters(void)
> +{
> +	init_bloom_filter_slab(&bloom_filters);
> +}

Why *load* if all it does is initialize?

> +
> +struct bloom_filter *get_bloom_filter(struct repository *r,
> +				      struct commit *c)

I will not comment on this function; see Jeff King reply and Derrick
Stolee reply.

> +{
[...]
> +}
> \ No newline at end of file

Why there is no newline at the end of the file?  Accident?

> diff --git a/commit-graph.c b/commit-graph.c
> index e771394aff..61e60ff98a 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -16,6 +16,7 @@
>  #include "hashmap.h"
>  #include "replace-object.h"
>  #include "progress.h"
> +#include "bloom.h"
>  
>  #define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
>  #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
> @@ -794,9 +795,11 @@ struct write_commit_graph_context {
>  	unsigned append:1,
>  		 report_progress:1,
>  		 split:1,
> -		 check_oids:1;
> +		 check_oids:1,
> +		 bloom:1;

Very minor nitpick: why `bloom` and not `bloom_filter`?

>  
>  	const struct split_commit_graph_opts *split_opts;
> +	uint32_t total_bloom_filter_size;

All right, I guess size of all Bloom filters would fit in uint32_t, no
need for size_t, is it?

Shouldn't it be total_bloom_filters_size -- it is not a single Bloom
filter, but many (minor nitpick)?

>  };
>  
>  static void write_graph_chunk_fanout(struct hashfile *f,
> @@ -1139,6 +1142,28 @@ static void compute_generation_numbers(struct write_commit_graph_context *ctx)
>  	stop_progress(&ctx->progress);
>  }
>  
> +static void compute_bloom_filters(struct write_commit_graph_context *ctx)
> +{
> +	int i;
> +	struct progress *progress = NULL;
> +
> +	load_bloom_filters();
> +
> +	if (ctx->report_progress)
> +		progress = start_progress(
> +			_("Computing commit diff Bloom filters"),
> +			ctx->commits.nr);
> +
> +	for (i = 0; i < ctx->commits.nr; i++) {
> +		struct commit *c = ctx->commits.list[i];
> +		struct bloom_filter *filter = get_bloom_filter(ctx->r, c);
> +		ctx->total_bloom_filter_size += sizeof(uint64_t) * filter->len;

Wouldn't it be more future proof instead of using `sizeof(uint64_t)` to
use `sizeof(filter->data[0])` here?  This may be not worth it, and be
less readable (we have hard-coded use of 64-bits blocks in other places).

> +		display_progress(progress, i + 1);
> +	}
> +
> +	stop_progress(&progress);
> +}
> +
>  static int add_ref_to_list(const char *refname,
>  			   const struct object_id *oid,
>  			   int flags, void *cb_data)
> @@ -1791,6 +1816,8 @@ int write_commit_graph(const char *obj_dir,
>  	ctx->split = flags & COMMIT_GRAPH_WRITE_SPLIT ? 1 : 0;
>  	ctx->check_oids = flags & COMMIT_GRAPH_WRITE_CHECK_OIDS ? 1 : 0;
>  	ctx->split_opts = split_opts;
> +	ctx->bloom = flags & COMMIT_GRAPH_WRITE_BLOOM_FILTERS ? 1 : 0;

All right, this flag was defined in [PATCH 1/9].

The ordering of setting `ctx` members looks a bit strange.  Now it is
neither check `flags` firsts, neither keep related stuff together (see
ctx->split vs ctx->split_opts).  This is a very minor nitpick.

> +	ctx->total_bloom_filter_size = 0;
>  
>  	if (ctx->split) {
>  		struct commit_graph *g;
> @@ -1885,6 +1912,9 @@ int write_commit_graph(const char *obj_dir,
>  
>  	compute_generation_numbers(ctx);
>  
> +	if (ctx->bloom)
> +		compute_bloom_filters(ctx);
> +
>  	res = write_commit_graph_file(ctx);
>  
>  	if (ctx->split)

Regards,
-- 
Jakub Narębski

^ permalink raw reply

* Re: git-am doesn't strip CRLF line endings when the mbox is base64-encoded
From: George Dunlap @ 2020-01-06 18:10 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqpnfwa4y9.fsf@gitster-ct.c.googlers.com>

On 1/6/20 5:07 PM, Junio C Hamano wrote:
> George Dunlap <george.dunlap@citrix.com> writes:
> 
>> On 12/18/19 12:15 PM, George Dunlap wrote:
>>> On 12/18/19 11:42 AM, George Dunlap wrote:
>>>> Using git 2.24.0 from Debian testing.
>>>>
>>>> It seems that git-am will strip CRLF endings from mails before applying
>>>> patches when the mail isn't encoded in any way.  It will also decode
>>>> base64-encoded mails.  But it won't strip CRLF endings from
>>>> base64-encoded mails.
>>>>
>>>> Attached are two mbox files for two different recent series.
>>>> plainenc.am applies cleanly with `git am`, while base64enc.am doesn't.
>>>>
>>>> Poking around the man pages, it looks like part of the issue might be
>>>> that the CRLF stripping is done in `git mailsplit`, before the base64
>>>> encoding, rather than after.
>>>
>>> Poking around -- it looks like the CRLF stripping would be better done
>>> in `git mailinfo` after the decoding.
>>
>> Anyone want to take this up?  I mean, I could try to send a patch, but
>> since I've never looked at the git source code before, I'm sure it would
>> take me about 10x as much effort for me to do it as for someone already
>> familiar with the codebase.
> 
> Even before writing a patch, somebody needs to come up with a
> sensible design first.  --[no-]keep-cr is about "because transfer of
> e-mail messages between MTAs and to the receiving MUA is defined in
> terms of CRLF delimited lines per RFC, Git cannot tell if the CRLF
> in the input was meant to be part of the patch (i.e. the diff is
> describing a change between preimage and postimage of a file that
> uses CRLF line endings) or they are cruft added during transit.  By
> default we favor LF endings so we will strip, but we leave an option
> to keep CRs at the end of lines".  
> 
> What you are asking for is quite different, isn't it?  "We know the
> CRLF in the payload is from the original because they were protected
> from getting munged during the transfer by being MIME-encased.
> Please tell Git to preprocess that payload to convert CRLF to LF
> before treating it as a patch".

Actually that's not true. :-)   The RFC specifies CRLF for text sections
regardless of the encoding:

8<----
   The canonical form of any MIME "text" subtype MUST always represent a
   line break as a CRLF sequence.  Similarly, any occurrence of CRLF in
   MIME "text" MUST represent a line break.  Use of CR and LF outside of
   line break sequences is also forbidden.
----->8

[1] https://tools.ietf.org/html/rfc2046#section-4.1.1

Just as for plaintext encoding, we do not, in fact, know that CRLF is in
the original; in the example I included above, I'm confident that CRLF
is *not* in the original, but was added by an MTA afterwards.

As such, at the moment what `mailsplit` does is generate a load of
non-RFC-compliant email messages (since text-plain encoded messages
won't have CRLF, in contravention of the spec), and `mailinfo`
incorrectly interprets base64-encoded sections.  Moving the CR-stripping
from mailsplit to mailinfo would make them both more RFC-compliant.

(Sorry this wasn't in the original report -- I've been doing a lot more
digging since then.)

> So, if you are imagining to change the meaning of --[no-]keep-cr, I
> do not think it will fly (that is why I said that we need a sensible
> design before a patch).
> 
> And by stepping back a bit like so, and once we start viewing this
> as "after receiving a piece of e-mail from MUA (where --[no-]keep-cr
> may affect the outermost CRLF line endings) and unwrapping possible
> MIME-encasing, we can optionally tell Git to pass the payload
> further through a preprocess filter", we'd realize that this does
> not have to be limited to just running dos2unix (you may want to run
> iconv to fix encodings, for example), which would mean that the new
> flag may not just want to be --strip-cr, which is too limiting, but
> rather want to be --filter-message=<how> where <how> could be one of
> the canned preprocess filter (among which your dos2unix may exist)
> or an external script.
> 
> I am not saying that "--filter-message=<how>" must be the "sensible
> design" I mentioned at the beginning of this message---the above is
> to illustrate what kind of thought needs to go in before even the
> first line of the patch gets written.
As I mentioned in another thread, there's already an `applypatch-msg`
hook which can be used to do arbitrary modifications on commit messages
before applying.  Another way to fix this would be to add an
`applypatch-patch` hook, which allowed you to do arbitrary modifications
on the patch before applying.

I certainly think that `applypatch-patch` would be a useful thing to
add.  But since making `mailsplit` and `mailinfo` more RFC-compliant is
both good in itself, and probably easier, I still think that's the best
thing to do first.

 -George

^ permalink raw reply

* Re: [PATCH 1/1] worktree: delete branches auto-created by 'worktree add'
From: Pratyush Yadav @ 2020-01-06 18:01 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List
In-Reply-To: <CAPig+cQmqKiYWDWFH5eK2S6XPOi2t2+8Oas8yZa8R=bKLym3wQ@mail.gmail.com>

On 05/01/20 11:20PM, Eric Sunshine wrote:
> On Fri, Dec 27, 2019 at 6:05 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> > On Sat, Dec 14, 2019 at 11:16 AM Pratyush Yadav <me@yadavpratyush.com> wrote:
> > > When no branch name is supplied to 'worktree add', it creates a new
> > > branch based on the name of the directory the new worktree is located
> > > in. But when the worktree is later removed, that created branch is left
> > > over.
> >
> > This is describing the existing (intentional) behavior but doesn't
> > explain why this might be annoying or problematic. To help sell the
> > patch, it might make sense to say something about how the behavior can
> > trip up newcomers to git-worktree, leaving them to wonder why they are
> > accumulating so many branches that they weren't aware they created. A
> > comment about why you think "git worktree add -d foo" is not a viable
> > way to side-step the creation of unwanted branches might also be
> > worthwhile.
> 
> As an alternative to this patch, would the simpler approach of
> improving git-worktree documentation to do a better job of pointing
> people at -d/--detach as a way to side-step unwanted branch creation
> make sense? That is, at minimum, enhance the "Description" section to
> prominently talk about throwaway worktrees (created with -d/--detach),
> and add an example to the "Examples" section (perhaps as the first
> example) showing creation/use/deletion of a throwaway worktree.
> 
> Some points in favor of just updating the documentation to address
> this issue (rather than implementing the new behavior suggested by
> this patch) include:
> 
> * far simpler; no code to implement or debug
> 
> * no (surprising) behavior changes
> 
> * "git worktree add -d foo" is about as easy to type and remember as
>   "git worktree add foo"

FYI, I'm running Git v2.24.1 and 'git worktree add' doesn't accept the 
option '-d'. It only accepts '--detach'. And looking at the current 
'next', I don't see the option mentioned in git-worktree.txt. So at the 
very least, we should start by actually adding the option.
 
> Of lesser importance, it might make sense, as a followup, to add a
> configuration which changes the default behavior to detach instead of
> auto-creating a branch. I wonder if this could be piggybacked on the
> existing "worktree.guessremote" configuration. Or rather,
> retire/deprecate that configuration and add a new one which affects
> DWIM'ing behavior such that it becomes multi-state. Some possible
> values for the new configuration: "auto" (or "dwim" or whatever),
> "guessremote", "detach". (I haven't thought this through thoroughly,
> so there might be holes in my suggestion.)

Honestly, coupled with a configuration variable this alternative fits my 
use-case really well.

I think 'guessremote' does not describe very well what the config 
variable would actually do. So I think deprecating it would be a better 
idea.

Does 'worktree.newBranch' sound like a good name? (Disclaimer: I am 
terrible at naming things).
 
> There's at least one point not in favor of merely updating the
> documentation to promote -d/--detach more heavily, and that is that
> (presumably) the concept of detached HEAD is perceived as an advanced
> topic, so it may not be suitable for the newcomer or casual user.

I'm basing this off no data so take it with a grain of salt, but I think 
people who know Git enough to understand the concept of multiple 
worktrees should also understand what a detached HEAD is. And even if 
they already don't know what it is, they should have no trouble quickly 
reading one of the many great explanations available with a simple 
Google search.

My argument in favor of auto-deletion is that we should still try to 
have sane defaults. Leaving behind a branch the user didn't explicitly 
create and didn't use doesn't sound like a sane default to me.

The configuration variable path is easier and suits my needs really 
well, so I am inclined to just go with it. But making the whole user 
experience better for everyone is still something worthwhile. But then 
again, introducing a backwards-incompatible change might not be the best 
idea. So, I dunno.

-- 
Regards,
Pratyush Yadav

^ permalink raw reply

* Re: [PATCH 1/1] sequencer: comment out the 'squash!' line
From: Mike Rappazzo @ 2020-01-06 17:34 UTC (permalink / raw)
  To: phillip.wood; +Cc: Michael Rappazzo via GitGitGadget, Git List, Junio C Hamano
In-Reply-To: <13b47c13-7a8b-a205-0cdb-5fdcb8d42412@gmail.com>

On Mon, Jan 6, 2020 at 12:10 PM Phillip Wood <phillip.wood123@gmail.com> wrote:
>
> Hi Michael
>
> On 06/01/2020 16:04, Michael Rappazzo via GitGitGadget wrote:
> > From: Michael Rappazzo <rappazzo@gmail.com>
> >
> > When performing a squash commit, the commit comments are combined into a
> > single commit.  Since the subject line of the squash commit is used to
> > identify the squash-to target commit, it cannot offer any useful
> > contribution to the new commit message.  Therefore, the squash commit
> > subject line it commented out from the combined message (much like a
> > fixup commit's full comment).
>
> I like the idea but I think it would be better to only comment out the
> subject of the commit message if it starts with squash! for fixup!
> otherwise it may well be a useful part of the message. For correctness I
> think it would be better to comment out the subject (everything before
> the first blank line as returned by `git log --pretty=%s`) rather than
> just the first line. I've actually implemented this as part of a longer
> series that I've never got round to posting to the list[1] - feel free
> to use any of that which you find useful. That commit also shows an
> alternative way to change the --autosquash tests.
>
> [1]
> https://github.com/phillipwood/git/commit/b91b492e4aba1ac8d244859361379d5063cfc2b8

That makes sense.  Since your implementation is probably better, it
seems like the
only thing useful left from mine is the test that I added.  I'll look
to resubmit the patch
with your commit.

> > The body of a squash commit may contain additional content to add to the
> > commit message, so this part of the squash commit message is retained.
> >
> > Since this change what the expected post-rebase commit comment would look
> > like, related test expectations are adjusted to reflect the the new
> > expectation.  A new test is added for the new expectation.
> >
> > Signed-off-by: Michael Rappazzo <rappazzo@gmail.com>
> > ---
> >   sequencer.c                   |  1 +
> >   t/t3404-rebase-interactive.sh |  4 +---
> >   t/t3415-rebase-autosquash.sh  | 36 +++++++++++++++++++++++++++--------
> >   t/t3900-i18n-commit.sh        |  4 ----
> >   4 files changed, 30 insertions(+), 15 deletions(-)
> >
> > diff --git a/sequencer.c b/sequencer.c
> > index 763ccbbc45..e5602686d7 100644
> > --- a/sequencer.c
> > +++ b/sequencer.c
> > @@ -1756,6 +1756,7 @@ static int update_squash_messages(struct repository *r,
> >               strbuf_addf(&buf, _("This is the commit message #%d:"),
> >                           ++opts->current_fixup_count + 1);
> >               strbuf_addstr(&buf, "\n\n");
> > +             strbuf_addf(&buf, "%c ", comment_line_char);
> >               strbuf_addstr(&buf, body);
> >       } else if (command == TODO_FIXUP) {
> >               strbuf_addf(&buf, "\n%c ", comment_line_char);
> > diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> > index ae6e55ce79..57d178d431 100755
> > --- a/t/t3404-rebase-interactive.sh
> > +++ b/t/t3404-rebase-interactive.sh
> > @@ -513,8 +513,6 @@ test_expect_success C_LOCALE_OUTPUT 'squash and fixup generate correct log messa
> >       cat >expect-squash-fixup <<-\EOF &&
> >       B
> >
> > -     D
> > -
> >       ONCE
> >       EOF
> >       git checkout -b squash-fixup E &&
> > @@ -1325,7 +1323,7 @@ test_expect_success 'rebase -i commits that overwrite untracked files (squash)'
> >       test_cmp_rev HEAD F &&
> >       rm file6 &&
> >       git rebase --continue &&
> > -     test $(git cat-file commit HEAD | sed -ne \$p) = I &&
> > +     test $(git cat-file commit HEAD | sed -ne \$p) = F &&
> >       git reset --hard original-branch2
> >   '
> >
> > diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
> > index 22d218698e..51c5a94aea 100755
> > --- a/t/t3415-rebase-autosquash.sh
> > +++ b/t/t3415-rebase-autosquash.sh
> > @@ -59,7 +59,6 @@ test_auto_squash () {
> >       git add -u &&
> >       test_tick &&
> >       git commit -m "squash! first" &&
> > -
> >       git tag $1 &&
> >       test_tick &&
> >       git rebase $2 -i HEAD^^^ &&
> > @@ -67,7 +66,7 @@ test_auto_squash () {
> >       test_line_count = 3 actual &&
> >       git diff --exit-code $1 &&
> >       test 1 = "$(git cat-file blob HEAD^:file1)" &&
> > -     test 2 = $(git cat-file commit HEAD^ | grep first | wc -l)
> > +     test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
> >   }
> >
> >   test_expect_success 'auto squash (option)' '
> > @@ -82,6 +81,27 @@ test_expect_success 'auto squash (config)' '
> >       test_must_fail test_auto_squash final-squash-config-false
> >   '
> >
> > +test_expect_success 'auto squash includes squash body but not squash directive' '
> > +     git reset --hard base &&
> > +     echo 1 >file1 &&
> > +     git add -u &&
> > +     test_tick &&
> > +     git commit -m "squash! first
> > +
> > +Additional Body" &&
> git commit --squash=first -m "Additional Body"
> would avoid the multi line argument
>
> > +     git tag squash-with-body &&
> > +     test_tick &&
> > +     git rebase --autosquash -i HEAD^^^ &&
> > +     git log --oneline >actual &&
> > +     git log --oneline --format="%s%n%b" >actual-full &&
>
> git log --format=%B ?

The difference is that %B has the extra blank line.  The check below wouldn't
see the difference, so I guess %B is easier to read.

>
> > +     test_line_count = 3 actual &&
> > +     git diff --exit-code squash-with-body &&
> > +     test 1 = "$(git cat-file blob HEAD^:file1)" &&
> > +     test 1 = $(git cat-file commit HEAD^ | grep first | wc -l) &&
> > +     test 0 = $(grep squash actual-full | wc -l) &&
>
> grep -v squash actual-full
> is simpler I think
>
> Best Wishes
>
> Phillip
>
> > +     test 1 = $(grep Additional actual-full | wc -l)
> > +'
> > +
> >   test_expect_success 'misspelled auto squash' '
> >       git reset --hard base &&
> >       echo 1 >file1 &&
> > @@ -114,7 +134,7 @@ test_expect_success 'auto squash that matches 2 commits' '
> >       test_line_count = 4 actual &&
> >       git diff --exit-code final-multisquash &&
> >       test 1 = "$(git cat-file blob HEAD^^:file1)" &&
> > -     test 2 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
> > +     test 1 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
> >       test 1 = $(git cat-file commit HEAD | grep first | wc -l)
> >   '
> >
> > @@ -152,7 +172,7 @@ test_expect_success 'auto squash that matches a sha1' '
> >       test_line_count = 3 actual &&
> >       git diff --exit-code final-shasquash &&
> >       test 1 = "$(git cat-file blob HEAD^:file1)" &&
> > -     test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> > +     test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> >   '
> >
> >   test_expect_success 'auto squash that matches longer sha1' '
> > @@ -168,7 +188,7 @@ test_expect_success 'auto squash that matches longer sha1' '
> >       test_line_count = 3 actual &&
> >       git diff --exit-code final-longshasquash &&
> >       test 1 = "$(git cat-file blob HEAD^:file1)" &&
> > -     test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> > +     test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> >   '
> >
> >   test_auto_commit_flags () {
> > @@ -192,7 +212,7 @@ test_expect_success 'use commit --fixup' '
> >   '
> >
> >   test_expect_success 'use commit --squash' '
> > -     test_auto_commit_flags squash 2
> > +     test_auto_commit_flags squash 1
> >   '
> >
> >   test_auto_fixup_fixup () {
> > @@ -228,7 +248,7 @@ test_auto_fixup_fixup () {
> >               test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
> >       elif test "$1" = "squash"
> >       then
> > -             test 3 = $(git cat-file commit HEAD^ | grep first | wc -l)
> > +             test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
> >       else
> >               false
> >       fi
> > @@ -268,7 +288,7 @@ test_expect_success C_LOCALE_OUTPUT 'autosquash with custom inst format' '
> >       test_line_count = 3 actual &&
> >       git diff --exit-code final-squash-instFmt &&
> >       test 1 = "$(git cat-file blob HEAD^:file1)" &&
> > -     test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> > +     test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> >   '
> >
> >   test_expect_success 'autosquash with empty custom instructionFormat' '
> > diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
> > index d277a9f4b7..bfab245eb3 100755
> > --- a/t/t3900-i18n-commit.sh
> > +++ b/t/t3900-i18n-commit.sh
> > @@ -226,10 +226,6 @@ test_commit_autosquash_multi_encoding () {
> >               git rev-list HEAD >actual &&
> >               test_line_count = 3 actual &&
> >               iconv -f $old -t UTF-8 "$TEST_DIRECTORY"/t3900/$msg >expect &&
> > -             if test $flag = squash; then
> > -                     subject="$(head -1 expect)" &&
> > -                     printf "\nsquash! %s\n" "$subject" >>expect
> > -             fi &&
> >               git cat-file commit HEAD^ >raw &&
> >               (sed "1,/^$/d" raw | iconv -f $new -t utf-8) >actual &&
> >               test_cmp expect actual
> >

^ permalink raw reply

* Re: [PATCH 0/1] sequencer: comment out the 'squash!' line
From: Junio C Hamano @ 2020-01-06 17:32 UTC (permalink / raw)
  To: Michael Rappazzo via GitGitGadget; +Cc: git
In-Reply-To: <pull.511.git.1578326648.gitgitgadget@gmail.com>

"Michael Rappazzo via GitGitGadget" <gitgitgadget@gmail.com> writes:

> Since this change what the expected post-rebase commit comment would look
> like, related test expectations are adjusted to reflect the the new
> expectation. A new test is added for the new expectation.

Doesn't that mean automated tools people may have written require
similar adjustment to continue working correctly if this change is
applied?

Can you tell us more about your expected use case?  I am imagining
that most people use the log messages from both/all commits being
squashed when manually editing to perfect the final log message (as
opposed to mechanically processing the concatenated message), so it
shouldn't matter if the squash! title is untouched or commented out
to them, and those (probably minority) who are mechanical processing
will be hurt with this change, so I do not quite see the point of
this patch.

Thanks.

>
> Signed-off-by: Michael Rappazzo rappazzo@gmail.com [rappazzo@gmail.com]
>
> Michael Rappazzo (1):
>   sequencer: comment out the 'squash!' line
>
>  sequencer.c                   |  1 +
>  t/t3404-rebase-interactive.sh |  4 +---
>  t/t3415-rebase-autosquash.sh  | 36 +++++++++++++++++++++++++++--------
>  t/t3900-i18n-commit.sh        |  4 ----
>  4 files changed, 30 insertions(+), 15 deletions(-)
>
>
> base-commit: 8679ef24ed64018bb62170c43ce73e0261c0600a
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-511%2Frappazzo%2Fcomment-squash-subject-line-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-511/rappazzo/comment-squash-subject-line-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/511

^ permalink raw reply

* Re: [PATCH 1/1] doc: submodule: fix typo for command absorbgitdirs
From: Junio C Hamano @ 2020-01-06 17:18 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, thomas menzel
In-Reply-To: <b032358ca97df3bd5605ff356196f82a1f16a320.1578322213.git.gitgitgadget@gmail.com>

When anybody tries to respond to the quoted message [*1*] that came
from GGG in the way recommended for the list, the
*.noreply.github.com address gets placed on the Cc:, which is
probably not wanted.

Can GGG do something about it?  As GitHub specific service, it is
not wrong for it to know what noreply.github.com means so dropping
such an address from its Cc: line shouldn't be too difficult.

Thanks.


Reference *1* 

<b032358ca97df3bd5605ff356196f82a1f16a320.1578322213.git.gitgitgadget@gmail.com>


"elonderin via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: elonderin <8225602+elonderin@users.noreply.github.com>
>
> The sentence wants to talk about the superproject's possesive, not plural form.
>
> Signed-off-by: thomas menzel dev@tomsit.de
> ---
>  Documentation/git-submodule.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 22425cbc76..5232407f68 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -248,7 +248,7 @@ registered submodules, and sync any nested submodules within.
>  
>  absorbgitdirs::
>  	If a git directory of a submodule is inside the submodule,
> -	move the git directory of the submodule into its superprojects
> +	move the git directory of the submodule into its superproject's
>  	`$GIT_DIR/modules` path and then connect the git directory and
>  	its working directory by setting the `core.worktree` and adding
>  	a .git file pointing to the git directory embedded in the

^ permalink raw reply

* Re: [PATCH 1/1] doc: submodule: fix typo for command absorbgitdirs
From: Junio C Hamano @ 2020-01-06 17:13 UTC (permalink / raw)
  To: elonderin via GitGitGadget; +Cc: git, thomas menzel
In-Reply-To: <b032358ca97df3bd5605ff356196f82a1f16a320.1578322213.git.gitgitgadget@gmail.com>

"elonderin via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: elonderin <8225602+elonderin@users.noreply.github.com>
>
> The sentence wants to talk about the superproject's possesive, not plural form.
>
> Signed-off-by: thomas menzel dev@tomsit.de

The author of the patch must sign-off, so the above three lines
should look more like so:

	From: Thomas Menzel <dev@tomsit.de>

	The sentence wants to talk about the superproject's
	possesive, not plural form.

	From: Thomas Menzel <dev@tomsit.de>

The content of the patch obviously is correct.

Thanks.

> ---
>  Documentation/git-submodule.txt | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
> index 22425cbc76..5232407f68 100644
> --- a/Documentation/git-submodule.txt
> +++ b/Documentation/git-submodule.txt
> @@ -248,7 +248,7 @@ registered submodules, and sync any nested submodules within.
>  
>  absorbgitdirs::
>  	If a git directory of a submodule is inside the submodule,
> -	move the git directory of the submodule into its superprojects
> +	move the git directory of the submodule into its superproject's
>  	`$GIT_DIR/modules` path and then connect the git directory and
>  	its working directory by setting the `core.worktree` and adding
>  	a .git file pointing to the git directory embedded in the

^ permalink raw reply

* Re: [PATCH 1/1] sequencer: comment out the 'squash!' line
From: Phillip Wood @ 2020-01-06 17:10 UTC (permalink / raw)
  To: Michael Rappazzo via GitGitGadget, git; +Cc: Junio C Hamano, Michael Rappazzo
In-Reply-To: <b262a9d099b882339e9cb930b0a09fd5fe6734b0.1578326648.git.gitgitgadget@gmail.com>

Hi Michael

On 06/01/2020 16:04, Michael Rappazzo via GitGitGadget wrote:
> From: Michael Rappazzo <rappazzo@gmail.com>
> 
> When performing a squash commit, the commit comments are combined into a
> single commit.  Since the subject line of the squash commit is used to
> identify the squash-to target commit, it cannot offer any useful
> contribution to the new commit message.  Therefore, the squash commit
> subject line it commented out from the combined message (much like a
> fixup commit's full comment).

I like the idea but I think it would be better to only comment out the 
subject of the commit message if it starts with squash! for fixup! 
otherwise it may well be a useful part of the message. For correctness I 
think it would be better to comment out the subject (everything before 
the first blank line as returned by `git log --pretty=%s`) rather than 
just the first line. I've actually implemented this as part of a longer 
series that I've never got round to posting to the list[1] - feel free 
to use any of that which you find useful. That commit also shows an 
alternative way to change the --autosquash tests.

[1] 
https://github.com/phillipwood/git/commit/b91b492e4aba1ac8d244859361379d5063cfc2b8

> The body of a squash commit may contain additional content to add to the
> commit message, so this part of the squash commit message is retained.
> 
> Since this change what the expected post-rebase commit comment would look
> like, related test expectations are adjusted to reflect the the new
> expectation.  A new test is added for the new expectation.
> 
> Signed-off-by: Michael Rappazzo <rappazzo@gmail.com>
> ---
>   sequencer.c                   |  1 +
>   t/t3404-rebase-interactive.sh |  4 +---
>   t/t3415-rebase-autosquash.sh  | 36 +++++++++++++++++++++++++++--------
>   t/t3900-i18n-commit.sh        |  4 ----
>   4 files changed, 30 insertions(+), 15 deletions(-)
> 
> diff --git a/sequencer.c b/sequencer.c
> index 763ccbbc45..e5602686d7 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -1756,6 +1756,7 @@ static int update_squash_messages(struct repository *r,
>   		strbuf_addf(&buf, _("This is the commit message #%d:"),
>   			    ++opts->current_fixup_count + 1);
>   		strbuf_addstr(&buf, "\n\n");
> +		strbuf_addf(&buf, "%c ", comment_line_char);
>   		strbuf_addstr(&buf, body);
>   	} else if (command == TODO_FIXUP) {
>   		strbuf_addf(&buf, "\n%c ", comment_line_char);
> diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
> index ae6e55ce79..57d178d431 100755
> --- a/t/t3404-rebase-interactive.sh
> +++ b/t/t3404-rebase-interactive.sh
> @@ -513,8 +513,6 @@ test_expect_success C_LOCALE_OUTPUT 'squash and fixup generate correct log messa
>   	cat >expect-squash-fixup <<-\EOF &&
>   	B
>   
> -	D
> -
>   	ONCE
>   	EOF
>   	git checkout -b squash-fixup E &&
> @@ -1325,7 +1323,7 @@ test_expect_success 'rebase -i commits that overwrite untracked files (squash)'
>   	test_cmp_rev HEAD F &&
>   	rm file6 &&
>   	git rebase --continue &&
> -	test $(git cat-file commit HEAD | sed -ne \$p) = I &&
> +	test $(git cat-file commit HEAD | sed -ne \$p) = F &&
>   	git reset --hard original-branch2
>   '
>   
> diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
> index 22d218698e..51c5a94aea 100755
> --- a/t/t3415-rebase-autosquash.sh
> +++ b/t/t3415-rebase-autosquash.sh
> @@ -59,7 +59,6 @@ test_auto_squash () {
>   	git add -u &&
>   	test_tick &&
>   	git commit -m "squash! first" &&
> -
>   	git tag $1 &&
>   	test_tick &&
>   	git rebase $2 -i HEAD^^^ &&
> @@ -67,7 +66,7 @@ test_auto_squash () {
>   	test_line_count = 3 actual &&
>   	git diff --exit-code $1 &&
>   	test 1 = "$(git cat-file blob HEAD^:file1)" &&
> -	test 2 = $(git cat-file commit HEAD^ | grep first | wc -l)
> +	test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
>   }
>   
>   test_expect_success 'auto squash (option)' '
> @@ -82,6 +81,27 @@ test_expect_success 'auto squash (config)' '
>   	test_must_fail test_auto_squash final-squash-config-false
>   '
>   
> +test_expect_success 'auto squash includes squash body but not squash directive' '
> +	git reset --hard base &&
> +	echo 1 >file1 &&
> +	git add -u &&
> +	test_tick &&
> +	git commit -m "squash! first
> +
> +Additional Body" &&
git commit --squash=first -m "Additional Body"
would avoid the multi line argument

> +	git tag squash-with-body &&
> +	test_tick &&
> +	git rebase --autosquash -i HEAD^^^ &&
> +	git log --oneline >actual &&
> +	git log --oneline --format="%s%n%b" >actual-full &&

git log --format=%B ?

> +	test_line_count = 3 actual &&
> +	git diff --exit-code squash-with-body &&
> +	test 1 = "$(git cat-file blob HEAD^:file1)" &&
> +	test 1 = $(git cat-file commit HEAD^ | grep first | wc -l) &&
> +	test 0 = $(grep squash actual-full | wc -l) &&

grep -v squash actual-full
is simpler I think

Best Wishes

Phillip

> +	test 1 = $(grep Additional actual-full | wc -l)
> +'
> +
>   test_expect_success 'misspelled auto squash' '
>   	git reset --hard base &&
>   	echo 1 >file1 &&
> @@ -114,7 +134,7 @@ test_expect_success 'auto squash that matches 2 commits' '
>   	test_line_count = 4 actual &&
>   	git diff --exit-code final-multisquash &&
>   	test 1 = "$(git cat-file blob HEAD^^:file1)" &&
> -	test 2 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
> +	test 1 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
>   	test 1 = $(git cat-file commit HEAD | grep first | wc -l)
>   '
>   
> @@ -152,7 +172,7 @@ test_expect_success 'auto squash that matches a sha1' '
>   	test_line_count = 3 actual &&
>   	git diff --exit-code final-shasquash &&
>   	test 1 = "$(git cat-file blob HEAD^:file1)" &&
> -	test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> +	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
>   '
>   
>   test_expect_success 'auto squash that matches longer sha1' '
> @@ -168,7 +188,7 @@ test_expect_success 'auto squash that matches longer sha1' '
>   	test_line_count = 3 actual &&
>   	git diff --exit-code final-longshasquash &&
>   	test 1 = "$(git cat-file blob HEAD^:file1)" &&
> -	test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> +	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
>   '
>   
>   test_auto_commit_flags () {
> @@ -192,7 +212,7 @@ test_expect_success 'use commit --fixup' '
>   '
>   
>   test_expect_success 'use commit --squash' '
> -	test_auto_commit_flags squash 2
> +	test_auto_commit_flags squash 1
>   '
>   
>   test_auto_fixup_fixup () {
> @@ -228,7 +248,7 @@ test_auto_fixup_fixup () {
>   		test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
>   	elif test "$1" = "squash"
>   	then
> -		test 3 = $(git cat-file commit HEAD^ | grep first | wc -l)
> +		test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
>   	else
>   		false
>   	fi
> @@ -268,7 +288,7 @@ test_expect_success C_LOCALE_OUTPUT 'autosquash with custom inst format' '
>   	test_line_count = 3 actual &&
>   	git diff --exit-code final-squash-instFmt &&
>   	test 1 = "$(git cat-file blob HEAD^:file1)" &&
> -	test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
> +	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
>   '
>   
>   test_expect_success 'autosquash with empty custom instructionFormat' '
> diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
> index d277a9f4b7..bfab245eb3 100755
> --- a/t/t3900-i18n-commit.sh
> +++ b/t/t3900-i18n-commit.sh
> @@ -226,10 +226,6 @@ test_commit_autosquash_multi_encoding () {
>   		git rev-list HEAD >actual &&
>   		test_line_count = 3 actual &&
>   		iconv -f $old -t UTF-8 "$TEST_DIRECTORY"/t3900/$msg >expect &&
> -		if test $flag = squash; then
> -			subject="$(head -1 expect)" &&
> -			printf "\nsquash! %s\n" "$subject" >>expect
> -		fi &&
>   		git cat-file commit HEAD^ >raw &&
>   		(sed "1,/^$/d" raw | iconv -f $new -t utf-8) >actual &&
>   		test_cmp expect actual
> 

^ permalink raw reply

* Re: git-am doesn't strip CRLF line endings when the mbox is base64-encoded
From: Junio C Hamano @ 2020-01-06 17:07 UTC (permalink / raw)
  To: George Dunlap; +Cc: git
In-Reply-To: <0c18df58-9d1d-550f-d69e-d6ffe6c01858@citrix.com>

George Dunlap <george.dunlap@citrix.com> writes:

> On 12/18/19 12:15 PM, George Dunlap wrote:
>> On 12/18/19 11:42 AM, George Dunlap wrote:
>>> Using git 2.24.0 from Debian testing.
>>>
>>> It seems that git-am will strip CRLF endings from mails before applying
>>> patches when the mail isn't encoded in any way.  It will also decode
>>> base64-encoded mails.  But it won't strip CRLF endings from
>>> base64-encoded mails.
>>>
>>> Attached are two mbox files for two different recent series.
>>> plainenc.am applies cleanly with `git am`, while base64enc.am doesn't.
>>>
>>> Poking around the man pages, it looks like part of the issue might be
>>> that the CRLF stripping is done in `git mailsplit`, before the base64
>>> encoding, rather than after.
>> 
>> Poking around -- it looks like the CRLF stripping would be better done
>> in `git mailinfo` after the decoding.
>
> Anyone want to take this up?  I mean, I could try to send a patch, but
> since I've never looked at the git source code before, I'm sure it would
> take me about 10x as much effort for me to do it as for someone already
> familiar with the codebase.

Even before writing a patch, somebody needs to come up with a
sensible design first.  --[no-]keep-cr is about "because transfer of
e-mail messages between MTAs and to the receiving MUA is defined in
terms of CRLF delimited lines per RFC, Git cannot tell if the CRLF
in the input was meant to be part of the patch (i.e. the diff is
describing a change between preimage and postimage of a file that
uses CRLF line endings) or they are cruft added during transit.  By
default we favor LF endings so we will strip, but we leave an option
to keep CRs at the end of lines".  

What you are asking for is quite different, isn't it?  "We know the
CRLF in the payload is from the original because they were protected
from getting munged during the transfer by being MIME-encased.
Please tell Git to preprocess that payload to convert CRLF to LF
before treating it as a patch".

So, if you are imagining to change the meaning of --[no-]keep-cr, I
do not think it will fly (that is why I said that we need a sensible
design before a patch).

And by stepping back a bit like so, and once we start viewing this
as "after receiving a piece of e-mail from MUA (where --[no-]keep-cr
may affect the outermost CRLF line endings) and unwrapping possible
MIME-encasing, we can optionally tell Git to pass the payload
further through a preprocess filter", we'd realize that this does
not have to be limited to just running dos2unix (you may want to run
iconv to fix encodings, for example), which would mean that the new
flag may not just want to be --strip-cr, which is too limiting, but
rather want to be --filter-message=<how> where <how> could be one of
the canned preprocess filter (among which your dos2unix may exist)
or an external script.

I am not saying that "--filter-message=<how>" must be the "sensible
design" I mentioned at the beginning of this message---the above is
to illustrate what kind of thought needs to go in before even the
first line of the patch gets written.

Thanks.

^ permalink raw reply

* Re: [PATCH 1/1] commit: make the sign-off trailer configurable
From: Junio C Hamano @ 2020-01-06 16:53 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Hans Jerry Illikainen, git
In-Reply-To: <71a718a7-2be7-391c-dc8f-0626a0a21aac@gmail.com>

Derrick Stolee <stolee@gmail.com> writes:

> My initial thought was that the sign-off was supposed to be a purposeful
> decision, but then I also realized that I never do anything in the Git
> codebase that I _can't_ put online under the GPL. (It may not make it
> upstream, but it will always be put online somewhere.)

Hmm...

Sorry, but I do not quite understand the flow of your logic here,
especially, how "but then I also realized" trumps "signing off a
patch is a conscious act---it would weaken the legal meaning if you
automate it", which was why we deliberately avoided adding this
configuration variable for the last 10+ years.

So, I dunno.

^ permalink raw reply

* [PATCH 1/1] sequencer: comment out the 'squash!' line
From: Michael Rappazzo via GitGitGadget @ 2020-01-06 16:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Michael Rappazzo
In-Reply-To: <pull.511.git.1578326648.gitgitgadget@gmail.com>

From: Michael Rappazzo <rappazzo@gmail.com>

When performing a squash commit, the commit comments are combined into a
single commit.  Since the subject line of the squash commit is used to
identify the squash-to target commit, it cannot offer any useful
contribution to the new commit message.  Therefore, the squash commit
subject line it commented out from the combined message (much like a
fixup commit's full comment).

The body of a squash commit may contain additional content to add to the
commit message, so this part of the squash commit message is retained.

Since this change what the expected post-rebase commit comment would look
like, related test expectations are adjusted to reflect the the new
expectation.  A new test is added for the new expectation.

Signed-off-by: Michael Rappazzo <rappazzo@gmail.com>
---
 sequencer.c                   |  1 +
 t/t3404-rebase-interactive.sh |  4 +---
 t/t3415-rebase-autosquash.sh  | 36 +++++++++++++++++++++++++++--------
 t/t3900-i18n-commit.sh        |  4 ----
 4 files changed, 30 insertions(+), 15 deletions(-)

diff --git a/sequencer.c b/sequencer.c
index 763ccbbc45..e5602686d7 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -1756,6 +1756,7 @@ static int update_squash_messages(struct repository *r,
 		strbuf_addf(&buf, _("This is the commit message #%d:"),
 			    ++opts->current_fixup_count + 1);
 		strbuf_addstr(&buf, "\n\n");
+		strbuf_addf(&buf, "%c ", comment_line_char);
 		strbuf_addstr(&buf, body);
 	} else if (command == TODO_FIXUP) {
 		strbuf_addf(&buf, "\n%c ", comment_line_char);
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index ae6e55ce79..57d178d431 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -513,8 +513,6 @@ test_expect_success C_LOCALE_OUTPUT 'squash and fixup generate correct log messa
 	cat >expect-squash-fixup <<-\EOF &&
 	B
 
-	D
-
 	ONCE
 	EOF
 	git checkout -b squash-fixup E &&
@@ -1325,7 +1323,7 @@ test_expect_success 'rebase -i commits that overwrite untracked files (squash)'
 	test_cmp_rev HEAD F &&
 	rm file6 &&
 	git rebase --continue &&
-	test $(git cat-file commit HEAD | sed -ne \$p) = I &&
+	test $(git cat-file commit HEAD | sed -ne \$p) = F &&
 	git reset --hard original-branch2
 '
 
diff --git a/t/t3415-rebase-autosquash.sh b/t/t3415-rebase-autosquash.sh
index 22d218698e..51c5a94aea 100755
--- a/t/t3415-rebase-autosquash.sh
+++ b/t/t3415-rebase-autosquash.sh
@@ -59,7 +59,6 @@ test_auto_squash () {
 	git add -u &&
 	test_tick &&
 	git commit -m "squash! first" &&
-
 	git tag $1 &&
 	test_tick &&
 	git rebase $2 -i HEAD^^^ &&
@@ -67,7 +66,7 @@ test_auto_squash () {
 	test_line_count = 3 actual &&
 	git diff --exit-code $1 &&
 	test 1 = "$(git cat-file blob HEAD^:file1)" &&
-	test 2 = $(git cat-file commit HEAD^ | grep first | wc -l)
+	test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
 }
 
 test_expect_success 'auto squash (option)' '
@@ -82,6 +81,27 @@ test_expect_success 'auto squash (config)' '
 	test_must_fail test_auto_squash final-squash-config-false
 '
 
+test_expect_success 'auto squash includes squash body but not squash directive' '
+	git reset --hard base &&
+	echo 1 >file1 &&
+	git add -u &&
+	test_tick &&
+	git commit -m "squash! first
+
+Additional Body" &&
+	git tag squash-with-body &&
+	test_tick &&
+	git rebase --autosquash -i HEAD^^^ &&
+	git log --oneline >actual &&
+	git log --oneline --format="%s%n%b" >actual-full &&
+	test_line_count = 3 actual &&
+	git diff --exit-code squash-with-body &&
+	test 1 = "$(git cat-file blob HEAD^:file1)" &&
+	test 1 = $(git cat-file commit HEAD^ | grep first | wc -l) &&
+	test 0 = $(grep squash actual-full | wc -l) &&
+	test 1 = $(grep Additional actual-full | wc -l)
+'
+
 test_expect_success 'misspelled auto squash' '
 	git reset --hard base &&
 	echo 1 >file1 &&
@@ -114,7 +134,7 @@ test_expect_success 'auto squash that matches 2 commits' '
 	test_line_count = 4 actual &&
 	git diff --exit-code final-multisquash &&
 	test 1 = "$(git cat-file blob HEAD^^:file1)" &&
-	test 2 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
+	test 1 = $(git cat-file commit HEAD^^ | grep first | wc -l) &&
 	test 1 = $(git cat-file commit HEAD | grep first | wc -l)
 '
 
@@ -152,7 +172,7 @@ test_expect_success 'auto squash that matches a sha1' '
 	test_line_count = 3 actual &&
 	git diff --exit-code final-shasquash &&
 	test 1 = "$(git cat-file blob HEAD^:file1)" &&
-	test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
+	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
 '
 
 test_expect_success 'auto squash that matches longer sha1' '
@@ -168,7 +188,7 @@ test_expect_success 'auto squash that matches longer sha1' '
 	test_line_count = 3 actual &&
 	git diff --exit-code final-longshasquash &&
 	test 1 = "$(git cat-file blob HEAD^:file1)" &&
-	test 1 = $(git cat-file commit HEAD^ | grep squash | wc -l)
+	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
 '
 
 test_auto_commit_flags () {
@@ -192,7 +212,7 @@ test_expect_success 'use commit --fixup' '
 '
 
 test_expect_success 'use commit --squash' '
-	test_auto_commit_flags squash 2
+	test_auto_commit_flags squash 1
 '
 
 test_auto_fixup_fixup () {
@@ -228,7 +248,7 @@ test_auto_fixup_fixup () {
 		test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
 	elif test "$1" = "squash"
 	then
-		test 3 = $(git cat-file commit HEAD^ | grep first | wc -l)
+		test 1 = $(git cat-file commit HEAD^ | grep first | wc -l)
 	else
 		false
 	fi
@@ -268,7 +288,7 @@ test_expect_success C_LOCALE_OUTPUT 'autosquash with custom inst format' '
 	test_line_count = 3 actual &&
 	git diff --exit-code final-squash-instFmt &&
 	test 1 = "$(git cat-file blob HEAD^:file1)" &&
-	test 2 = $(git cat-file commit HEAD^ | grep squash | wc -l)
+	test 0 = $(git cat-file commit HEAD^ | grep squash | wc -l)
 '
 
 test_expect_success 'autosquash with empty custom instructionFormat' '
diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh
index d277a9f4b7..bfab245eb3 100755
--- a/t/t3900-i18n-commit.sh
+++ b/t/t3900-i18n-commit.sh
@@ -226,10 +226,6 @@ test_commit_autosquash_multi_encoding () {
 		git rev-list HEAD >actual &&
 		test_line_count = 3 actual &&
 		iconv -f $old -t UTF-8 "$TEST_DIRECTORY"/t3900/$msg >expect &&
-		if test $flag = squash; then
-			subject="$(head -1 expect)" &&
-			printf "\nsquash! %s\n" "$subject" >>expect
-		fi &&
 		git cat-file commit HEAD^ >raw &&
 		(sed "1,/^$/d" raw | iconv -f $new -t utf-8) >actual &&
 		test_cmp expect actual
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH 0/1] sequencer: comment out the 'squash!' line
From: Michael Rappazzo via GitGitGadget @ 2020-01-06 16:04 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano

When performing a squash commit, the commit comments are combined into a
single commit. Since the subject line of the squash commit is used to
identify the squash-to target commit, it cannot offer any useful
contribution to the new commit message. Therefore, the squash commit subject
line it commented out from the combined message (much like a fixup commit's
full comment).

The body of a squash commit may contain additional content to add to the
commit message, so this part of the squash commit message is retained.

Since this change what the expected post-rebase commit comment would look
like, related test expectations are adjusted to reflect the the new
expectation. A new test is added for the new expectation.

Signed-off-by: Michael Rappazzo rappazzo@gmail.com [rappazzo@gmail.com]

Michael Rappazzo (1):
  sequencer: comment out the 'squash!' line

 sequencer.c                   |  1 +
 t/t3404-rebase-interactive.sh |  4 +---
 t/t3415-rebase-autosquash.sh  | 36 +++++++++++++++++++++++++++--------
 t/t3900-i18n-commit.sh        |  4 ----
 4 files changed, 30 insertions(+), 15 deletions(-)


base-commit: 8679ef24ed64018bb62170c43ce73e0261c0600a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-511%2Frappazzo%2Fcomment-squash-subject-line-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-511/rappazzo/comment-squash-subject-line-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/511
-- 
gitgitgadget

^ permalink raw reply

* [PATCH 1/1] doc: submodule: fix typo for command absorbgitdirs
From: elonderin via GitGitGadget @ 2020-01-06 14:50 UTC (permalink / raw)
  To: git; +Cc: thomas menzel, Junio C Hamano, elonderin
In-Reply-To: <pull.688.git.git.1578322213.gitgitgadget@gmail.com>

From: elonderin <8225602+elonderin@users.noreply.github.com>

The sentence wants to talk about the superproject's possesive, not plural form.

Signed-off-by: thomas menzel dev@tomsit.de
---
 Documentation/git-submodule.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index 22425cbc76..5232407f68 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -248,7 +248,7 @@ registered submodules, and sync any nested submodules within.
 
 absorbgitdirs::
 	If a git directory of a submodule is inside the submodule,
-	move the git directory of the submodule into its superprojects
+	move the git directory of the submodule into its superproject's
 	`$GIT_DIR/modules` path and then connect the git directory and
 	its working directory by setting the `core.worktree` and adding
 	a .git file pointing to the git directory embedded in the
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH 0/1] doc: fixed a typo in git-submodule.txt
From: thomas menzel via GitGitGadget @ 2020-01-06 14:50 UTC (permalink / raw)
  To: git; +Cc: thomas menzel, Junio C Hamano

fixed a typo in the doc of submodule

elonderin (1):
  doc: submodule: fix typo for command absorbgitdirs

 Documentation/git-submodule.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


base-commit: 8679ef24ed64018bb62170c43ce73e0261c0600a
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-688%2Felonderin%2Fpatch-1-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-688/elonderin/patch-1-v1
Pull-Request: https://github.com/git/git/pull/688
-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH 1/1] commit: make the sign-off trailer configurable
From: Derrick Stolee @ 2020-01-06 13:38 UTC (permalink / raw)
  To: Hans Jerry Illikainen, git
In-Reply-To: <20200105174127.9278-2-hji@dyntopia.com>

On 1/5/2020 12:41 PM, Hans Jerry Illikainen wrote:
> The commit builtin did not previously have a configuration option to
> enable the 'Signed-off-by' trailer by default.
> 
> For some use-cases (namely, when the user doesn't always have the right
> to contribute patches to a project) it makes sense to make it a
> conscientious decision to add the signoff trailer.  However, others'
> might always have the right to ship patches -- in which case it makes
> sense to have an option to add the trailer by default for projects that
> require it.

My initial thought was that the sign-off was supposed to be a purposeful
decision, but then I also realized that I never do anything in the Git
codebase that I _can't_ put online under the GPL. (It may not make it
upstream, but it will always be put online somewhere.)

> This patch introduces a commit.signOff configuration option that
> determine whether the trailer should be added for commits.  It can be
> overridden with the --(no-)signoff command-line option.

With that in mind, I think this is a valuable feature.
 
> Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>

Did you generate this with your config option? ;)

> +commit.signOff::
> +	A boolean to specify whether commits should enable the
> +	`-s/--signoff` option by default.  *Note:* Adding the
> +	Signed-off-by: line to a commit message should be a conscious
> +	act and means that you certify you have the rights to submit the
> +	work under the same open source license.  Please see the
> +	'SubmittingPatches' document for further discussion.
> +

I wonder about the language of "should be a conscious act" here. It's
as if you are trying to convince someone to not use this feature. Perhaps
switch it to "is a deliberate act" and add something such as "Enable this
value only in repos where you are the only user and always have these
rights."

The multi-user scenario may be worth clarifying explicitly here. If there
is any chance that another user will join the machine and use that same
repo, then they would have a different user.name and user.email in their
global config (probably) but this as a local setting would provide their
sign-off.

> diff --git a/builtin/commit.c b/builtin/commit.c
> index c70ad01cc9..497e29c58c 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -1474,6 +1474,10 @@ static int git_commit_config(const char *k, const char *v, void *cb)
>  		sign_commit = git_config_bool(k, v) ? "" : NULL;
>  		return 0;
>  	}
> +	if (!strcmp(k, "commit.signoff")) {
> +		signoff = git_config_bool(k, v);
> +		return 0;
> +	}

Since we are directly modifying the same global used by the --[no-]signoff
option, I verified that the config options are checked before the arguments
are parsed. Thus, --no-signoff will override commit.signOff=true...

> +test_expect_success 'commit.signOff=true' '
> +	test_config commit.signOff true &&
> +	echo 1 >>positive &&
> +	git add positive &&
> +	git commit -m "thank you" &&
> +	git cat-file commit HEAD >commit.msg &&
> +	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
> +	git var GIT_COMMITTER_IDENT >ident &&
> +	sed -e "s/>.*/>/" ident >expected &&
> +	test_cmp expected actual
> +'
> +
> +test_expect_success 'commit.signOff=true and --no-signoff' '
> +	test_config commit.signOff true &&
> +	echo 2 >>positive &&
> +	git add positive &&
> +	git commit --no-signoff -m "thank you" &&
> +	git cat-file commit HEAD >commit.msg &&
> +	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
> +	git var GIT_COMMITTER_IDENT >ident &&
> +	sed -e "s/>.*/>/" ident >expected &&
> +	! test_cmp expected actual
> +'

...which you test here, too. Excellent.

> +test_expect_success 'commit.signOff=false and --signoff' '
> +	test_config commit.signOff false &&
> +	echo 1 >>positive &&
> +	git add positive &&
> +	git commit --signoff -m "thank you" &&

Perhaps it is worth adding an explicit "-c commit.signOff=false" here?

> +	git cat-file commit HEAD >commit.msg &&
> +	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
> +	git var GIT_COMMITTER_IDENT >ident &&
> +	sed -e "s/>.*/>/" ident >expected &&
> +	test_cmp expected actual
> +'
> +

I wonder if the boilerplate for these tests could be simplified or
shared across the tests?

For example: could we not just use test_i18ngrep to see if commit.msg
contains (or does not contain) the string "Signed-off-by"?

I believe this patch delivers the stated feature well. Hopefully others
can add commentary on its usefulness or possible issues in using it.

Thanks,
-Stolee

^ permalink raw reply

* Re: git-am doesn't strip CRLF line endings when the mbox is base64-encoded
From: George Dunlap @ 2020-01-06 11:58 UTC (permalink / raw)
  To: git
In-Reply-To: <dece7350-7b58-bf19-9fdf-4ccf8df268fb@citrix.com>

On 12/18/19 12:15 PM, George Dunlap wrote:
> On 12/18/19 11:42 AM, George Dunlap wrote:
>> Using git 2.24.0 from Debian testing.
>>
>> It seems that git-am will strip CRLF endings from mails before applying
>> patches when the mail isn't encoded in any way.  It will also decode
>> base64-encoded mails.  But it won't strip CRLF endings from
>> base64-encoded mails.
>>
>> Attached are two mbox files for two different recent series.
>> plainenc.am applies cleanly with `git am`, while base64enc.am doesn't.
>>
>> Poking around the man pages, it looks like part of the issue might be
>> that the CRLF stripping is done in `git mailsplit`, before the base64
>> encoding, rather than after.
> 
> Poking around -- it looks like the CRLF stripping would be better done
> in `git mailinfo` after the decoding.

Anyone want to take this up?  I mean, I could try to send a patch, but
since I've never looked at the git source code before, I'm sure it would
take me about 10x as much effort for me to do it as for someone already
familiar with the codebase.

(And I've already done that work for stackgit:
https://github.com/ctmarinas/stgit/pull/46)

 -George

^ permalink raw reply

* Re: [PATCH 1/1] worktree: delete branches auto-created by 'worktree add'
From: Eric Sunshine @ 2020-01-06  4:20 UTC (permalink / raw)
  To: Pratyush Yadav; +Cc: Git List
In-Reply-To: <CAPig+cRL5w7azdALeBKKisTZwjgU6QhBqJRzQqDENjYiaTT0oA@mail.gmail.com>

On Fri, Dec 27, 2019 at 6:05 AM Eric Sunshine <sunshine@sunshineco.com> wrote:
> On Sat, Dec 14, 2019 at 11:16 AM Pratyush Yadav <me@yadavpratyush.com> wrote:
> > When no branch name is supplied to 'worktree add', it creates a new
> > branch based on the name of the directory the new worktree is located
> > in. But when the worktree is later removed, that created branch is left
> > over.
>
> This is describing the existing (intentional) behavior but doesn't
> explain why this might be annoying or problematic. To help sell the
> patch, it might make sense to say something about how the behavior can
> trip up newcomers to git-worktree, leaving them to wonder why they are
> accumulating so many branches that they weren't aware they created. A
> comment about why you think "git worktree add -d foo" is not a viable
> way to side-step the creation of unwanted branches might also be
> worthwhile.

As an alternative to this patch, would the simpler approach of
improving git-worktree documentation to do a better job of pointing
people at -d/--detach as a way to side-step unwanted branch creation
make sense? That is, at minimum, enhance the "Description" section to
prominently talk about throwaway worktrees (created with -d/--detach),
and add an example to the "Examples" section (perhaps as the first
example) showing creation/use/deletion of a throwaway worktree.

Some points in favor of just updating the documentation to address
this issue (rather than implementing the new behavior suggested by
this patch) include:

* far simpler; no code to implement or debug

* no (surprising) behavior changes

* "git worktree add -d foo" is about as easy to type and remember as
  "git worktree add foo"

Of lesser importance, it might make sense, as a followup, to add a
configuration which changes the default behavior to detach instead of
auto-creating a branch. I wonder if this could be piggybacked on the
existing "worktree.guessremote" configuration. Or rather,
retire/deprecate that configuration and add a new one which affects
DWIM'ing behavior such that it becomes multi-state. Some possible
values for the new configuration: "auto" (or "dwim" or whatever),
"guessremote", "detach". (I haven't thought this through thoroughly,
so there might be holes in my suggestion.)

There's at least one point not in favor of merely updating the
documentation to promote -d/--detach more heavily, and that is that
(presumably) the concept of detached HEAD is perceived as an advanced
topic, so it may not be suitable for the newcomer or casual user.

^ permalink raw reply

* Re: [RFC PATCH] bisect run: allow inverting meaning of exit code
From: Junio C Hamano @ 2020-01-06  1:16 UTC (permalink / raw)
  To: Stephen Oberholtzer; +Cc: git
In-Reply-To: <CAD_xR9fUxDTvwmAsfH-6=buRP+UmwBHhQJSV+T3paUOy-S1CGw@mail.gmail.com>

Stephen Oberholtzer <stevie@qrpff.net> writes:

>> In any case, I wonder why something along the line of the attached
>> patch is not sufficient and it needs this much code.
>> ...
> Unfortunately, that doesn't behave properly.
> As far as 'git bisect run' is concerned, there are four distinct sets
> ...
> What needs to happen is that status code lists for "test passed" (#1)
> and "test failed" (#2) are swapped; even when bisecting a fix, #3
> (untestable) and #4 (malfunction) remain unchanged.  Your patch remaps
> case #4 to case #1.

Yeah, I know.  I didn't mean to give you a perfect solution and that
was why I said "along the line of...".  I know I ignored the 128 and
above, as I usually trust that our contributors are competent enough
to be able to fill in the missing details given an outline.

The key takeaway I wanted you to notice was that a single case
statement that maps the exit code external command would give us
would look sufficient, without any of the {SUCCESS,FAIL}_TERM magic
you had in your version, which indicates that there is more than the
simple "using a run script to find where a bug was fixed can be done
by swapping exit code" going on.  And it is quite unclear why that
is needed either from the patch or the text that accompanied the
patch.

Thanks.





^ permalink raw reply

* Re: [PATCH 1/1] branch: advise the user to checkout a different branch before deleting
From: Heba Waly @ 2020-01-06  0:42 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Heba Waly via GitGitGadget, Git List, Junio C Hamano
In-Reply-To: <CAPig+cS39vcy6yT3Dg2HfGVCyg2U+7t7Xj85ayM7LaAk3zTjrg@mail.gmail.com>

On Thu, Jan 2, 2020 at 9:18 PM Eric Sunshine <sunshine@sunshineco.com> wrote:
>
> On Wed, Jan 1, 2020 at 9:50 PM Heba Waly via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> > Display a hint to the user when attempting to delete a checked out
> > branch saying "Checkout another branch before deleting this one:
> > git checkout <branch_name>".
> >
> > Currently the user gets an error message saying: "error: Cannot delete
> > branch <branch_name> checked out at <path>". The hint will be displayed
> > after the error message.
> >
> > Signed-off-by: Heba Waly <heba.waly@gmail.com>
> > ---
> > diff --git a/builtin/branch.c b/builtin/branch.c
> > @@ -240,6 +240,8 @@ static int delete_branches(int argc, const char **argv, int force, int kinds,
> >                                 error(_("Cannot delete branch '%s' "
> >                                         "checked out at '%s'"),
> >                                       bname.buf, wt->path);
> > +                               advise(_("Checkout another branch before deleting this "
> > +                                                "one: git checkout <branch_name>"));
>
> s/another/a different/ would make the meaning clearer.
>
Ok.

> Let's try to avoid underscores in placeholders. <branch-name> would be
> better, however, git-checkout documentation just calls this <branch>,
> so that's probably a good choice.
>
Yes.

> However, these days, I think we're promoting git-switch rather than
> git-checkout, so perhaps this advice should follow suit.
>

I didn't know that, will change it.

> Finally, is this advice sufficient for newcomers when the branch the
> user is trying to delete is in fact checked out in a worktree other
> than the worktree in which the git-branch command is being invoked?
> That is:
>
>     $ pwd
>     /home/me/foo
>     $ git branch -D bip
>     Cannot delete  branch 'bip' checked out at '/home/me/bar'
>     hint: Checkout another branch before deleting this one:
>     hint: git checkout <branch>
>     $ git checkout master # user follows advice
>     $ git branch -D bip
>     Cannot delete  branch 'bip' checked out at '/home/me/foo'
>     hint: Checkout another branch before deleting this one:
>     hint: git checkout <branch>
>     $
>
> And the user is left scratching his or her head wondering why
> git-branch is still showing the error despite following the
> instructions in the hint.
>

Understood.

> > diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
> > @@ -808,7 +808,8 @@ test_expect_success 'test deleting branch without config' '
> >  test_expect_success 'deleting currently checked out branch fails' '
> >         git worktree add -b my7 my7 &&
> >         test_must_fail git -C my7 branch -d my7 &&
> > -       test_must_fail git branch -d my7 &&
> > +       test_must_fail git branch -d my7 >actual.out 2>actual.err &&
> > +       test_i18ngrep "hint: Checkout another branch" actual.err &&
>
> Why does this capture standard output into 'actual.out' if that file
> is never consulted?
>

Correct, I missed this one.

> >         rm -r my7 &&
> >         git worktree prune
> >  '

Thanks Eric, will submit an updated version soon.

Heba

^ permalink raw reply

* Re: [PATCH 0/5] refactor gpg-interface and add gpg verification for clones
From: Junio C Hamano @ 2020-01-05 23:11 UTC (permalink / raw)
  To: Hans Jerry Illikainen; +Cc: git
In-Reply-To: <20200105135616.19102-1-hji@dyntopia.com>

Hans Jerry Illikainen <hji@dyntopia.com> writes:

> And finally, signature verification is added to the clone builtin.  It
> obeys --(no-)verify-signatures, clone.verifySignatures and
> gpg.verifySignatures (in decreasing order of significance).

It is clear for a merge or a pull what it means to verify
signature---you trust your local history, and you are willing to
merge in a commit only when it has a trusted signature (which
automatically means that you trust the hash function and also the
signer did some reasonable vetting of the history behind the tip
commit, or you never check out your intermediate state, depending
on your threat model).

I am not sure what it should mean to verify signature on clone.  I'd
assume that our threat model and verification policy are consistent
with what we use for a merge/pull, in that we trust all history
behind a commit that has a trusted signature, so it is clear that
you would want the tip commit of the default branch (or if you are
naming a single branch to clone, then the tip of that branch) to
carry a trusted signature.  But what about the commits that are
reachable from other branches and tags that are *not* contained
in the branch that is initially checked out?

Later in the proposed log message of 5/5 you allude to risk of
merely checking out a potentially untrustworthy contents to the
working tree.  This is far stricter than the usual threat model we
tend to think about as the developers of source code management
system, where checking out is perfectly OK but running "make" or its
equivalent is the first contact between the victim's system with
malicious contents.

Verifying the tip of the default/sole branch upon cloning before the
tree of the commit is checked out certainly would cover that single
case (i.e. the initial checkout after cloning), but I am not sure if
it is the best way, and I am reasonably certain it is insufficient
against your threat model.  After such a clone is made, when the
user checks out another branch obtained from the "origin" remote,
there is no mechanism that protects the user from the same risk you
just covered with the new signature verification mechanism upon
cloning, without validating the tip of that other branch, somewhere
between the time the clone is made and that other branch gets
checked out.

It almost makes me suspect that what you are trying to do with the
"verification upon cloning" may be much better done as a "verify the
tree for trustworthyness before checking it out to the working tree"
mechanism, where the trustworthyness of a tree-ish object may be
defined (and possibly made customizable by the policy of the project
the user is working on) like so:

 - A tree is trustworthy if it is the tree of a trustworthy commit.

 - A commit is trustworthy if

   . it carries a trusted signature, or
   . it is pointed by a tag that carries a trusted signature, or
   . it is reachable from a trustworthy commit.

Now, it is prohibitively expensive to compute the trusttworthiness
of a tree, defined like the above, when checking it out, UNLESS you
force each and every commit to carry a trusted signature, which is
not necessarily practical in the real world.

Another approach to ensure that any and all checkout would work only
on a trustworthy tree might be to make sure that tips of *all* the
refs are trustworthy commits immediately after cloning (instead of
verifying only the branch you are going to checkout, which is what
your patch does IIUC).  That way, any subsequent checkout in the
repository would either be checking out a commit that was

 - originally cloned from the remote, and is reachable from some ref
   that was validated back when the repository was cloned, or

 - created locally in this repository, which we trust, or

 - fetched from somewhere else, and is reachable from the commit
   that was validated back when "git pull" validated what was about
   to be integrated into the history of *this* repository.

Hmm?

^ permalink raw reply

* [PATCH 1/1] commit: make the sign-off trailer configurable
From: Hans Jerry Illikainen @ 2020-01-05 17:41 UTC (permalink / raw)
  To: git; +Cc: Hans Jerry Illikainen
In-Reply-To: <20200105174127.9278-1-hji@dyntopia.com>

The commit builtin did not previously have a configuration option to
enable the 'Signed-off-by' trailer by default.

For some use-cases (namely, when the user doesn't always have the right
to contribute patches to a project) it makes sense to make it a
conscientious decision to add the signoff trailer.  However, others'
might always have the right to ship patches -- in which case it makes
sense to have an option to add the trailer by default for projects that
require it.

This patch introduces a commit.signOff configuration option that
determine whether the trailer should be added for commits.  It can be
overridden with the --(no-)signoff command-line option.

Signed-off-by: Hans Jerry Illikainen <hji@dyntopia.com>
---
 Documentation/config/commit.txt |  8 ++++++++
 Documentation/git-commit.txt    |  4 ++++
 builtin/commit.c                |  4 ++++
 t/t7502-commit-porcelain.sh     | 36 +++++++++++++++++++++++++++++++++
 4 files changed, 52 insertions(+)

diff --git a/Documentation/config/commit.txt b/Documentation/config/commit.txt
index 2c95573930..6ebfe384ac 100644
--- a/Documentation/config/commit.txt
+++ b/Documentation/config/commit.txt
@@ -15,6 +15,14 @@ commit.gpgSign::
 	convenient to use an agent to avoid typing your GPG passphrase
 	several times.
 
+commit.signOff::
+	A boolean to specify whether commits should enable the
+	`-s/--signoff` option by default.  *Note:* Adding the
+	Signed-off-by: line to a commit message should be a conscious
+	act and means that you certify you have the rights to submit the
+	work under the same open source license.  Please see the
+	'SubmittingPatches' document for further discussion.
+
 commit.status::
 	A boolean to enable/disable inclusion of status information in the
 	commit message template when using an editor to prepare the commit
diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index ced5a9beab..61a362770d 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -165,12 +165,16 @@ The `-m` option is mutually exclusive with `-c`, `-C`, and `-F`.
 
 -s::
 --signoff::
+--no-signoff::
 	Add Signed-off-by line by the committer at the end of the commit
 	log message.  The meaning of a signoff depends on the project,
 	but it typically certifies that committer has
 	the rights to submit this work under the same license and
 	agrees to a Developer Certificate of Origin
 	(see http://developercertificate.org/ for more information).
+	This option can be enabled by default with the `commit.signOff`
+	configuration option, in which case it can be disabled
+	temporarily with `--no-signoff`.
 
 -n::
 --no-verify::
diff --git a/builtin/commit.c b/builtin/commit.c
index c70ad01cc9..497e29c58c 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1474,6 +1474,10 @@ static int git_commit_config(const char *k, const char *v, void *cb)
 		sign_commit = git_config_bool(k, v) ? "" : NULL;
 		return 0;
 	}
+	if (!strcmp(k, "commit.signoff")) {
+		signoff = git_config_bool(k, v);
+		return 0;
+	}
 	if (!strcmp(k, "commit.verbose")) {
 		int is_bool;
 		config_commit_verbose = git_config_bool_or_int(k, v, &is_bool);
diff --git a/t/t7502-commit-porcelain.sh b/t/t7502-commit-porcelain.sh
index 14c92e4c25..7510325698 100755
--- a/t/t7502-commit-porcelain.sh
+++ b/t/t7502-commit-porcelain.sh
@@ -151,6 +151,42 @@ test_expect_success 'sign off' '
 
 '
 
+test_expect_success 'commit.signOff=true' '
+	test_config commit.signOff true &&
+	echo 1 >>positive &&
+	git add positive &&
+	git commit -m "thank you" &&
+	git cat-file commit HEAD >commit.msg &&
+	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
+	git var GIT_COMMITTER_IDENT >ident &&
+	sed -e "s/>.*/>/" ident >expected &&
+	test_cmp expected actual
+'
+
+test_expect_success 'commit.signOff=true and --no-signoff' '
+	test_config commit.signOff true &&
+	echo 2 >>positive &&
+	git add positive &&
+	git commit --no-signoff -m "thank you" &&
+	git cat-file commit HEAD >commit.msg &&
+	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
+	git var GIT_COMMITTER_IDENT >ident &&
+	sed -e "s/>.*/>/" ident >expected &&
+	! test_cmp expected actual
+'
+
+test_expect_success 'commit.signOff=false and --signoff' '
+	test_config commit.signOff false &&
+	echo 1 >>positive &&
+	git add positive &&
+	git commit --signoff -m "thank you" &&
+	git cat-file commit HEAD >commit.msg &&
+	sed -ne "s/Signed-off-by: //p" commit.msg >actual &&
+	git var GIT_COMMITTER_IDENT >ident &&
+	sed -e "s/>.*/>/" ident >expected &&
+	test_cmp expected actual
+'
+
 test_expect_success 'multiple -m' '
 
 	>negative &&
-- 
2.25.0.rc1.298.g45d5f025e1


^ permalink raw reply related

* [PATCH 0/1] commit: make the sign-off trailer configurable
From: Hans Jerry Illikainen @ 2020-01-05 17:41 UTC (permalink / raw)
  To: git; +Cc: Hans Jerry Illikainen

This patch that adds commit.signOff as a configuration option.  The
config/commit.txt documentation where the significance of the
signed-off-by trailer is highlighted is copy-pasted from
config/format.txt.

Even with the note in config/{commit,format}.txt, I still think it's
worthwhile to have the trailer as a configuration option for
contributors that always have the rights to send patches.

Hans Jerry Illikainen (1):
  commit: make the sign-off trailer configurable

 Documentation/config/commit.txt |  8 ++++++++
 Documentation/git-commit.txt    |  4 ++++
 builtin/commit.c                |  4 ++++
 t/t7502-commit-porcelain.sh     | 36 +++++++++++++++++++++++++++++++++
 4 files changed, 52 insertions(+)

--
2.25.0.rc1.298.g45d5f025e1

^ permalink raw reply

* Re: git repo on NTFS mount
From: brian m. carlson @ 2020-01-05 16:06 UTC (permalink / raw)
  To: Lukas Schubert; +Cc: git
In-Reply-To: <op.0dv9xiwp1jamlk@localhost>

[-- Attachment #1: Type: text/plain, Size: 2671 bytes --]

On 2020-01-05 at 00:49:56, Lukas Schubert wrote:
> Hi there,
> 
> for historical reasons, I keep the data that doesn't belong to any
> specific user on a harddisk that is formatted as NTFS. Some git
> repositories are there, too. Some time ago, I upgraded from Linux Mint 17
> to 19.2. That upgrade brought a change in data partition's mount options.
> Old:
> UUID=20D0WHATEVER	/mnt/DATA	ntfs	defaults,nls=utf8,umask=000,uid=1000,windows_names
> New: UUID=20D0WHATEVER	/DATA		ntfs	defaults,umask=007,gid=46
> 
> Now I want to initialize a new git repository
> user@xxxx:/DATA/Projects/LearnPython/wxGlade$ git init
> error: chmod on /DATA/Projects/LearnPython/wxGlade/.git/config.lock
> failed: Operation not permitted
> fatal: could not set 'core.filemode' to 'false'

The way Git writes a new config file is that it writes a file to the
side (as a lock file) and then renames it in place.  Because it works
this way, Git tries to call chmod(2) to set the permissions on that
file and when it fails, Git aborts.

You'll need to figure out how to make the chmod call not fail on that
file system, even if it does nothing.  Changing the mount operations
would be a good way to try fixing that.

> Since there already are repos on that drive, the initialization must have
> worked before. But in
> https://www.linuxquestions.org/questions/showthread.php?p=6074034#post6074034
> I've been told that using git in linux with repositories on NTFS is a
> recipe for disaster.
> 
> Given I change the mount options to what worked before the update, can I
> escape certain doom if I stick to a certain subset of git commands? Or is
> the cathastrophe inevitalbe due to subtle errors that culminate but stay
> hidden until it's too late?

NTFS isn't guaranteed to be broken on Linux, but neither do we test it.
It will probably work if you can make it do standard POSIX things,
although it will of course be less capable than a typical Linux file
system.

Assuming that Linux doesn't lie about file system operations on NTFS,
then Git will either work, or it will fail loudly.  It won't silently
eat your data, and that's all we can really promise in this case.  I
certainly encourage you to try fixing things and let us know how and if
it does work, though.

As a side note, if you're looking for a file system that can be shared
between Linux, Windows, and macOS, you may want to look into UDF, which
is used for DVDs, but can also be used for hard drives.  It supports
POSIX permissions and is therefore more functional on a typical Unix
system.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 868 bytes --]

^ permalink raw reply

* Re: push --follow-tags not pushing tags ?
From: Wes Hurd @ 2020-01-05 15:32 UTC (permalink / raw)
  To: Matt Rogers; +Cc: git
In-Reply-To: <CAOjrSZv-D6qsyd7mjAEAJ=6YVUTUupYLNdUkDTjbGkA_9eOfUA@mail.gmail.com>

Ah ok, that's it

thanks Matt

I'll update the stack overflow to clarify


On Sat, Jan 4, 2020 at 11:24 PM Matt Rogers <mattr94@gmail.com> wrote:
>
> I believe --follow-tags only pushes annotated tags.  Does the issue
> still persist when you use something like:
>
> `git tag -a -m "I'm an annotation"`

^ permalink raw reply

* Re: ERANGE strikes again on my Windows build; RFH
From: Michal Suchánek @ 2020-01-05 15:27 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Johannes Sixt, Git Mailing List
In-Reply-To: <20191230184948.GC57251@google.com>

On Mon, Dec 30, 2019 at 10:49:48AM -0800, Jonathan Nieder wrote:
> Johannes Sixt wrote:
> > Am 30.12.19 um 19:06 schrieb Jonathan Nieder:
> 
> >>                                                                    when
> >> errno is meaningful for a function for a given return value, the usual
> >> convention is
> >>
> >>  (1) it *always* sets errno for errors, not conditionally
> >
> > You seem to understand that errno isn't set somewhere where it should be
> > set.
> 
> On the contrary: this caller is using errno as an error *indicator*
> instead of a way of *distinguishing* between errors (or to put it
> another way, this caller is treating `errno == 0` as a meaningful
> condition).  This means the calling code is buggy.

That works completely fine if the code in question also sets errno to 0
in case there is some other leftover value from a previous library call.

Thanks

Michal

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox