Git development
 help / color / mirror / Atom feed
* [PATCH v2 1/2] sha1-name: replace unsigned int with option struct
From: Jonathan Tan @ 2020-08-29  1:02 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, jrnieder
In-Reply-To: <cover.1598662525.git.jonathantanmy@google.com>

In preparation for a future patch adding a boolean parameter to
repo_interpret_branch_name(), which might be easily confused with an
existing unsigned int parameter, refactor repo_interpret_branch_name()
to take an option struct instead of the unsigned int parameter.

The static function interpret_branch_mark() is also updated to take the
option struct in preparation for that future patch, since it will also
make use of the to-be-introduced boolean parameter.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
---
 cache.h     | 20 ++++++++++++--------
 refs.c      |  3 ++-
 revision.c  |  3 ++-
 sha1-name.c | 29 ++++++++++++++++++-----------
 4 files changed, 34 insertions(+), 21 deletions(-)

diff --git a/cache.h b/cache.h
index 4cad61ffa4..4f16a57ba4 100644
--- a/cache.h
+++ b/cache.h
@@ -1557,21 +1557,25 @@ int parse_oid_hex_any(const char *hex, struct object_id *oid, const char **end);
  *
  * If the input was ok but there are not N branch switches in the
  * reflog, it returns 0.
- *
- * If "allowed" is non-zero, it is a treated as a bitfield of allowable
- * expansions: local branches ("refs/heads/"), remote branches
- * ("refs/remotes/"), or "HEAD". If no "allowed" bits are set, any expansion is
- * allowed, even ones to refs outside of those namespaces.
  */
 #define INTERPRET_BRANCH_LOCAL (1<<0)
 #define INTERPRET_BRANCH_REMOTE (1<<1)
 #define INTERPRET_BRANCH_HEAD (1<<2)
+struct interpret_branch_name_options {
+	/*
+	 * If "allowed" is non-zero, it is a treated as a bitfield of allowable
+	 * expansions: local branches ("refs/heads/"), remote branches
+	 * ("refs/remotes/"), or "HEAD". If no "allowed" bits are set, any expansion is
+	 * allowed, even ones to refs outside of those namespaces.
+	 */
+	unsigned allowed;
+};
 int repo_interpret_branch_name(struct repository *r,
 			       const char *str, int len,
 			       struct strbuf *buf,
-			       unsigned allowed);
-#define interpret_branch_name(str, len, buf, allowed) \
-	repo_interpret_branch_name(the_repository, str, len, buf, allowed)
+			       const struct interpret_branch_name_options *options);
+#define interpret_branch_name(str, len, buf, options) \
+	repo_interpret_branch_name(the_repository, str, len, buf, options)
 
 int validate_headref(const char *ref);
 
diff --git a/refs.c b/refs.c
index 3ee3afaf41..cf09cd039f 100644
--- a/refs.c
+++ b/refs.c
@@ -601,7 +601,8 @@ static char *substitute_branch_name(struct repository *r,
 				    const char **string, int *len)
 {
 	struct strbuf buf = STRBUF_INIT;
-	int ret = repo_interpret_branch_name(r, *string, *len, &buf, 0);
+	struct interpret_branch_name_options options = { 0 } ;
+	int ret = repo_interpret_branch_name(r, *string, *len, &buf, &options);
 
 	if (ret == *len) {
 		size_t size;
diff --git a/revision.c b/revision.c
index 96630e3186..1247ee4ec8 100644
--- a/revision.c
+++ b/revision.c
@@ -315,13 +315,14 @@ static void add_pending_object_with_path(struct rev_info *revs,
 					 const char *name, unsigned mode,
 					 const char *path)
 {
+	struct interpret_branch_name_options options = { 0 };
 	if (!obj)
 		return;
 	if (revs->no_walk && (obj->flags & UNINTERESTING))
 		revs->no_walk = 0;
 	if (revs->reflog_info && obj->type == OBJ_COMMIT) {
 		struct strbuf buf = STRBUF_INIT;
-		int len = interpret_branch_name(name, 0, &buf, 0);
+		int len = interpret_branch_name(name, 0, &buf, &options);
 
 		if (0 < len && name[len] && buf.len)
 			strbuf_addstr(&buf, name + len);
diff --git a/sha1-name.c b/sha1-name.c
index 0b8cb5247a..a7a9de66c4 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -1427,9 +1427,12 @@ static int reinterpret(struct repository *r,
 	struct strbuf tmp = STRBUF_INIT;
 	int used = buf->len;
 	int ret;
+	struct interpret_branch_name_options options = {
+		.allowed = allowed
+	};
 
 	strbuf_add(buf, name + len, namelen - len);
-	ret = repo_interpret_branch_name(r, buf->buf, buf->len, &tmp, allowed);
+	ret = repo_interpret_branch_name(r, buf->buf, buf->len, &tmp, &options);
 	/* that data was not interpreted, remove our cruft */
 	if (ret < 0) {
 		strbuf_setlen(buf, used);
@@ -1471,7 +1474,7 @@ static int interpret_branch_mark(struct repository *r,
 				 int (*get_mark)(const char *, int),
 				 const char *(*get_data)(struct branch *,
 							 struct strbuf *),
-				 unsigned allowed)
+				 const struct interpret_branch_name_options *options)
 {
 	int len;
 	struct branch *branch;
@@ -1496,7 +1499,7 @@ static int interpret_branch_mark(struct repository *r,
 	if (!value)
 		die("%s", err.buf);
 
-	if (!branch_interpret_allowed(value, allowed))
+	if (!branch_interpret_allowed(value, options->allowed))
 		return -1;
 
 	set_shortened_ref(r, buf, value);
@@ -1506,7 +1509,7 @@ static int interpret_branch_mark(struct repository *r,
 int repo_interpret_branch_name(struct repository *r,
 			       const char *name, int namelen,
 			       struct strbuf *buf,
-			       unsigned allowed)
+			       const struct interpret_branch_name_options *options)
 {
 	char *at;
 	const char *start;
@@ -1515,7 +1518,7 @@ int repo_interpret_branch_name(struct repository *r,
 	if (!namelen)
 		namelen = strlen(name);
 
-	if (!allowed || (allowed & INTERPRET_BRANCH_LOCAL)) {
+	if (!options->allowed || (options->allowed & INTERPRET_BRANCH_LOCAL)) {
 		len = interpret_nth_prior_checkout(r, name, namelen, buf);
 		if (!len) {
 			return len; /* syntax Ok, not enough switches */
@@ -1523,7 +1526,8 @@ int repo_interpret_branch_name(struct repository *r,
 			if (len == namelen)
 				return len; /* consumed all */
 			else
-				return reinterpret(r, name, namelen, len, buf, allowed);
+				return reinterpret(r, name, namelen, len, buf,
+						   options->allowed);
 		}
 	}
 
@@ -1531,22 +1535,22 @@ int repo_interpret_branch_name(struct repository *r,
 	     (at = memchr(start, '@', namelen - (start - name)));
 	     start = at + 1) {
 
-		if (!allowed || (allowed & INTERPRET_BRANCH_HEAD)) {
+		if (!options->allowed || (options->allowed & INTERPRET_BRANCH_HEAD)) {
 			len = interpret_empty_at(name, namelen, at - name, buf);
 			if (len > 0)
 				return reinterpret(r, name, namelen, len, buf,
-						   allowed);
+						   options->allowed);
 		}
 
 		len = interpret_branch_mark(r, name, namelen, at - name, buf,
 					    upstream_mark, branch_get_upstream,
-					    allowed);
+					    options);
 		if (len > 0)
 			return len;
 
 		len = interpret_branch_mark(r, name, namelen, at - name, buf,
 					    push_mark, branch_get_push,
-					    allowed);
+					    options);
 		if (len > 0)
 			return len;
 	}
@@ -1557,7 +1561,10 @@ int repo_interpret_branch_name(struct repository *r,
 void strbuf_branchname(struct strbuf *sb, const char *name, unsigned allowed)
 {
 	int len = strlen(name);
-	int used = interpret_branch_name(name, len, sb, allowed);
+	struct interpret_branch_name_options options = {
+		.allowed = allowed
+	};
+	int used = interpret_branch_name(name, len, sb, &options);
 
 	if (used < 0)
 		used = 0;
-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply related

* [PATCH v2 0/2] Fix for git checkout @{u} (non-local) then git status
From: Jonathan Tan @ 2020-08-29  1:02 UTC (permalink / raw)
  To: git; +Cc: Jonathan Tan, gitster, jrnieder
In-Reply-To: <20200513004058.34456-1-jonathantanmy@google.com>

As Jonathan Nieder wrote [1]:

>  A. Like you're hinting, could dwim_ref get a variant that returns -1
>     instead of die()ing on failure?  That way, we could fulfill the
>     intent described in b397ea48:
> 
> 	When it cannot figure out the original ref, it shows an abbreviated
> 	SHA-1.

Here are some patches that do this. As discussed, this is not the
complete solution, but at least now we can handle unresolvable marks.

I've also switched the commit messages and test from mentioning "git
branch" to mentioning "git status".

[1] https://lore.kernel.org/git/20200827014723.GA750502@google.com/

Jonathan Tan (2):
  sha1-name: replace unsigned int with option struct
  wt-status: tolerate dangling marks

 cache.h           | 27 +++++++++++++++++++--------
 refs.c            | 17 +++++++++++------
 refs.h            |  3 ++-
 revision.c        |  3 ++-
 sha1-name.c       | 45 +++++++++++++++++++++++++++++----------------
 t/t7508-status.sh | 12 ++++++++++++
 wt-status.c       |  2 +-
 7 files changed, 76 insertions(+), 33 deletions(-)

-- 
2.28.0.402.g5ffc5be6b7-goog


^ permalink raw reply

* Re: [PATCH] pack-redundant: gauge the usage before proposing its removal
From: Junio C Hamano @ 2020-08-28 22:45 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, dstolee
In-Reply-To: <20200828091454.GA2140991@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Tue, Aug 25, 2020 at 03:45:52PM -0700, Junio C Hamano wrote:
>
>> The subcommand is unusably slow and the reason why nobody reports it
>> as a performance bug is suspected to be the absense of users.  Let's
>> show a big message that asks the user to tell us that they still
>> care about the command when an attempt is made to run the command,
>> with an escape hatch to override it with a command line option.
>
> I was looking at the history here and noticed this topic, which I
> somehow missed when it happened:
>
>   $ git show -s cf0879f7e98d2213503622f780d2ab0dd3f93477
>   commit cf0879f7e98d2213503622f780d2ab0dd3f93477
>   Merge: 3710f60a80 0e37abd2e8
>   Author: Junio C Hamano <gitster@pobox.com>
>   Date:   Thu Mar 7 09:59:54 2019 +0900
>   
>       Merge branch 'sc/pack-redundant'
>       
>       Update the implementation of pack-redundant for performance in a
>       repository with many packfiles.
>       
>       * sc/pack-redundant:
>         pack-redundant: consistent sort method
>         pack-redundant: rename pack_list.all_objects
>         pack-redundant: new algorithm to find min packs
>         pack-redundant: delete redundant code
>         pack-redundant: delay creation of unique_objects
>         t5323: test cases for git-pack-redundant
>
> So it sounds like:
>
>   - somebody does care enough to use it
>
>   - it may not be horrifically slow anymore
>
> So it may not be worth trying to follow through on the deprecation
> (though the fact that neither of us realized this makes me worried for
> the general state of maintenance of this code).

OK.  Just dropping the topic is the easiest ;-)  Thanks.

^ permalink raw reply

* Re: [PATCH] builtin/init-db.c: fix a sparse warning
From: Eric Sunshine @ 2020-08-28 22:20 UTC (permalink / raw)
  To: Ramsay Jones; +Cc: GIT Mailing-list
In-Reply-To: <8d4b4011-b8a2-c0e0-a3f2-28c7bbec040b@ramsayjones.plus.com>

On Fri, Aug 28, 2020 at 4:47 PM Ramsay Jones
<ramsay@ramsayjones.plus.com> wrote:
> If you need to re-roll your 'es/worktree-repair' branch, could you
> please squash this into the relevant patch (commit 44a466ca7f (init:
> make --separate-git-dir work from within linked worktree, 27-08-2020)).
>
> -                       git_dir = strbuf_detach(&sb, 0);
> +                       git_dir = strbuf_detach(&sb, NULL);

Thanks. I misremembered the purpose of the second argument, thinking
that it was a size hint which would be passed along to strbuf_init().

^ permalink raw reply

* Re: [PATCH v1 3/3] git: catch an attempt to run "git-foo"
From: Junio C Hamano @ 2020-08-28 22:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Jeff King
In-Reply-To: <nycvar.QRO.7.76.6.2008280412030.56@tvgsbejvaqbjf.bet>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> This is needed to handle the case where `argv[0]` contains the full path
> (which is the case on Windows) and the suffix `.exe` (which is also the
> case on Windows).

Oy.  Yeah, I totally forgot about ".exe" thing.

> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
>  git.c  | 3 ++-
>  help.c | 5 ++++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/git.c b/git.c
> index 71ef4835b20e..863fd0c58a66 100644
> --- a/git.c
> +++ b/git.c
> @@ -851,7 +851,8 @@ int cmd_main(int argc, const char **argv)
>  	 * that one cannot handle it.
>  	 */
>  	if (skip_prefix(cmd, "git-", &cmd)) {
> -		warn_on_dashed_git(argv[0]);
> +		strip_extension(&cmd);
> +		warn_on_dashed_git(cmd);

The argv[0] may have been NULL from the beginning of cmd_main(), and
cmd would be "git-help" in such a case.  We used to pass NULL to
warn_on_dashed_git() in such a case because "git-help" is not what
the user typed, but what we internally trigger, so we didn't want
warn_on_dashed_git() to do anything based on that internal string.

As there is no special casing of "help" in warn_on_dashed_git()
mechanism, it probably would start triggering a warning/die when
"git<ENTER>" is typed, no?

+	if (argv[0]) {
		strip_extension(&cmd);
		warn_on_dashed_git(cmd);

may be the minimum fix, but I would strongly prefer to keep the
interface into warn_on_dashed_git() (eh, the most important is the
interface into find_cmdname_help() helper function, which is
designed to be reused by other parts of help.c) to take the full
command name, not without "git-" prefix.  This is primarily because
the entirety of the help.c API is driven by full command names,
without removing "git-" prefix, and it has to, because the help.c
API needs to handle "gitk", from which you cannot remove any "git-"
prefix.

Perhaps

	if (starts_with(cmd, "git-")) {
               	strip_extension(&cmd);
		if (argv[0])
			warn_on_dashed_git(cmd);
		argv[0] = cmd + 4;
                handle_builtin(argc, argv);
		die(...);

How does your handle_builtin() work, by the way?  

The original code (even before we added warn_on_dashed_git() in this
codepath) did not do any strip_extension(), so handle_builtin() can
take commands with ".exe" suffix, but now we are feeding the result
of strip_extension() to it, so it can deal with both? 

Sounds convenient and sloppy (not the handle_builtin's
implementation, but its callers that sometimes feeds the full
executable name, and feeds only the basename some other times) at
the same time.

>  		argv[0] = cmd;
>  		handle_builtin(argc, argv);
> diff --git a/help.c b/help.c
> index c93a76944b00..27b1b26890be 100644
> --- a/help.c
> +++ b/help.c
> @@ -724,9 +724,12 @@ NORETURN void help_unknown_ref(const char *ref, const char *cmd,
>  static struct cmdname_help *find_cmdname_help(const char *name)
>  {
>  	int i;
> +	const char *p;
>
> +	skip_prefix(name, "git-", &name);
>  	for (i = 0; i < ARRAY_SIZE(command_list); i++) {
> -		if (!strcmp(command_list[i].name, name))
> +		if (skip_prefix(command_list[i].name, "git-", &p) &&
> +		    !strcmp(p, name))
>  			return &command_list[i];
>  	}
>  	return NULL;
> --
> 2.28.0.windows.1

^ permalink raw reply

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Raymond E. Pasco @ 2020-08-28 20:58 UTC (permalink / raw)
  To: Drew DeVault, Junio C Hamano, Carlo Marcelo Arenas Belón; +Cc: git
In-Reply-To: <C58UKAYKF1ZY.V5LLW3DY1KAY@homura>

On Fri Aug 28, 2020 at 2:39 PM EDT, Drew DeVault wrote:
> "You can give an empty answer if you are not responding to any message"
> could confuse users, because they might think -v2 is a "response", or
> maybe they've written the patch in response to a discussion on the
> -users mailing list, or any other number of reasons. Now they have to
> figure out how to answer this prompt, even if the mailing list they're
> sending it to isn't expecting it to be a reply. I came up with a number
> of alternative wordings but they all ultimately failed to this same
> problem.

These *are* all reasons why a patch would be sent as a reply. You can
moderate your own lists however you like, but that does not mean that
patches being replies to other mails is 1) wrong or useless, or 2) not
in wide use. I wish the user experience were a bit smoother for those of
us who aren't dedicated Emacs manglers, but things like scissors lines
help a bit.

> "Legitimate" use-cases like qemu-devel or not, this is only ever going
> to confuse new users, and I think that qemu is wrong for encouraging
> users to deal with it.

qemu-devel was given by Carlo as an example of a list which does *not*
keep discussion together in threads; the example give of a list which
*does* keep things together in threads was the Git list itself (surely
the normative list for Git, if there is such a thing). It's too useful
to hackers (and presumably maintainers), I don't think anyone is going
to stop.

Looking at this in isolation from the polemics about list practices,
it's pretty clear that the way send-email prompts for things is not
logical. Axing the In-Reply-To prompt would be one way to make it
logical, because the only other prompt is the To prompt; this is removal
of a prompt users may expect, but by being so inconsistent, the software
itself already removes the prompt when I'm expecting it!

There may be other ways to make it logical, like having the appearance
of the In-Reply-To prompt depend solely on whether a message ID was
already provided or not (but since a message ID is not mandatory, it's
weird to prompt for it and not for, say, additional CCs).

Since every mail client I've ever seen has a "reply" button, I don't
think the concept of a reply is eldritch arcana that users must be
protected from. It's annoying to have to get a Message-ID and paste it,
sure. It would be nice if a mail client could do that for me (I think
dedicated Emacs manglers are insulated from this problem for that
reason). Probably whoever prompted for it in the first place just didn't
want to flub a message by leaving it out, which wouldn't be an issue for
a client command invoked as some variation on "reply all".

^ permalink raw reply

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Drew DeVault @ 2020-08-28 21:35 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carlo Marcelo Arenas Belón, git
In-Reply-To: <xmqq7dtio3pa.fsf@gitster.c.googlers.com>

On Fri Aug 28, 2020 at 5:33 PM EDT, Junio C Hamano wrote:
> I suspect that the legacy reason exists only because it was (thought
> to be) (a) common enough to make a series in response to an existing
> message than a thread starter and (b) "to:" and "in-reply-to:" are
> quite close to make sense to be asked together [*1*], back when the
> current behaviour was established, ? In other words, the "legacy
> reson" may have inherent justification for it.

I understand what you mean, but like you said, without a time machine to
peer into the mind of the original author, this is an assumption without
evidence. If we cannot come up with a renewed justification it... does
it really still hold on the basis that it may or may not have been
adequately justified at some point in the distant past?

> Your primary and only argument to flip the default not to ask about
> in-reply-to is, because, in your opinion, more users would want to
> send thread-starters than responses. I haven't seen the numbers,
> and more importantly I do not think anybody can expected to produce
> numbers that convince everybody (there are biases based on who's
> counting and what is counted), so I cannot buy such an argument
> blindly.

I would agree that this is part of my argument, but it's only a
supporting argument, not my primary argument. The primary argument is
that understanding and answering the prompt requires domain-specific
knowledge about email internals which are not normally presented to the
user. I am unaware of any mail client which makes the meaning or even
the existence of the Message-ID or In-Reply-To headers known to the user
without going out of their way to find it. Explaining the meaning of
these fields, how to find them for their mail client, and when to use
them, would be well outside the scope for the prompt itself.

No one has yet come forward to vouch for this branch as something they
actually depend on, by the way. It's just been people who are unaffected
making arguments that such a user may exist in theory. Maybe we could
figure it for sure out by putting a prompt in this branch, like you
initially suggested?

> That makes it the safest thing to give users a new choice without
> changing the default. That would allow us at least move forward.

Well, it'd be a start, and I may very well write that patch for the sake
of moving forward, as you say. But it doesn't solve the problem I came
here to solve, unless viewed as the first of several steps towards its
eventual removal.

^ permalink raw reply

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Junio C Hamano @ 2020-08-28 21:33 UTC (permalink / raw)
  To: Drew DeVault; +Cc: Carlo Marcelo Arenas Belón, git
In-Reply-To: <C58XLLAE3SMA.3T1C6DXZ4VSWA@homura>

"Drew DeVault" <sir@cmpwn.com> writes:

> This would be better, yeah, but at that point it's pretty weird, like
> why are we prompting for this CLI flag and not any others? The answer is
> just for legacy reasons. There's no inherent justification for it.

I suspect that the legacy reason exists only because it was (thought
to be) (a) common enough to make a series in response to an existing
message than a thread starter and (b) "to:" and "in-reply-to:" are
quite close to make sense to be asked together [*1*], back when the
current behaviour was established, ?  In other words, the "legacy
reson" may have inherent justification for it.

Your primary and only argument to flip the default not to ask about
in-reply-to is, because, in your opinion, more users would want to
send thread-starters than responses.  I haven't seen the numbers,
and more importantly I do not think anybody can expected to produce
numbers that convince everybody (there are biases based on who's
counting and what is counted), so I cannot buy such an argument
blindly.

The weighing done, perhaps in the original author's mind, back when
send-email was written would certainly not be based on any concrete
numbers, either.  But as one undisputable fact we know [*2*] is that
quite many people are already used to the behaviour we currently
have.  While some of them may welcome change, others will complain
if they are forced to relearn what they have committed to their
muscle memory.  

That makes it the safest thing to give users a new choice without
changing the default.  That would allow us at least move forward.



[Footnote]

*1* After all, these two affect where the readers find the message,
    i.e. in the archive or mailbox of which mailing list, and where
    in relation to other messages in the list.

*2* Purely based on how widely git is used.

^ permalink raw reply

* Re: [PATCH v2] midx: traverse the local MIDX first
From: Jeff King @ 2020-08-28 21:19 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, gitster
In-Reply-To: <20200828202213.GA24009@nand.local>

On Fri, Aug 28, 2020 at 04:22:13PM -0400, Taylor Blau wrote:

> This patch addresses that by:
> 
>   - placing the local MIDX first in the chain when calling
>     'prepare_multi_pack_index_one()', and
> 
>   - introducing a new 'get_local_multi_pack_index()', which explicitly
>     returns the repository-local MIDX, if any.
> 
> Don't impose an additional order on the MIDX's '->next' pointer beyond
> that the first item in the chain must be local if one exists so that we
> avoid a quadratic insertion.

This version looks fine to me.

Thinking on it a bit more, we probably want this "local one is first"
behavior even without it fixing this bug. For normal packs, we always
prefer local copies over alternates, under the assumption that
alternates are likely to be more expensive to access (e.g., shared nfs).

Of course that somewhat goes out the window since we re-order lookups
these days based on where we've found previous objects (my mru stuff,
but even before that the single "last pack" variable). But it makes
sense to at least start with the local ones being given priority.

I don't think we do any mru tricks with the midx list, so it would stay
in local-first mode always, which is reasonable (you probably don't have
so many midx's that any reordering is worth it anyway).

It does mean we may consult an alternates midx before any local non-midx
packs, which _could_ be slower. But since this is all guesses and
heuristics anyway, I'd wait until somebody has a concrete case where
they can demonstrate a different ordering scheme works better.

-Peff

^ permalink raw reply

* Re: [PATCH] bisect: swap command-line options in documentation
From: Junio C Hamano @ 2020-08-28 21:05 UTC (permalink / raw)
  To: Hugo Locurcio via GitGitGadget; +Cc: git, Hugo Locurcio
In-Reply-To: <pull.711.git.1598628679830.gitgitgadget@gmail.com>

"Hugo Locurcio via GitGitGadget" <gitgitgadget@gmail.com> writes:

> + git bisect start [--term-{new,bad}=<term> --term-{old,good}=<term>]
>  		  [--no-checkout] [--first-parent] [<bad> [<good>...]] [--] [<paths>...]

Yeah, matching the --term-{bad,good} and the actual arguments may
make it easier to see the correspondence between them.  Good idea.

In the old days, back when there weren't "--term-*=<myword>", it was
much simpler.  All people have to do was to remember that <bad> and
<good> are ordered alphabetically.

Will queue.  Thanks.

^ permalink raw reply

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Drew DeVault @ 2020-08-28 21:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Carlo Marcelo Arenas Belón, git
In-Reply-To: <xmqqk0xio59r.fsf@gitster.c.googlers.com>

On Fri Aug 28, 2020 at 5:00 PM EDT, Junio C Hamano wrote:
> "Drew DeVault" <sir@cmpwn.com> writes:
>
> > "You can give an empty answer if you are not responding to any message"
> > could confuse users, because they might think -v2 is a "response", or
> > maybe they've written the patch in response to a discussion on the
> > -users mailing list, or any other number of reasons.
>
> "Type the value you would have given to --in-reply-to command line
> option (if you forgot to use it), or leave this field empty"
> perhaps? Those who do not know should be able to learn what
> "--in-reply-to" is. A prompt help is not the place to do a
> documentation.

This would be better, yeah, but at that point it's pretty weird, like
why are we prompting for this CLI flag and not any others? The answer is
just for legacy reasons. There's no inherent justification for it.

> > I hate to be a nuisance over such a seemingly simple problem, but there
> > are a lot of new users who are struggling here and I care about their
> > struggle. What path should we take to fixing this issue for them?
>
> The ideal way would be to craft step (0) well enough so that new
> users trigger the To: prompt in the first place, which would
> automatically make the problem disappear ;-)

Do you mean teaching users how to use --to upfront? Honestly the prompt
seems totally fine to me, there's nothing wrong with a use-case which
calls for typing your To address into the prompt. Hell, often I'll use
it, or I'll use --annotate and write my To addresses in vim.

^ permalink raw reply

* Re: [PATCH] po: add missing letter for French message
From: Junio C Hamano @ 2020-08-28 21:00 UTC (permalink / raw)
  To: Jean-Noël AVILA; +Cc: brian m. carlson, git
In-Reply-To: <2398329.Am1n9C2Xzm@cayenne>

Jean-Noël AVILA <jn.avila@free.fr> writes:

> On Friday, 28 August 2020 01:04:02 CEST Junio C Hamano wrote:
>> "brian m. carlson" <sandals@crustytoothpaste.net> writes:
>> 
>> > Add the missing "e" in "de".  While it is possible in French to omit it,
>> > that only occurs with an apostrophe and only when the next word starts
>> > with a vowel or mute h, which is not the case here.
>> >
>> > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
>> > ---
>> > I noticed this the other day when trying to delete a remote branch that
>> > I'd already deleted.  I'm not sure what the preferred approach is for
>> > this, whether Junio should pick it up or whether Jean-Noël will want to
>> > incorporate it first, but I've CC'd both so y'all can fight it out.
>> 
>> Unless it is in the pre-release period (in which case I'd prefer not
>> to touch po/ myself at all, to give i18n/l10n teams a stable base to
>> work from), I can take it dircetly as long as somebody from the l10n
>> team for the language gives an Ack, as I cannot read most of the
>> files under po/ directory.
>> 
>> Thanks.
>> 
>
> Acked-by: Jean-Noël Avila <jn.avila@free.fr>

Thanks, will queue.

^ permalink raw reply

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Junio C Hamano @ 2020-08-28 21:00 UTC (permalink / raw)
  To: Drew DeVault; +Cc: Carlo Marcelo Arenas Belón, git
In-Reply-To: <C58UKAYKF1ZY.V5LLW3DY1KAY@homura>

"Drew DeVault" <sir@cmpwn.com> writes:

> "You can give an empty answer if you are not responding to any message"
> could confuse users, because they might think -v2 is a "response", or
> maybe they've written the patch in response to a discussion on the
> -users mailing list, or any other number of reasons.

"Type the value you would have given to --in-reply-to command line
option (if you forgot to use it), or leave this field empty"
perhaps?  Those who do not know should be able to learn what
"--in-reply-to" is.  A prompt help is not the place to do a
documentation.

> I hate to be a nuisance over such a seemingly simple problem, but there
> are a lot of new users who are struggling here and I care about their
> struggle. What path should we take to fixing this issue for them?

The ideal way would be to craft step (0) well enough so that new
users trigger the To: prompt in the first place, which would
automatically make the problem disappear ;-)


^ permalink raw reply

* Re: What's cooking in git.git (Aug 2020, #07; Thu, 27)
From: Junio C Hamano @ 2020-08-28 20:51 UTC (permalink / raw)
  To: Hariom verma; +Cc: git
In-Reply-To: <CA+CkUQ_=qm2x2zKvgHnvKLLA_A23gTNm8Q8CeWxLFGSBi=9gKw@mail.gmail.com>

Hariom verma <hariom18599@gmail.com> writes:

>> * hv/ref-filter-misc (2020-08-17) 9 commits
>>   (merged to 'next' on 2020-08-27 at c015fa6b0f)
>>  + ref-filter: add `sanitize` option for 'subject' atom
>>  + format-support: move `format_sanitized_subject()` from pretty
>>  + pretty: refactor `format_sanitized_subject()`
>>  + ref-filter: add `short` modifier to 'parent' atom
>>  + ref-filter: add `short` modifier to 'tree' atom
>>  + ref-filter: rename `objectname` related functions and fields
>>  + ref-filter: modify error messages in `grab_objectname()`
>>  + ref-filter: refactor `grab_objectname()`
>>  + ref-filter: support different email formats
>>
>>  The "--format=" option to the "for-each-ref" command and friends
>>  learned a few more tricks, e.g. the ":short" suffix that applies to
>>  "objectname" now also can be used for "parent", "tree", etc.
>>
>>  Will merge to 'master'.
>
> I sent an updated version of the patch series addressing your comment
> concerning new file format-support.{c,h}[1].

Yikes, my mistake.  As long as you plan to vastly extend what would
be in format-support.[ch], I do not mind to have a pair of separate
files in the end, by the time when we have, say, unified "--format"
support for for-each-ref family (e.g. "%(token)") and log family
(e.g. "%x" fixed few letter combinations).  So the step that moves
some from pretty.c to format-support.c does not bother me all that
much.  It just felt unnecessary within the scope of this series.

But other stuff (like format-sanitized-subject having unnecessary
allocation and unnecessary special casing of LF) are worth fixing in
the version queued above.

Thanks for stopping me.

Let's revert the above out of 'next' and start afresh using v4.

^ permalink raw reply

* [PATCH] builtin/init-db.c: fix a sparse warning
From: Ramsay Jones @ 2020-08-28 20:46 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: GIT Mailing-list


Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
---

Hi Eric,

If you need to re-roll your 'es/worktree-repair' branch, could you
please squash this into the relevant patch (commit 44a466ca7f (init:
make --separate-git-dir work from within linked worktree, 27-08-2020)).

Thanks!

ATB,
Ramsay Jones

 builtin/init-db.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/init-db.c b/builtin/init-db.c
index 6a94d20a2e..cd3e760541 100644
--- a/builtin/init-db.c
+++ b/builtin/init-db.c
@@ -661,7 +661,7 @@ int cmd_init_db(int argc, const char **argv, const char *prefix)
 			if (chdir(mainwt.buf) < 0)
 				die_errno(_("cannot chdir to %s"), mainwt.buf);
 			strbuf_release(&mainwt);
-			git_dir = strbuf_detach(&sb, 0);
+			git_dir = strbuf_detach(&sb, NULL);
 		}
 		strbuf_release(&sb);
 	}
-- 
2.28.0

^ permalink raw reply related

* [PATCH v2] midx: traverse the local MIDX first
From: Taylor Blau @ 2020-08-28 20:22 UTC (permalink / raw)
  To: git; +Cc: dstolee, gitster, peff
In-Reply-To: <20200828180621.GA9036@nand.nand.local>

When a repository has an alternate object directory configured, callers
can traverse through each alternate's MIDX by walking the '->next'
pointer.

But, when 'prepare_multi_pack_index_one()' loads multiple MIDXs, it
places the new ones at the front of this pointer chain, not at the end.
This can be confusing for callers such as 'git repack -ad', causing test
failures like in t7700.6 with 'GIT_TEST_MULTI_PACK_INDEX=1'.

The occurs when dropping a pack known to the local MIDX with alternates
configured that have their own MIDX. Since the alternate's MIDX is
returned via 'get_multi_pack_index()', 'midx_contains_pack()' returns
true (which is correct, since it traverses through the '->next' pointer
to find the MIDX in the chain that does contain the requested object).
But, we call 'clear_midx_file()' on 'the_repository', which drops the
MIDX at the path of the first MIDX in the chain, which (in the case of
t7700.6 is the one in the alternate).

This patch addresses that by:

  - placing the local MIDX first in the chain when calling
    'prepare_multi_pack_index_one()', and

  - introducing a new 'get_local_multi_pack_index()', which explicitly
    returns the repository-local MIDX, if any.

Don't impose an additional order on the MIDX's '->next' pointer beyond
that the first item in the chain must be local if one exists so that we
avoid a quadratic insertion.

Likewise, use 'get_local_multi_pack_index()' in
'remove_redundant_pack()' to fix the formerly broken t7700.6 when run
with 'GIT_TEST_MULTI_PACK_INDEX=1'.

Finally, note that the MIDX ordering invariant is only preserved by the
insertion order in 'prepare_packed_git()', which traverses through the
ODB's '->next' pointer, meaning we visit the local object store first.
This fragility makes this an undesirable long-term solution if more
callers are added, but it is acceptable for now since this is the only
caller.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/repack.c |  2 +-
 midx.c           |  8 ++++++--
 packfile.c       | 11 +++++++++++
 packfile.h       |  1 +
 4 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 98fac03946..01e7767c79 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -133,7 +133,7 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list,
 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct multi_pack_index *m = get_multi_pack_index(the_repository);
+	struct multi_pack_index *m = get_local_multi_pack_index(the_repository);
 	strbuf_addf(&buf, "%s.pack", base_name);
 	if (m && midx_contains_pack(m, buf.buf))
 		clear_midx_file(the_repository);
diff --git a/midx.c b/midx.c
index e9b2e1253a..cc19b66152 100644
--- a/midx.c
+++ b/midx.c
@@ -416,8 +416,12 @@ int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, i
 	m = load_multi_pack_index(object_dir, local);
 
 	if (m) {
-		m->next = r->objects->multi_pack_index;
-		r->objects->multi_pack_index = m;
+		struct multi_pack_index *mp = r->objects->multi_pack_index;
+		if (mp) {
+			m->next = mp->next;
+			mp->next = m;
+		} else
+			r->objects->multi_pack_index = m;
 		return 1;
 	}
 
diff --git a/packfile.c b/packfile.c
index 6ab5233613..9ef27508f2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,6 +1027,17 @@ struct multi_pack_index *get_multi_pack_index(struct repository *r)
 	return r->objects->multi_pack_index;
 }
 
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r)
+{
+	struct multi_pack_index *m = get_multi_pack_index(r);
+
+	/* no need to iterate; we always put the local one first (if any) */
+	if (m && m->local)
+		return m;
+
+	return NULL;
+}
+
 struct packed_git *get_all_packs(struct repository *r)
 {
 	struct multi_pack_index *m;
diff --git a/packfile.h b/packfile.h
index 240aa73b95..a58fc738e0 100644
--- a/packfile.h
+++ b/packfile.h
@@ -57,6 +57,7 @@ void install_packed_git(struct repository *r, struct packed_git *pack);
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
 struct multi_pack_index *get_multi_pack_index(struct repository *r);
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r);
 struct packed_git *get_all_packs(struct repository *r);
 
 /*
-- 
2.28.0.338.g87a3b7a5a2.dirty

^ permalink raw reply related

* Re: What's cooking in git.git (Aug 2020, #07; Thu, 27)
From: Hariom verma @ 2020-08-28 20:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqqh7snpxy1.fsf@gitster.c.googlers.com>

Hi,

On Fri, Aug 28, 2020 at 3:14 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> --------------------------------------------------
> [Cooking]
>
> * hv/ref-filter-misc (2020-08-17) 9 commits
>   (merged to 'next' on 2020-08-27 at c015fa6b0f)
>  + ref-filter: add `sanitize` option for 'subject' atom
>  + format-support: move `format_sanitized_subject()` from pretty
>  + pretty: refactor `format_sanitized_subject()`
>  + ref-filter: add `short` modifier to 'parent' atom
>  + ref-filter: add `short` modifier to 'tree' atom
>  + ref-filter: rename `objectname` related functions and fields
>  + ref-filter: modify error messages in `grab_objectname()`
>  + ref-filter: refactor `grab_objectname()`
>  + ref-filter: support different email formats
>
>  The "--format=" option to the "for-each-ref" command and friends
>  learned a few more tricks, e.g. the ":short" suffix that applies to
>  "objectname" now also can be used for "parent", "tree", etc.
>
>  Will merge to 'master'.

I sent an updated version of the patch series addressing your comment
concerning new file format-support.{c,h}[1].

If you are still unsure about adding a new file, you might want to
take a look at the updated version of patch series:
https://public-inbox.org/git/pull.684.v4.git.1598046110.gitgitgadget@gmail.com/

Footnote:
[1]: https://public-inbox.org/git/xmqqlfid1305.fsf@gitster.c.googlers.com/

Thanks,
Hariom

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Jeff King @ 2020-08-28 19:51 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Derrick Stolee, git, dstolee, gitster
In-Reply-To: <20200828190752.GD19328@nand.nand.local>

On Fri, Aug 28, 2020 at 03:07:52PM -0400, Taylor Blau wrote:

> Ditto. Peff and I crossed emails, so I was talking about tidying up his
> earlier patch before I even had seen the second one.
> 
> Peff -- any objections to me squashing this into mine and sending that
> for queueing?

Nope, please do (and feel free to forge my signoff).

-Peff

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Taylor Blau @ 2020-08-28 19:07 UTC (permalink / raw)
  To: Derrick Stolee, Jeff King; +Cc: git, dstolee, gitster
In-Reply-To: <80ba7515-dc6c-3acd-4b53-c60cbdab1633@gmail.com>

On Fri, Aug 28, 2020 at 03:03:09PM -0400, Derrick Stolee wrote:
> On 8/28/2020 2:55 PM, Jeff King wrote:
> > On Fri, Aug 28, 2020 at 02:50:39PM -0400, Jeff King wrote:
> >
> >> So I'd be tempted to say that the latter callers should be using a
> >> separate function that gives them what they want. That lets them avoid
> >> being too intimate with the details of how we order things.
> >>
> >> The patch below illustrates that.  It also changes the existing function
> >> name to avoid confusion and to help audit the existing callers, but
> >> that's optional and maybe not worth it.
> >
> > And here's the same concept as a more minimal change, suitable for
> > squashing into yours. The advantage is that it keeps the "the local one
> > goes first" logic in one abstracted spot.
>
> This is nice because it is more future-proof: if we needed to
> change the order of the midx list, then we could update the
> implementation of this method instead of every caller.
>
> Personally, I prefer this one (squashed).

Ditto. Peff and I crossed emails, so I was talking about tidying up his
earlier patch before I even had seen the second one.

Peff -- any objections to me squashing this into mine and sending that
for queueing?

> Thanks,
> -Stolee

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Derrick Stolee @ 2020-08-28 19:03 UTC (permalink / raw)
  To: Jeff King, Taylor Blau; +Cc: git, dstolee, gitster
In-Reply-To: <20200828185531.GB2171353@coredump.intra.peff.net>

On 8/28/2020 2:55 PM, Jeff King wrote:
> On Fri, Aug 28, 2020 at 02:50:39PM -0400, Jeff King wrote:
> 
>> So I'd be tempted to say that the latter callers should be using a
>> separate function that gives them what they want. That lets them avoid
>> being too intimate with the details of how we order things.
>>
>> The patch below illustrates that.  It also changes the existing function
>> name to avoid confusion and to help audit the existing callers, but
>> that's optional and maybe not worth it.
> 
> And here's the same concept as a more minimal change, suitable for
> squashing into yours. The advantage is that it keeps the "the local one
> goes first" logic in one abstracted spot.

This is nice because it is more future-proof: if we needed to
change the order of the midx list, then we could update the
implementation of this method instead of every caller.

Personally, I prefer this one (squashed).

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Taylor Blau @ 2020-08-28 18:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Taylor Blau, git, dstolee, gitster
In-Reply-To: <20200828185039.GA2171353@coredump.intra.peff.net>

On Fri, Aug 28, 2020 at 02:50:39PM -0400, Jeff King wrote:
> On Fri, Aug 28, 2020 at 02:06:21PM -0400, Taylor Blau wrote:
> [ snip ]
>
> > This patch bandaids that situation by trying to place the local MIDX
> > first in the chain when calling 'prepare_multi_pack_index_one()'.
> > Specifically, it always inserts all MIDXs loads subsequent to the local
> > one as the head of the tail of the MIDX chain. This makes it so that we
> > don't have a quadratic insertion while still keeping the local MIDX at
> > the front of the list. Likewise, it avoids an 'O(m*n)' lookup in
> > 'remove_redundant_pack()' where 'm' is the number of redundant packs and
> > 'n' is the number of alternates.
> >
> > Protect 'remove_redundant_pack()' against a case where alternates with
> > MIDXs exist, but no local MIDX exists by first checking that 'm->local'
> > before asking whether or not it contains the requested pack.
>
> It seems like the root of the problem is that get_multi_pack_index() is
> being used for two different things:
>
>   - most callers want to iterate over all of the possible midx files,
>     because they're trying to look up an object.
>
>   - this caller wants the single midx for the local repository (I
>     wondered if there might be more of these that we just never noticed
>     because the test suite doesn't use alternates, but having just
>     audited them all, the answer is no).
>
> So I'd be tempted to say that the latter callers should be using a
> separate function that gives them what they want. That lets them avoid
> being too intimate with the details of how we order things.
>
> The patch below illustrates that.  It also changes the existing function
> name to avoid confusion and to help audit the existing callers, but
> that's optional and maybe not worth it.
>
> It does do the linear lookup, so it has the O(m*n) you mentioned. But
> the number of alternates is generally small, and I'd be shocked if this
> was the first m*n loop over the number of alternates. However, we could
> still do the ordering thing and just keep the details inside the new
> function.

I'd be much happier with this patch.

If you wanted to go further, we could do both, and tighten up the
insertion code to make sure that the local MIDX is always first from the
perspective of 'r->objects->multi_pack_index' so that the linear lookup
drops to constant.

But, it might be overkill, since I also tend to think that the number of
alternates is small, and we're not even talking about a difference of
milliseconds here.

So, I'm happy to clean it up for you and forge your sign-off with
permission, or otherwise you're welcome to do so, too.

Thanks,
Taylor

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Jeff King @ 2020-08-28 18:55 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, gitster
In-Reply-To: <20200828185039.GA2171353@coredump.intra.peff.net>

On Fri, Aug 28, 2020 at 02:50:39PM -0400, Jeff King wrote:

> So I'd be tempted to say that the latter callers should be using a
> separate function that gives them what they want. That lets them avoid
> being too intimate with the details of how we order things.
> 
> The patch below illustrates that.  It also changes the existing function
> name to avoid confusion and to help audit the existing callers, but
> that's optional and maybe not worth it.

And here's the same concept as a more minimal change, suitable for
squashing into yours. The advantage is that it keeps the "the local one
goes first" logic in one abstracted spot.

diff --git a/builtin/repack.c b/builtin/repack.c
index 28b0c1bf1b..60cb196956 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -133,9 +133,9 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list,
 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct multi_pack_index *m = get_multi_pack_index(the_repository);
+	struct multi_pack_index *m = get_local_multi_pack_index(the_repository);
 	strbuf_addf(&buf, "%s.pack", base_name);
-	if (m && m->local && midx_contains_pack(m, buf.buf))
+	if (m && midx_contains_pack(m, buf.buf))
 		clear_midx_file(the_repository);
 	strbuf_insertf(&buf, 0, "%s/", dir_name);
 	unlink_pack_path(buf.buf, 1);
diff --git a/packfile.c b/packfile.c
index 6ab5233613..9ef27508f2 100644
--- a/packfile.c
+++ b/packfile.c
@@ -1027,6 +1027,17 @@ struct multi_pack_index *get_multi_pack_index(struct repository *r)
 	return r->objects->multi_pack_index;
 }
 
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r)
+{
+	struct multi_pack_index *m = get_multi_pack_index(r);
+
+	/* no need to iterate; we always put the local one first (if any) */
+	if (m && m->local)
+		return m;
+
+	return NULL;
+}
+
 struct packed_git *get_all_packs(struct repository *r)
 {
 	struct multi_pack_index *m;
diff --git a/packfile.h b/packfile.h
index 240aa73b95..a58fc738e0 100644
--- a/packfile.h
+++ b/packfile.h
@@ -57,6 +57,7 @@ void install_packed_git(struct repository *r, struct packed_git *pack);
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
 struct multi_pack_index *get_multi_pack_index(struct repository *r);
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r);
 struct packed_git *get_all_packs(struct repository *r);
 
 /*

^ permalink raw reply related

* Re: [PATCH] midx: traverse the local MIDX first
From: Jeff King @ 2020-08-28 18:50 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, dstolee, gitster
In-Reply-To: <20200828180621.GA9036@nand.nand.local>

On Fri, Aug 28, 2020 at 02:06:21PM -0400, Taylor Blau wrote:

> The occurs when dropping a pack known to the local MIDX with alternates
> configured that have their own MIDX. Since the alternate's MIDX is
> returned via 'get_multi_pack_index()', 'midx_contains_pack()' returns
> true (which is correct, since it traverses through the '->next' pointer
> to find the MIDX in the chain that does contain the requested object).
> But, we call 'clear_midx_file()' on 'the_repository', which drops the
> MIDX at the path of the first MIDX in the chain, which (in the case of
> t7700.6 is the one in the alternate).

OK, that makes sense for what we're seeing.

> This patch bandaids that situation by trying to place the local MIDX
> first in the chain when calling 'prepare_multi_pack_index_one()'.
> Specifically, it always inserts all MIDXs loads subsequent to the local
> one as the head of the tail of the MIDX chain. This makes it so that we
> don't have a quadratic insertion while still keeping the local MIDX at
> the front of the list. Likewise, it avoids an 'O(m*n)' lookup in
> 'remove_redundant_pack()' where 'm' is the number of redundant packs and
> 'n' is the number of alternates.
> 
> Protect 'remove_redundant_pack()' against a case where alternates with
> MIDXs exist, but no local MIDX exists by first checking that 'm->local'
> before asking whether or not it contains the requested pack.

It seems like the root of the problem is that get_multi_pack_index() is
being used for two different things:

  - most callers want to iterate over all of the possible midx files,
    because they're trying to look up an object.

  - this caller wants the single midx for the local repository (I
    wondered if there might be more of these that we just never noticed
    because the test suite doesn't use alternates, but having just
    audited them all, the answer is no).

So I'd be tempted to say that the latter callers should be using a
separate function that gives them what they want. That lets them avoid
being too intimate with the details of how we order things.

The patch below illustrates that.  It also changes the existing function
name to avoid confusion and to help audit the existing callers, but
that's optional and maybe not worth it.

It does do the linear lookup, so it has the O(m*n) you mentioned. But
the number of alternates is generally small, and I'd be shocked if this
was the first m*n loop over the number of alternates. However, we could
still do the ordering thing and just keep the details inside the new
function.

---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 7016b28485..a9be7480df 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1246,7 +1246,7 @@ static int want_object_in_pack(const struct object_id *oid,
 			return want;
 	}
 
-	for (m = get_multi_pack_index(the_repository); m; m = m->next) {
+	for (m = get_all_multi_pack_index(the_repository); m; m = m->next) {
 		struct pack_entry e;
 		if (fill_midx_entry(the_repository, oid, &e, m)) {
 			struct packed_git *p = e.p;
diff --git a/builtin/repack.c b/builtin/repack.c
index f10f52779c..60cb196956 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -133,7 +133,7 @@ static void get_non_kept_pack_filenames(struct string_list *fname_list,
 static void remove_redundant_pack(const char *dir_name, const char *base_name)
 {
 	struct strbuf buf = STRBUF_INIT;
-	struct multi_pack_index *m = get_multi_pack_index(the_repository);
+	struct multi_pack_index *m = get_local_multi_pack_index(the_repository);
 	strbuf_addf(&buf, "%s.pack", base_name);
 	if (m && midx_contains_pack(m, buf.buf))
 		clear_midx_file(the_repository);
diff --git a/packfile.c b/packfile.c
index 6ab5233613..ece317642a 100644
--- a/packfile.c
+++ b/packfile.c
@@ -915,7 +915,7 @@ unsigned long repo_approximate_object_count(struct repository *r)
 
 		prepare_packed_git(r);
 		count = 0;
-		for (m = get_multi_pack_index(r); m; m = m->next)
+		for (m = get_all_multi_pack_index(r); m; m = m->next)
 			count += m->num_objects;
 		for (p = r->objects->packed_git; p; p = p->next) {
 			if (open_pack_index(p))
@@ -1021,12 +1021,24 @@ struct packed_git *get_packed_git(struct repository *r)
 	return r->objects->packed_git;
 }
 
-struct multi_pack_index *get_multi_pack_index(struct repository *r)
+struct multi_pack_index *get_all_multi_pack_index(struct repository *r)
 {
 	prepare_packed_git(r);
 	return r->objects->multi_pack_index;
 }
 
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r)
+{
+	struct multi_pack_index *m;
+
+	for (m = get_all_multi_pack_index(r); m; m = m->next) {
+		if (m->local)
+			break;
+	}
+
+	return m;
+}
+
 struct packed_git *get_all_packs(struct repository *r)
 {
 	struct multi_pack_index *m;
diff --git a/packfile.h b/packfile.h
index 240aa73b95..2516eb4667 100644
--- a/packfile.h
+++ b/packfile.h
@@ -56,7 +56,8 @@ void install_packed_git(struct repository *r, struct packed_git *pack);
 
 struct packed_git *get_packed_git(struct repository *r);
 struct list_head *get_packed_git_mru(struct repository *r);
-struct multi_pack_index *get_multi_pack_index(struct repository *r);
+struct multi_pack_index *get_all_multi_pack_index(struct repository *r);
+struct multi_pack_index *get_local_multi_pack_index(struct repository *r);
 struct packed_git *get_all_packs(struct repository *r);
 
 /*
diff --git a/sha1-name.c b/sha1-name.c
index 0b8cb5247a..be6b675953 100644
--- a/sha1-name.c
+++ b/sha1-name.c
@@ -184,7 +184,7 @@ static void find_short_packed_object(struct disambiguate_state *ds)
 	struct multi_pack_index *m;
 	struct packed_git *p;
 
-	for (m = get_multi_pack_index(ds->repo); m && !ds->ambiguous;
+	for (m = get_all_multi_pack_index(ds->repo); m && !ds->ambiguous;
 	     m = m->next)
 		unique_in_midx(m, ds);
 	for (p = get_packed_git(ds->repo); p && !ds->ambiguous;
@@ -660,7 +660,7 @@ static void find_abbrev_len_packed(struct min_abbrev_data *mad)
 	struct multi_pack_index *m;
 	struct packed_git *p;
 
-	for (m = get_multi_pack_index(mad->repo); m; m = m->next)
+	for (m = get_all_multi_pack_index(mad->repo); m; m = m->next)
 		find_abbrev_len_for_midx(m, mad);
 	for (p = get_packed_git(mad->repo); p; p = p->next)
 		find_abbrev_len_for_pack(p, mad);

^ permalink raw reply related

* Re: [PATCH] send-email: do not prompt for In-Reply-To
From: Drew DeVault @ 2020-08-28 18:39 UTC (permalink / raw)
  To: Junio C Hamano, Carlo Marcelo Arenas Belón; +Cc: git
In-Reply-To: <xmqqd03bpuh7.fsf@gitster.c.googlers.com>

On Thu Aug 27, 2020 at 6:57 PM EDT, Junio C Hamano wrote:
> To help those who do not want to add this header, it would probably
> be more helpful to tell what to do when prompted (like "you can give
> an empty answer to tell the command that you are not responding to
> any message").

I'm trying to come up with a message like this for a reduced-scope
initial patch, but I'm drawing up a blank when the goal is to avoid
confusing users who don't understand this. The goal is to remove
domain-specific knowledge about email, namely how the In-Reply-To and
Message-Id headers work, which is uncommon knowledge among new users.

"You can give an empty answer if you are not responding to any message"
could confuse users, because they might think -v2 is a "response", or
maybe they've written the patch in response to a discussion on the
-users mailing list, or any other number of reasons. Now they have to
figure out how to answer this prompt, even if the mailing list they're
sending it to isn't expecting it to be a reply. I came up with a number
of alternative wordings but they all ultimately failed to this same
problem.

In-Reply-To is such an arcane and tangental piece of knowledge so far
removed from git, and so little information is given to the user to help
them understand (1) what it is, and (2) whether they need it or not, and
(3) where to find it, that this prompt just seems totally awful to me.
It'd be like if we prompted someone to enter their display EDID when
filing a bug for a video game. Could it be useful? It's unlikely. Is the
user likely to know what the hell an EDID is? Almost certainly not!

"Legitimate" use-cases like qemu-devel or not, this is only ever going
to confuse new users, and I think that qemu is wrong for encouraging
users to deal with it.

Or they would be wrong, but I looked into it and, the qemu wiki advises
that the user does *not* use in-reply-to:

https://wiki.qemu.org/Contribute/SubmitAPatch

Nor does patchew make it easy to extract the message ID from their
interface for this use-case. I'm not sure where this qemu idea is coming
from.

Is compatibility-breaking migrations a nightmare? Well, maybe.

Is that nightmare worth such a trivial problem? Well, it seems trivial
to those of us who have been using it for years, but I can assure you
there are plenty of new users who shut down at this step.

I hate to be a nuisance over such a seemingly simple problem, but there
are a lot of new users who are struggling here and I care about their
struggle. What path should we take to fixing this issue for them?

^ permalink raw reply

* Re: [PATCH] midx: traverse the local MIDX first
From: Derrick Stolee @ 2020-08-28 18:27 UTC (permalink / raw)
  To: Taylor Blau, git; +Cc: dstolee, gitster, peff
In-Reply-To: <20200828180621.GA9036@nand.nand.local>

On 8/28/2020 2:06 PM, Taylor Blau wrote:
> This invariant is only preserved by the insertion order in
> 'prepare_packed_git()', which traverses through the ODB's '->next'
> pointer, meaning we visit the local object store first. This fragility
> makes this an undesirable long-term solution, but it is acceptable for
> now since this is the only caller.
> 
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
> This is kind of a hack, but the order that we call
> 'prepare_multi_pack_index_one()' from 'prepare_packed_git()' makes it
> acceptable, at least in my own assessment.

The natural alternative would be to scan the list _after_ all are
inserted and pull any MIDX marked "local" to the front of the list.
Such a check would need to happen in the same method that iterates
over all alternates, so that seems a bit redundant.

While perhaps a bit hack-ish, I think this is a sound approach.
And, we have a test that will detect change in behavior here!

Thanks,
-Stolee


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox