* [PATCH 2/4] notes: use exisisting function stream_blob_to_fd
From: Maarten Bosmans @ 2024-02-05 20:49 UTC (permalink / raw)
To: git; +Cc: Teng Long, Maarten Bosmans
In-Reply-To: <20240205204932.16653-1-maarten.bosmans@vortech.nl>
From: Maarten Bosmans <mkbosmans@gmail.com>
From: Maarten Bosmans <maarten.bosmans@vortech.nl>
Use functionality from streaming.c and remove the copy_obj_to_fd()
function local to notes.c.
Streaming the blob to stdout instead of copying through an
intermediate buffer might also be more efficient, but at the
size a typical note is, this is unlikely to matter a lot.
Signed-off-by: Maarten Bosmans <maarten.bosmans@vortech.nl>
---
builtin/notes.c | 14 +-------------
1 file changed, 1 insertion(+), 13 deletions(-)
diff --git a/builtin/notes.c b/builtin/notes.c
index 8efe9809b2..048b8c559c 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -149,18 +149,6 @@ static int list_each_note(const struct object_id *object_oid,
return 0;
}
-static void copy_obj_to_fd(int fd, const struct object_id *oid)
-{
- unsigned long size;
- enum object_type type;
- char *buf = repo_read_object_file(the_repository, oid, &type, &size);
- if (buf) {
- if (size)
- write_or_die(fd, buf, size);
- free(buf);
- }
-}
-
static void write_commented_object(int fd, const struct object_id *object)
{
struct child_process show = CHILD_PROCESS_INIT;
@@ -205,7 +193,7 @@ static void prepare_note_data(const struct object_id *object, struct note_data *
if (d->given)
write_or_die(fd, d->buf.buf, d->buf.len);
else if (old_note)
- copy_obj_to_fd(fd, old_note);
+ stream_blob_to_fd(fd, old_note, NULL, 0);
strbuf_addch(&buf, '\n');
strbuf_add_commented_lines(&buf, "\n", strlen("\n"), comment_line_char);
--
2.35.3
^ permalink raw reply related
* [PATCH 1/4] notes: print note blob to stdout directly
From: Maarten Bosmans @ 2024-02-05 20:49 UTC (permalink / raw)
To: git; +Cc: Teng Long, Maarten Bosmans
In-Reply-To: <20240205204932.16653-1-maarten.bosmans@vortech.nl>
From: Maarten Bosmans <mkbosmans@gmail.com>
From: Maarten Bosmans <maarten.bosmans@vortech.nl>
Avoid the need to launch a subprocess by calling stream_blob_to_fd
directly. This does not only get rid of the overhead of a separate
child process, but also avoids the initalization of the whole log
machinery that `git show` does. That is needed for example to show
decorated commits and applying the mailmap. For simply displaying
a blob however, the only useful thing show does is enabling the pager.
Signed-off-by: Maarten Bosmans <maarten.bosmans@vortech.nl>
---
builtin/notes.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/builtin/notes.c b/builtin/notes.c
index e65cae0bcf..8efe9809b2 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -20,7 +20,8 @@
#include "repository.h"
#include "pretty.h"
#include "refs.h"
-#include "exec-cmd.h"
+#include "pager.h"
+#include "streaming.h"
#include "run-command.h"
#include "parse-options.h"
#include "string-list.h"
@@ -751,7 +752,7 @@ static int show(int argc, const char **argv, const char *prefix)
struct notes_tree *t;
struct object_id object;
const struct object_id *note;
- int retval;
+ int retval = 0;
struct option options[] = {
OPT_END()
};
@@ -776,8 +777,9 @@ static int show(int argc, const char **argv, const char *prefix)
retval = error(_("no note found for object %s."),
oid_to_hex(&object));
else {
- const char *show_args[3] = {"show", oid_to_hex(note), NULL};
- retval = execv_git_cmd(show_args);
+ setup_pager();
+ if (stream_blob_to_fd(1, note, NULL, 0))
+ die(_("object %s is not a blob"), oid_to_hex(note));
}
free_notes(t);
return retval;
--
2.35.3
^ permalink raw reply related
* [PATCH 3/4] notes: do not clean up right before calling die()
From: Maarten Bosmans @ 2024-02-05 20:49 UTC (permalink / raw)
To: git; +Cc: Teng Long, Maarten Bosmans
In-Reply-To: <20240205204932.16653-1-maarten.bosmans@vortech.nl>
From: Maarten Bosmans <mkbosmans@gmail.com>
From: Maarten Bosmans <maarten.bosmans@vortech.nl>
For consistency with the other uses of die() in the same function.
Signed-off-by: Maarten Bosmans <maarten.bosmans@vortech.nl>
---
builtin/notes.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/builtin/notes.c b/builtin/notes.c
index 048b8c559c..0e98a820dd 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -310,12 +310,8 @@ static int parse_reuse_arg(const struct option *opt, const char *arg, int unset)
die(_("failed to resolve '%s' as a valid ref."), arg);
if (!(value = repo_read_object_file(the_repository, &object, &type, &len)))
die(_("failed to read object '%s'."), arg);
- if (type != OBJ_BLOB) {
- strbuf_release(&msg->buf);
- free(value);
- free(msg);
+ if (type != OBJ_BLOB)
die(_("cannot read note data from non-blob object '%s'."), arg);
- }
strbuf_add(&msg->buf, value, len);
free(value);
--
2.35.3
^ permalink raw reply related
* [PATCH 4/4] notes: use strbuf_attach to take ownership of the object contents
From: Maarten Bosmans @ 2024-02-05 20:49 UTC (permalink / raw)
To: git; +Cc: Teng Long, Maarten Bosmans
In-Reply-To: <20240205204932.16653-1-maarten.bosmans@vortech.nl>
From: Maarten Bosmans <mkbosmans@gmail.com>
From: Maarten Bosmans <maarten.bosmans@vortech.nl>
Avoid an extra allocation in the strbuf when pushing the string into it.
Signed-off-by: Maarten Bosmans <maarten.bosmans@vortech.nl>
---
builtin/notes.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/builtin/notes.c b/builtin/notes.c
index 0e98a820dd..71f36667f3 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -313,10 +313,8 @@ static int parse_reuse_arg(const struct option *opt, const char *arg, int unset)
if (type != OBJ_BLOB)
die(_("cannot read note data from non-blob object '%s'."), arg);
- strbuf_add(&msg->buf, value, len);
- free(value);
+ strbuf_attach(&msg->buf, value, len, len + 1);
- msg->buf.len = len;
ALLOC_GROW_BY(d->messages, d->msg_nr, 1, d->msg_alloc);
d->messages[d->msg_nr - 1] = msg;
msg->stripspace = NO_STRIPSPACE;
@@ -702,12 +700,11 @@ static int append_edit(int argc, const char **argv, const char *prefix)
char *prev_buf = repo_read_object_file(the_repository, note, &type, &size);
if (prev_buf && size)
- strbuf_add(&buf, prev_buf, size);
+ strbuf_attach(&buf, prev_buf, size, size + 1);
if (d.buf.len && prev_buf && size)
append_separator(&buf);
strbuf_insert(&d.buf, 0, buf.buf, buf.len);
- free(prev_buf);
strbuf_release(&buf);
}
--
2.35.3
^ permalink raw reply related
* Re: [PATCH 2/4] notes: use exisisting function stream_blob_to_fd
From: Eric Sunshine @ 2024-02-05 22:00 UTC (permalink / raw)
To: Maarten Bosmans; +Cc: git, Teng Long, Maarten Bosmans
In-Reply-To: <20240205204932.16653-3-maarten.bosmans@vortech.nl>
On Mon, Feb 5, 2024 at 4:53 PM Maarten Bosmans <mkbosmans@gmail.com> wrote:
> notes: use exisisting function stream_blob_to_fd
s/exisisting/existing/
> Use functionality from streaming.c and remove the copy_obj_to_fd()
> function local to notes.c.
>
> Streaming the blob to stdout instead of copying through an
> intermediate buffer might also be more efficient, but at the
> size a typical note is, this is unlikely to matter a lot.
>
> Signed-off-by: Maarten Bosmans <maarten.bosmans@vortech.nl>
^ permalink raw reply
* Re: [PATCH 2/2] http: prevent redirect from dropping credentials during reauth
From: brian m. carlson @ 2024-02-05 22:18 UTC (permalink / raw)
To: Quentin Bouget; +Cc: git
In-Reply-To: <CYWT6JWQCTFG.106OCJTV3NQDU@devyard.org>
[-- Attachment #1: Type: text/plain, Size: 1967 bytes --]
On 2024-02-05 at 03:01:17, Quentin Bouget wrote:
> Good point, I had not considered the security implications.
>
> I can see libcurl only reuses credentials after a redirect if the
> hostname has not changed: [1]
>
> By default, libcurl only sends credentials and Authentication
> headers to the initial hostname as given in the original URL, to
> avoid leaking username + password to other sites.
>
> Does it sound OK if I use the credentials provided by the redirect when
> there are any (out of consistency with the current implementation), and
> only allow reusing the current credentials when the redirect and the
> original URLs share the same hostname?
I don't think we can actually rely on that functionality because
`credential.usehttppath` could actually have been set, in which case
we'd need a different credential. For example, I know some forges issue
certain types of tokens that are tied to a specific URL and wouldn't
validate for a redirect, even if it were actually the same repo.
If there are credentials in the URL provided by the redirect, I think it
should be safe to use them; otherwise, we'd need to rely on filling them
with the credential protocol.
> Apologies, I feel like I may have given the impression I wanted to
> configure credentials in git's configuration files, which is not the
> case.
>
> My use case is to `git push` a tag from a CI/CD pipeline to trigger a
> release, similar to how I do it here. [3]
>
> Or is this the kind of use case you are trying to discourage?
We're trying to discourage all use of credentials in the URL at the
command line and in remote names/configuration files. If you want to
pass in credentials from the environment, the Git FAQ explains how to do
that[0], and that technique can be used in such a situation.
[0] https://git-scm.com/docs/gitfaq#http-credentials-environment
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply
* Re: [PATCH] builtin/stash: report failure to write to index
From: Rubén Justo @ 2024-02-05 22:22 UTC (permalink / raw)
To: Patrick Steinhardt, git; +Cc: moti sd
In-Reply-To: <2cd44b01dc29b099a07658499481a6847c46562d.1707116449.git.ps@pks.im>
On 05-feb-2024 08:01:04, Patrick Steinhardt wrote:
> The git-stash(1) command needs to write to the index for many of its
> operations. When the index is locked by a concurrent writer it will thus
> fail to operate, which is expected. What is not expected though is that
> we do not print any error message at all in this case. The user can thus
> easily miss the fact that the command didn't do what they expected it to
> do and would be left wondering why that is.
>
> Fix this bug and report failures to write to the index. Add tests for
> the subcommands which hit the respective code paths.
>
> Note that the chosen error message ("Cannot write to the index") does
> not match our guidelines as it starts with a capitalized letter. This is
> intentional though and matches the style of all the other messages used
> in git-stash(1).
Your message made me curious, so I ran:
$ git grep -E '(die|error)\(' builtin/stash.c
builtin/stash.c:168: die(_("'%s' is not a stash-like commit"), revision);
builtin/stash.c:214: return error(_("%s is not a valid reference"), revision);
builtin/stash.c:261: return error(_("git stash clear with arguments is "
builtin/stash.c:303: return error(_("unable to write new index file"));
builtin/stash.c:487: die("Failed to move %s to %s\n",
builtin/stash.c:523: die(_("Unable to write index."));
builtin/stash.c:544: return error(_("cannot apply a stash in the middle of a merge"));
builtin/stash.c:555: return error(_("could not generate diff %s^!."),
builtin/stash.c:562: return error(_("conflicts in index. "
builtin/stash.c:569: return error(_("could not save index tree"));
builtin/stash.c:610: ret = error(_("could not write index"));
builtin/stash.c:630: ret = error(_("could not restore untracked files from stash"));
builtin/stash.c:702: return error(_("%s: Could not drop stash entry"),
builtin/stash.c:721: return error(_("'%s' is not a stash reference"), info->revision.buf);
builtin/stash.c:872: die(_("failed to parse tree"));
builtin/stash.c:883: die(_("failed to unpack trees"));
builtin/stash.c:1547: if (report_path_error(ps_matched, ps)) {
builtin/stash.c:1763: die("subcommand wasn't specified; 'push' can't be assumed due to unexpected token '%s'",
builtin/stash.c:1773: die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--patch");
builtin/stash.c:1776: die(_("options '%s' and '%s' cannot be used together"), "--pathspec-from-file", "--staged");
builtin/stash.c:1779: die(_("'%s' and pathspec arguments cannot be used together"), "--pathspec-from-file");
builtin/stash.c:1785: die(_("the option '%s' requires '%s'"), "--pathspec-file-nul", "--pathspec-from-file");
Certainly, there are some error messages in builtin/stash.c that do not
follow the notes in Documentation/CodingGuideLines, but it does not seem
to be "the style of all other messages" in git-stash.
However, your message is clear ... What error messages are you
considering?
>
> Reported-by: moti sd <motisd8@gmail.com>
> Signed-off-by: Patrick Steinhardt <ps@pks.im>
> ---
> builtin/stash.c | 6 +++---
> t/t3903-stash.sh | 52 ++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 55 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index b2813c614c..9df072b459 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -537,7 +537,7 @@ static int do_apply_stash(const char *prefix, struct stash_info *info,
> repo_read_index_preload(the_repository, NULL, 0);
> if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0,
> NULL, NULL, NULL))
> - return -1;
> + return error(_("Cannot write to the index"));
>
> if (write_index_as_tree(&c_tree, &the_index, get_index_file(), 0,
> NULL))
> @@ -1364,7 +1364,7 @@ static int do_create_stash(const struct pathspec *ps, struct strbuf *stash_msg_b
> repo_read_index_preload(the_repository, NULL, 0);
> if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0,
> NULL, NULL, NULL) < 0) {
> - ret = -1;
> + ret = error(_("Cannot write to the index"));
> goto done;
> }
>
> @@ -1555,7 +1555,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
>
> if (repo_refresh_and_write_index(the_repository, REFRESH_QUIET, 0, 0,
> NULL, NULL, NULL)) {
> - ret = -1;
> + ret = error(_("Cannot write to the index"));
> goto done;
> }
>
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 3319240515..770881e537 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -1516,4 +1516,56 @@ test_expect_success 'restore untracked files even when we hit conflicts' '
> )
> '
>
> +test_expect_success 'stash create reports a locked index' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit A A.file &&
> + echo change >A.file &&
> + touch .git/index.lock &&
> +
> + cat >expect <<-EOF &&
> + error: Cannot write to the index
> + EOF
> + test_must_fail git stash create 2>err &&
> + test_cmp expect err
> + )
> +'
> +
> +test_expect_success 'stash push reports a locked index' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit A A.file &&
> + echo change >A.file &&
> + touch .git/index.lock &&
> +
> + cat >expect <<-EOF &&
> + error: Cannot write to the index
> + EOF
> + test_must_fail git stash push 2>err &&
> + test_cmp expect err
> + )
> +'
> +
> +test_expect_success 'stash apply reports a locked index' '
> + test_when_finished "rm -rf repo" &&
> + git init repo &&
> + (
> + cd repo &&
> + test_commit A A.file &&
> + echo change >A.file &&
> + git stash push &&
> + touch .git/index.lock &&
> +
> + cat >expect <<-EOF &&
> + error: Cannot write to the index
> + EOF
> + test_must_fail git stash apply 2>err &&
> + test_cmp expect err
> + )
> +'
> +
> test_done
> --
> 2.43.GIT
>
^ permalink raw reply
* Re: git-gui desktop launcher
From: brian m. carlson @ 2024-02-05 22:29 UTC (permalink / raw)
To: Tobias Boesch; +Cc: git
In-Reply-To: <beeab03c564e94861ab339d26c4e135b879a1ccd.camel@googlemail.com>
[-- Attachment #1: Type: text/plain, Size: 2733 bytes --]
On 2024-02-05 at 20:12:10, Tobias Boesch wrote:
> Hello everyone,
>
> quoting from downstream issue:
> https://gitlab.archlinux.org/archlinux/packaging/packages/git/-/issues/5
>
> -------------------------
>
> "As far as I can see git gui cannot easily be used by me on arch.
> A .desktop entry is missing for me.
> I created one that opens git gui.
> It also adds an entry in the "Open With..." menu of file managers (I
> tested only with Nautilus). Opeing git gui with this entry git gui is
> opened in the folder where the menu was opened.
> If it is a git repository git gui open it. If it is no git repository
> git gui opens just as if it was called from the desktop launcher.
> Since it took a while to create it and adds value for me I would like
> to share it to be added to the git package by default.
> It is far from being perfect. It's a first working version. For me
> personally it is enough.
> Before tweaking it further to fit the packaging standards I would like
> to ask if is desired to be added.
>
> .desktop file proposal
>
> [Desktop Entry]
> Name=git gui
I don't know whether this is the official name of the project or not.
Perhaps someone else can comment on what the capitalization and
punctuation of this entry should be.
> Comment=A portable graphical interface to Git
> Exec=/bin/bash -c 'if [[ "$0" = "/bin/bash" ]]; then git gui; else cd
> "$0" && git gui; fi' %F
It's not guaranteed that bash even exists on the system, let alone that
it's in /bin. For example, this wouldn't work on most of the BSDs.
This would need to be templated using SHELL_PATH and written in POSIX
sh (e.g., no `[[`).
> Icon=/usr/share/git-gui/lib/git-gui.ico
This would also need to be given an appropriate location based on the
build parameters.
> Type=Application
> Terminal=false
> Categories=Development;
>
>
> I think upstream has any interest to add this. Therefore I ask here."
>
> -------------------------
>
> The arch package maintainer proposed to try to to add this to upstream
> before just putting it into the arch package.
> Here I am asking if it could be added to git.
If you wanted to send a suitable patch for the file such that it were
appropriately built as part of the build process and installed, then we
could probably accept it. Such patches are usually created by using
`git format-patch` on one or multiple commits and then sent using `git
send-email`. You can take a look at `Documentation/SubmittingPatches`
for more details.
I think such functionality would be generally useful, and probably be
beneficial to a wide variety of distributors.
--
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
^ permalink raw reply
* Re: What's cooking in git.git (Feb 2024, #02; Fri, 2)
From: Taylor Blau @ 2024-02-05 22:39 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, SZEDER Gábor
In-Reply-To: <xmqqmssirm6t.fsf@gitster.g>
On Sat, Feb 03, 2024 at 12:23:06AM -0800, Junio C Hamano wrote:
> * tb/pair-chunk-expect (2023-11-10) 8 commits
> - midx: read `OOFF` chunk with `pair_chunk_expect()`
> - midx: read `OIDL` chunk with `pair_chunk_expect()`
> - commit-graph: read `BIDX` chunk with `pair_chunk_expect()`
> - commit-graph: read `GDAT` chunk with `pair_chunk_expect()`
> - commit-graph: read `CDAT` chunk with `pair_chunk_expect()`
> - commit-graph: read `OIDL` chunk with `pair_chunk_expect()`
> - chunk-format: introduce `pair_chunk_expect()` helper
> - Merge branch 'jk/chunk-bounds-more' into HEAD
>
> Further code clean-up.
>
> Needs review.
> source: <cover.1699569246.git.me@ttaylorr.com>
Let's drop this one. It's been sitting on my backlog for almost four
months, and I don't think it's going to be a priority for me to finish
in the near future.
After spending some time offline thinking about it, I'm not convinced
that it substantially hardens the chunk-format code anyway, so I'd
probably just as soon drop it from your queue.
>
> * tb/path-filter-fix (2024-01-31) 16 commits
> - bloom: introduce `deinit_bloom_filters()`
> - commit-graph: reuse existing Bloom filters where possible
> - object.h: fix mis-aligned flag bits table
> - commit-graph: new Bloom filter version that fixes murmur3
> - commit-graph: unconditionally load Bloom filters
> - bloom: prepare to discard incompatible Bloom filters
> - bloom: annotate filters with hash version
> - repo-settings: introduce commitgraph.changedPathsVersion
> - t4216: test changed path filters with high bit paths
> - t/helper/test-read-graph: implement `bloom-filters` mode
> - bloom.h: make `load_bloom_filter_from_graph()` public
> - t/helper/test-read-graph.c: extract `dump_graph_info()`
> - gitformat-commit-graph: describe version 2 of BDAT
> - commit-graph: ensure Bloom filters are read with consistent settings
> - revision.c: consult Bloom filters for root commits
> - t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
>
> The Bloom filter used for path limited history traversal was broken
> on systems whose "char" is unsigned; update the implementation and
> bump the format version to 2.
> source: <cover.1706741516.git.me@ttaylorr.com>
This version should be ready, but given how egregious the mistake I made
in the last round was, I'd love to SZEDER to take a look before I
recommend merging it down.
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH 1/2] t5332-multi-pack-reuse.sh: extract pack-objects helper functions
From: Taylor Blau @ 2024-02-05 22:44 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano
In-Reply-To: <ZaeCeY-9AIR-zt7u@tanuki>
On Wed, Jan 17, 2024 at 08:32:09AM +0100, Patrick Steinhardt wrote:
> > @@ -104,12 +110,7 @@ test_expect_success 'reuse objects from first pack with middle gap' '
> > ^$(git rev-parse D)
> > EOF
> >
> > - : >trace2.txt &&
> > - GIT_TRACE2_EVENT="$PWD/trace2.txt" \
> > - git pack-objects --stdout --delta-base-offset --revs <in >/dev/null &&
> > -
> > - test_pack_reused 3 <trace2.txt &&
> > - test_packs_reused 1 <trace2.txt
> > + test_pack_objects_reused 3 1 <in
>
> This conversion causes us to drop the `--delta-base-offset` flag. It
> would be great to have an explanation in the commit message why it is
> fine to drop it.
Oops, that was unintentional. I'll resend a new round that fixes this
issue shortly.
> Other than that this looks like a nice simplification to me.
Thanks for the review!
Thanks,
Taylor
^ permalink raw reply
* Re: [PATCH 2/2] pack-objects: enable multi-pack reuse via `feature.experimental`
From: Taylor Blau @ 2024-02-05 22:48 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Jeff King, Junio C Hamano
In-Reply-To: <ZaeCf1KCwAUCeBPy@tanuki>
On Wed, Jan 17, 2024 at 08:32:15AM +0100, Patrick Steinhardt wrote:
> I would like to avoid cases like this by laying out a plan for when
> experimental features become the new default. It could be as simple as
> "Let's wait N releases and then mark it stable". But having something
> and documenting such a plan in our code makes it a lot more actionable
> to promote those features to become stable eventually.
Fair point. I think that these have been mostly ad-hoc. Unfortunately,
when folks leave the project or change their attention to working on a
different feature, things that were left in feature.experimental may be
forgotten about.
When this inevitably happens, it would be nice to have a written record
(either in the repository, or here on the mailing list) so that other
folks can take up the mantle and graduate those feature(s) as
appropriate.
> I know that this is not in any way specific to your patch, but I thought
> this was a good opportunity to start this discussion. If we can agree on
> my opinion then it would be great to add a comment to the experimental
> feature to explain such an exit criterion.
I don't have a ton to add here for a graduation plan other than it would
be nice to enable it eventually, likely after a few releases without any
show-stopping bug reports.
> Other than that this patch looks good to me, thanks!
Thanks again for the review!
Thanks,
Taylor
^ permalink raw reply
* [PATCH v2 0/2] pack-objects: enable multi-pack reuse via feature.experimental
From: Taylor Blau @ 2024-02-05 22:50 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <cover.1705431816.git.me@ttaylorr.com>
Here is a small reroll of my series to teache pack-objects to perform
multi-pack reuse by way of the feature.experimental configuration.
Similar to last time, the hope is that this will expose this new feature
to more users who might not otherwise be aware of lesser-known
configuration options for pack-objects's internals.
The only changes since last time are as follows:
- Rebased forward onto 2a540e432f (The thirteenth batch, 2024-02-02).
- Re-introduced a missing `--delta-base-offset` argument into the new
helper function added and used by t5332.
Thanks in advance for your review!
Taylor Blau (2):
t5332-multi-pack-reuse.sh: extract pack-objects helper functions
pack-objects: enable multi-pack reuse via `feature.experimental`
Documentation/config/feature.txt | 3 ++
builtin/pack-objects.c | 2 +
repo-settings.c | 1 +
repository.h | 1 +
t/t5332-multi-pack-reuse.sh | 85 +++++++++++++++++---------------
5 files changed, 51 insertions(+), 41 deletions(-)
Range-diff against v1:
1: 94dd41e1af ! 1: db3d67bfa3 t5332-multi-pack-reuse.sh: extract pack-objects helper functions
@@ t/t5332-multi-pack-reuse.sh: pack_position () {
+test_pack_objects_reused_all () {
+ : >trace2.txt &&
+ GIT_TRACE2_EVENT="$PWD/trace2.txt" \
-+ git pack-objects --stdout --revs --all >/dev/null &&
++ git pack-objects --stdout --revs --all --delta-base-offset \
++ >/dev/null &&
+
+ test_pack_reused "$1" <trace2.txt &&
+ test_packs_reused "$2" <trace2.txt
2: a2d0af562a = 2: 683ffd154e pack-objects: enable multi-pack reuse via `feature.experimental`
base-commit: 2a540e432fe5dff3cfa9d3bf7ca56db2ad12ebb9
--
2.43.0.524.g683ffd154e
^ permalink raw reply
* [PATCH v2 1/2] t5332-multi-pack-reuse.sh: extract pack-objects helper functions
From: Taylor Blau @ 2024-02-05 22:50 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <cover.1707173415.git.me@ttaylorr.com>
Most of the tests in t5332 perform some setup before repeating a common
refrain that looks like:
: >trace2.txt &&
GIT_TRACE2_EVENT="$PWD/trace2.txt" \
git pack-objects --stdout --revs --all >/dev/null &&
test_pack_reused $objects_nr <trace2.txt &&
test_packs_reused $packs_nr <trace2.txt
The next commit will add more tests which repeat the above refrain.
Avoid duplicating this invocation even further and prepare for the
following commit by wrapping the above in a helper function called
`test_pack_objects_reused_all()`.
Introduce another similar function `test_pack_objects_reused`, which
expects to read a list of revisions over stdin for tests which need more
fine-grained control of the contents of the pack they generate.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
t/t5332-multi-pack-reuse.sh | 71 +++++++++++++++----------------------
1 file changed, 29 insertions(+), 42 deletions(-)
diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh
index 2ba788b042..d516062f84 100755
--- a/t/t5332-multi-pack-reuse.sh
+++ b/t/t5332-multi-pack-reuse.sh
@@ -23,6 +23,27 @@ pack_position () {
grep "$1" objects | cut -d" " -f1
}
+# test_pack_objects_reused_all <pack-reused> <packs-reused>
+test_pack_objects_reused_all () {
+ : >trace2.txt &&
+ GIT_TRACE2_EVENT="$PWD/trace2.txt" \
+ git pack-objects --stdout --revs --all --delta-base-offset \
+ >/dev/null &&
+
+ test_pack_reused "$1" <trace2.txt &&
+ test_packs_reused "$2" <trace2.txt
+}
+
+# test_pack_objects_reused <pack-reused> <packs-reused>
+test_pack_objects_reused () {
+ : >trace2.txt &&
+ GIT_TRACE2_EVENT="$PWD/trace2.txt" \
+ git pack-objects --stdout --revs >/dev/null &&
+
+ test_pack_reused "$1" <trace2.txt &&
+ test_packs_reused "$2" <trace2.txt
+}
+
test_expect_success 'preferred pack is reused for single-pack reuse' '
test_config pack.allowPackReuse single &&
@@ -34,14 +55,10 @@ test_expect_success 'preferred pack is reused for single-pack reuse' '
git multi-pack-index write --bitmap &&
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --revs --all >/dev/null &&
-
- test_pack_reused 3 <trace2.txt &&
- test_packs_reused 1 <trace2.txt
+ test_pack_objects_reused_all 3 1
'
+
test_expect_success 'enable multi-pack reuse' '
git config pack.allowPackReuse multi
'
@@ -57,21 +74,11 @@ test_expect_success 'reuse all objects from subset of bitmapped packs' '
^$(git rev-parse A)
EOF
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --revs <in >/dev/null &&
-
- test_pack_reused 6 <trace2.txt &&
- test_packs_reused 2 <trace2.txt
+ test_pack_objects_reused 6 2 <in
'
test_expect_success 'reuse all objects from all packs' '
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --revs --all >/dev/null &&
-
- test_pack_reused 9 <trace2.txt &&
- test_packs_reused 3 <trace2.txt
+ test_pack_objects_reused_all 9 3
'
test_expect_success 'reuse objects from first pack with middle gap' '
@@ -104,12 +111,7 @@ test_expect_success 'reuse objects from first pack with middle gap' '
^$(git rev-parse D)
EOF
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --delta-base-offset --revs <in >/dev/null &&
-
- test_pack_reused 3 <trace2.txt &&
- test_packs_reused 1 <trace2.txt
+ test_pack_objects_reused 3 1 <in
'
test_expect_success 'reuse objects from middle pack with middle gap' '
@@ -125,12 +127,7 @@ test_expect_success 'reuse objects from middle pack with middle gap' '
^$(git rev-parse D)
EOF
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --delta-base-offset --revs <in >/dev/null &&
-
- test_pack_reused 3 <trace2.txt &&
- test_packs_reused 1 <trace2.txt
+ test_pack_objects_reused 3 1 <in
'
test_expect_success 'omit delta with uninteresting base (same pack)' '
@@ -160,10 +157,6 @@ test_expect_success 'omit delta with uninteresting base (same pack)' '
^$base
EOF
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --delta-base-offset --revs <in >/dev/null &&
-
# We can only reuse the 3 objects corresponding to "other" from
# the latest pack.
#
@@ -175,8 +168,7 @@ test_expect_success 'omit delta with uninteresting base (same pack)' '
# The remaining objects from the other pack are similarly not
# reused because their objects are on the uninteresting side of
# the query.
- test_pack_reused 3 <trace2.txt &&
- test_packs_reused 1 <trace2.txt
+ test_pack_objects_reused 3 1 <in
'
test_expect_success 'omit delta from uninteresting base (cross pack)' '
@@ -189,15 +181,10 @@ test_expect_success 'omit delta from uninteresting base (cross pack)' '
git multi-pack-index write --bitmap --preferred-pack="pack-$P.idx" &&
- : >trace2.txt &&
- GIT_TRACE2_EVENT="$PWD/trace2.txt" \
- git pack-objects --stdout --delta-base-offset --all >/dev/null &&
-
packs_nr="$(find $packdir -type f -name "pack-*.pack" | wc -l)" &&
objects_nr="$(git rev-list --count --all --objects)" &&
- test_pack_reused $(($objects_nr - 1)) <trace2.txt &&
- test_packs_reused $packs_nr <trace2.txt
+ test_pack_objects_reused_all $(($objects_nr - 1)) $packs_nr
'
test_done
--
2.43.0.524.g683ffd154e
^ permalink raw reply related
* [PATCH v2 2/2] pack-objects: enable multi-pack reuse via `feature.experimental`
From: Taylor Blau @ 2024-02-05 22:50 UTC (permalink / raw)
To: git; +Cc: Jeff King, Junio C Hamano, Patrick Steinhardt
In-Reply-To: <cover.1707173415.git.me@ttaylorr.com>
Now that multi-pack reuse is supported, enable it via the
feature.experimental configuration in addition to the classic
`pack.allowPackReuse`.
This will allow more users to experiment with the new behavior who might
not otherwise be aware of the existing `pack.allowPackReuse`
configuration option.
The enum with values NO_PACK_REUSE, SINGLE_PACK_REUSE, and
MULTI_PACK_REUSE is defined statically in builtin/pack-objects.c's
compilation unit. We could hoist that enum into a scope visible from the
repository_settings struct, and then use that enum value in
pack-objects. Instead, define a single int that indicates what
pack-objects's default value should be to avoid additional unnecessary
code movement.
Though `feature.experimental` implies `pack.allowPackReuse=multi`, this
can still be overridden by explicitly setting the latter configuration
to either "single" or "false". Tests covering all of these cases are
showin t5332.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
Documentation/config/feature.txt | 3 +++
builtin/pack-objects.c | 2 ++
repo-settings.c | 1 +
repository.h | 1 +
t/t5332-multi-pack-reuse.sh | 16 ++++++++++++++++
5 files changed, 23 insertions(+)
diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
index bf9546fca4..f061b64b74 100644
--- a/Documentation/config/feature.txt
+++ b/Documentation/config/feature.txt
@@ -17,6 +17,9 @@ skipping more commits at a time, reducing the number of round trips.
+
* `pack.useBitmapBoundaryTraversal=true` may improve bitmap traversal times by
walking fewer objects.
++
+* `pack.allowPackReuse=multi` may improve the time it takes to create a pack by
+reusing objects from multiple packs instead of just one.
feature.manyFiles::
Enable config options that optimize for repos with many files in the
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d8c2128a97..329aeac804 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4396,6 +4396,8 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
prepare_repo_settings(the_repository);
if (sparse < 0)
sparse = the_repository->settings.pack_use_sparse;
+ if (the_repository->settings.pack_use_multi_pack_reuse)
+ allow_pack_reuse = MULTI_PACK_REUSE;
}
reset_pack_idx_option(&pack_idx_opts);
diff --git a/repo-settings.c b/repo-settings.c
index 30cd478762..a0b590bc6c 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -43,6 +43,7 @@ void prepare_repo_settings(struct repository *r)
if (experimental) {
r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
r->settings.pack_use_bitmap_boundary_traversal = 1;
+ r->settings.pack_use_multi_pack_reuse = 1;
}
if (manyfiles) {
r->settings.index_version = 4;
diff --git a/repository.h b/repository.h
index 7a250a6605..21949c5a17 100644
--- a/repository.h
+++ b/repository.h
@@ -39,6 +39,7 @@ struct repo_settings {
int sparse_index;
int pack_read_reverse_index;
int pack_use_bitmap_boundary_traversal;
+ int pack_use_multi_pack_reuse;
/*
* Does this repository have core.useReplaceRefs=true (on by
diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh
index d516062f84..b925a81d37 100755
--- a/t/t5332-multi-pack-reuse.sh
+++ b/t/t5332-multi-pack-reuse.sh
@@ -58,6 +58,22 @@ test_expect_success 'preferred pack is reused for single-pack reuse' '
test_pack_objects_reused_all 3 1
'
+test_expect_success 'multi-pack reuse is disabled by default' '
+ test_pack_objects_reused_all 3 1
+'
+
+test_expect_success 'feature.experimental implies multi-pack reuse' '
+ test_config feature.experimental true &&
+
+ test_pack_objects_reused_all 6 2
+'
+
+test_expect_success 'multi-pack reuse can be disabled with feature.experimental' '
+ test_config feature.experimental true &&
+ test_config pack.allowPackReuse single &&
+
+ test_pack_objects_reused_all 3 1
+'
test_expect_success 'enable multi-pack reuse' '
git config pack.allowPackReuse multi
--
2.43.0.524.g683ffd154e
^ permalink raw reply related
* RE: [PATCH 2/2] http: prevent redirect from dropping credentials during reauth
From: rsbecker @ 2024-02-05 22:52 UTC (permalink / raw)
To: 'brian m. carlson', 'Quentin Bouget'; +Cc: git
In-Reply-To: <ZcFeoyFXqLhMXVRh@tapette.crustytoothpaste.net>
On Monday, February 5, 2024 5:18 PM, brian m. carlson wrote:
>On 2024-02-05 at 03:01:17, Quentin Bouget wrote:
>> Good point, I had not considered the security implications.
>>
>> I can see libcurl only reuses credentials after a redirect if the
>> hostname has not changed: [1]
>>
>> By default, libcurl only sends credentials and Authentication
>> headers to the initial hostname as given in the original URL, to
>> avoid leaking username + password to other sites.
>>
>> Does it sound OK if I use the credentials provided by the redirect
>> when there are any (out of consistency with the current
>> implementation), and only allow reusing the current credentials when
>> the redirect and the original URLs share the same hostname?
>
>I don't think we can actually rely on that functionality because
>`credential.usehttppath` could actually have been set, in which case we'd need a
>different credential. For example, I know some forges issue certain types of tokens
>that are tied to a specific URL and wouldn't validate for a redirect, even if it were
>actually the same repo.
>
>If there are credentials in the URL provided by the redirect, I think it should be safe
>to use them; otherwise, we'd need to rely on filling them with the credential
>protocol.
>
>> Apologies, I feel like I may have given the impression I wanted to
>> configure credentials in git's configuration files, which is not the
>> case.
>>
>> My use case is to `git push` a tag from a CI/CD pipeline to trigger a
>> release, similar to how I do it here. [3]
>>
>> Or is this the kind of use case you are trying to discourage?
>
>We're trying to discourage all use of credentials in the URL at the command line and
>in remote names/configuration files. If you want to pass in credentials from the
>environment, the Git FAQ explains how to do that[0], and that technique can be
>used in such a situation.
>
>[0] https://git-scm.com/docs/gitfaq#http-credentials-environment
A common side-use case (not directly in git) for this situation is to attempt to use curl (or libcurl) to create a Pull Request via the GitHub (or other enterprise git server) CLI or POST. This is most often done via REST rather than supplying via the URL. It does remove the need to pass some credentials (a.k.a. the API token) via the URL as the API token gets injected into the JSON content - this may have been the original motivation as many of the servers do redirects. However, they do not reprocess or inject different credentials. I am wonder about the specific use case is for this situation and why a redirect injects a credential change, which I cannot see is a good thing.
--Randall
^ permalink raw reply
* Re: [PATCH v3 2/2] add-patch: classify '@' as a synonym for 'HEAD'
From: Junio C Hamano @ 2024-02-05 23:07 UTC (permalink / raw)
To: Phillip Wood; +Cc: Ghanshyam Thakkar, git, ps
In-Reply-To: <9f9f26f1-5460-468e-a893-5caf7fbea981@gmail.com>
Phillip Wood <phillip.wood123@gmail.com> writes:
> As well as the places you have converted we also have an explicit test
> for "HEAD" in parse_diff() which looks like
>
> if (s->revision) {
> struct object_id oid;
> strvec_push(&args,
> /* could be on an unborn branch */
> !strcmp("HEAD", s->revision) &&
> repo_get_oid(the_repository, "HEAD", &oid) ?
> empty_tree_oid_hex() : s->revision);
> }
>
> I suspect we need to update that code as well to accept "@" as a
> synonym for "HEAD" on an unborn branch.
I actually think "@" in the input should be normalized to "HEAD" as
soon as possible. After the if/elseif/ cascade in run_add_p() the
patch in this thread touches, there is an assignment
s.revision = revision;
and even though we rewrite !strcmp(revision, "HEAD") to "user means
HEAD" to additionally accept "@" in that if/elseif/ cascade, here we
will stuff different values to s.revision here. We could normalize
the end-user supplied "@" to "HEAD" before making this assignment,
then you do not have to worry about the code in parse_diff() above,
and more importantly, we do not have to rely on what the current set
of callers happen to do and do not happen to do (e.g., "git reset
-p" happens to use hardcoded "HEAD" for unborn case without using
anything obtained from the user, so the above code in parse_diff()
might be safe in that case, but we do not want to rely on such
subtlety to make sure our code is correct)
Come to think of it, we could even do the "normalizing" even before,
and that might greatly simplify things. For example, if we did so
at the very beginning of run_add_p(), before we come to that if/else
if/ cascade, we may not even have to worry about the "user meant HEAD"
helper. After the normalization, we can just keep using !strcmp()
with "HEAD" alone. Simple and clean, no?
Thanks.
^ permalink raw reply
* Re: [PATCH 2/2] sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
From: Junio C Hamano @ 2024-02-05 23:09 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Vegard Nossum, git, Jonathan Nieder, Phillip Wood
In-Reply-To: <ebe188e5-7289-4f7b-b845-d59a47cd06fe@app.fastmail.com>
"Kristoffer Haugsbakk" <code@khaugsbakk.name> writes:
> On Mon, Feb 5, 2024, at 15:13, Vegard Nossum wrote:
>> Link: https://lore.kernel.org/git/0adb1068-ef10-44ed-ad1d-e0927a09245d@gmail.com/
>> Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
>> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
>
> `Link` is not really used a lot. Junio’s `refs/notes/amlog` will point
> back to the patch (which is often close to the “suggested by” and so
> on).
Good. Also, is there [PATCH 1/2] that comes before this patch?
^ permalink raw reply
* Re: [PATCH 2/2] sequencer: unset GIT_CHERRY_PICK_HELP for 'exec' commands
From: Vegard Nossum @ 2024-02-05 23:14 UTC (permalink / raw)
To: Junio C Hamano, Kristoffer Haugsbakk; +Cc: git, Jonathan Nieder, Phillip Wood
In-Reply-To: <xmqqy1bymru0.fsf@gitster.g>
On 06/02/2024 00:09, Junio C Hamano wrote:
> "Kristoffer Haugsbakk" <code@khaugsbakk.name> writes:
>
>> On Mon, Feb 5, 2024, at 15:13, Vegard Nossum wrote:
>>> Link: https://lore.kernel.org/git/0adb1068-ef10-44ed-ad1d-e0927a09245d@gmail.com/
>>> Suggested-by: Phillip Wood <phillip.wood123@gmail.com>
>>> Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
>>
>> `Link` is not really used a lot. Junio’s `refs/notes/amlog` will point
>> back to the patch (which is often close to the “suggested by” and so
>> on).
>
> Good. Also, is there [PATCH 1/2] that comes before this patch?
Yes, kind of -- that's the testcase at the root of the thread:
https://lore.kernel.org/git/20240202091850.160203-1-vegard.nossum@oracle.com/
("t/t3515-cherry-pick-rebase.sh: new testcase demonstrating broken
behavior")
Vegard
^ permalink raw reply
* Re: What's cooking in git.git (Feb 2024, #02; Fri, 2)
From: Junio C Hamano @ 2024-02-05 23:20 UTC (permalink / raw)
To: Phillip Wood; +Cc: git
In-Reply-To: <087d3438-98a1-46fe-89d9-8e7e1662151b@gmail.com>
Phillip Wood <phillip.wood123@gmail.com> writes:
> I'm concerned that the UI could use some improvement. If I understand
> correctly the proposal is to make
>
> git for-each-ref
>
> and
>
> git for-each-ref ""
>
> behave differently so that the latter prints the pseudorefs from the
> current worktree and the former does not.
I would actually think that is perfectly sensible. The optional
arguments for-each-ref name "filtering patterns" and you can view
the behaviour of the command as using "refs/" as the default
filtering pattern when nothing is given. But it is easy to defeat
the unfortunate and historical default filtering pattern, by saying
"we do not limit to any hierarchy, anything goes" by giving "" as
the prefix.
^ permalink raw reply
* Re: [PATCH] branch: clarify <oldbranch> and <newbranch> terms further
From: Kyle Lippincott @ 2024-02-05 23:53 UTC (permalink / raw)
To: Dragan Simic; +Cc: git
In-Reply-To: <e2eb777bca8ffeec42bdd684837d28dd52cfc9c3.1707136999.git.dsimic@manjaro.org>
On Mon, Feb 5, 2024 at 4:45 AM Dragan Simic <dsimic@manjaro.org> wrote:
>
> Clarify further the uses for <oldbranch> and describe the additional use
> for <newbranch>. Mentioning both renaming and copying in these places might
> seem a bit redundant, but it should actually make understanding these terms
> easier to the readers of the git-branch(1) man page.
>
> Signed-off-by: Dragan Simic <dsimic@manjaro.org>
> ---
> Documentation/git-branch.txt | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/Documentation/git-branch.txt b/Documentation/git-branch.txt
> index 0b0844293235..7392c2f0797d 100644
> --- a/Documentation/git-branch.txt
> +++ b/Documentation/git-branch.txt
> @@ -312,12 +312,14 @@ superproject's "origin/main", but tracks the submodule's "origin/main".
> option is omitted, the current HEAD will be used instead.
>
> <oldbranch>::
> - The name of an existing branch. If this option is omitted,
> - the name of the current branch will be used instead.
> + The name of an existing branch to be renamed or copied.
> + If this option is omitted, the name of the current branch
> + will be used instead.
>
> <newbranch>::
> - The new name for an existing branch. The same restrictions as for
> - <branchname> apply.
> + The new name for an existing branch, when renaming a branch,
> + or the name for a new branch, when copying a branch. The same
> + naming restrictions apply as for <branchname>.
The precision here makes me worry that I'm potentially missing
something when reading this, and has made me re-read it multiple times
to try to figure out what it is.
I think this would be cleaner:
The name to give the branch created by the rename or copy operation.
The operation fails if <newbranch> already exists, use --force to ignore
this error. The same naming restrictions apply as for <branchname>.
I'm not super pleased with that second sentence, and maybe we
shouldn't include it here. Maybe it belongs on the documentation for
--move and --copy instead? It's sort of mentioned in the text at the
top describing the -m/-M and -c/-C options, though it's not clear from
that text what actually happens to the existing copy of <newbranch> if
one uses --force. If we could include a better description of what
happens to the existing branch when one uses --force, that'd be nice.
>
> --sort=<key>::
> Sort based on the key given. Prefix `-` to sort in descending
>
^ permalink raw reply
* Re: [PATCH v2 4/5] reftable/writer: fix writing multi-level indices
From: jltobler @ 2024-02-05 23:56 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: git, Junio C Hamano, Toon Claes, Kristoffer Haugsbakk
In-Reply-To: <89a88cf83eeb50542d3878c5c6e56e46f2bc3e73.1706773842.git.ps@pks.im>
On 24/02/01 08:52AM, Patrick Steinhardt wrote:
> When finishing a section we will potentially write an index that makes
> it more efficient to look up relevant blocks. The index records written
> will encode, for each block of the indexed section, what the offset of
> that block is as well as the last key of that block. Thus, the reader
> would iterate through the index records to find the first key larger or
> equal to the wanted key and then use the encoded offset to look up the
> desired block.
>
> When there are a lot of blocks to index though we may end up writing
> multiple index blocks, too. To not require a linear search across all
> index blocks we instead end up writing a multi-level index. Instead of
> referring to the block we are after, an index record may point to
> another index block. The reader will then access the highest-level index
> and follow down the chain of index blocks until it hits the sought-after
> block.
>
> It has been observed though that it is impossible to seek ref records of
> the last ref block when using a multi-level index. While the multi-level
> index exists and looks fine for most of the part, the highest-level
> index was missing an index record pointing to the last block of the next
> index. Thus, every additional level made more refs become unseekable at
> the end of the ref section.
Just to clarify, is only the highest-level index not recording the last
block when multi-level indexes are being used? Or are the indexes at all
levels leaving the last block unreachable?
>
> The root cause is that we are not flushing the last block of the current
> level once done writing the level. Consequently, it wasn't recorded in
> the blocks that need to be indexed by the next-higher level and thus we
> forgot about it.
>
> Fix this bug by flushing blocks after we have written all index records.
-Justin
^ permalink raw reply
* Re: [PATCH] branch: clarify <oldbranch> and <newbranch> terms further
From: Junio C Hamano @ 2024-02-06 0:09 UTC (permalink / raw)
To: Kyle Lippincott; +Cc: Dragan Simic, git
In-Reply-To: <CAO_smViHVZRObZjg0tEPXezJZb7wvs9LQdHUFJQTK4-ASCfrmw@mail.gmail.com>
Kyle Lippincott <spectral@google.com> writes:
> I'm not super pleased with that second sentence, and maybe we
> shouldn't include it here. Maybe it belongs on the documentation for
> --move and --copy instead? It's sort of mentioned in the text at the
> top describing the -m/-M and -c/-C options, though it's not clear from
> that text what actually happens to the existing copy of <newbranch> if
> one uses --force. If we could include a better description of what
> happens to the existing branch when one uses --force, that'd be nice.
My preference is to limit the "OPTIONS" section to dashed options.
If "--move" takes one or two arguments, update its description to
talk about how these one or two arguments are used, perhaps like
-m [<oldbranch>] <newbranch>::
--move [<oldbranch>] <newbranch>::
Rename an existing branch <oldbranch>, which
defaults to the current branch, to <newbranch>. The
configuration variables about and the reflog of
<oldbranch> are also renamed appropriately to be
used with <newbranch>. It is an error if <newbranch>
exists (you can use `--force` to clobber an existing
<newbranch>).
or something like that.
Listing non-options in the list may have been a misguided attempt to
"save" description on arguments that are common to multiple options,
but it is not working. We can see the bad effect of that approach
only by looking at the current description of the above option,
which reads:
-m::
--move::
Move/rename a branch, together with its config and reflog.
It does not mentioning what arguments "--move" takes, and does not
even refer the readers to the entries for <newbranch> and
<oldbranch>, so the only plausible way the users can learn what they
want about this single option is by reading the page from top to
bottom.
And trim the DESCRIPTION part. A lot. Because things are explained
redundantly between there and the OPTIONS part, and their details
are waiting to drift apart unless we are careful.
I think I laid all this out and more in a separate message.
https://lore.kernel.org/git/xmqq8r4zln08.fsf@gitster.g/
^ permalink raw reply
* Re: [PATCH v3 08/10] trailer: move arg handling to interpret-trailers.c
From: Linus Arver @ 2024-02-06 1:01 UTC (permalink / raw)
To: Junio C Hamano, Linus Arver via GitGitGadget
Cc: git, Christian Couder, Emily Shaffer, Josh Steadmon,
Randall S. Becker
In-Reply-To: <xmqqttmrx1ro.fsf@gitster.g>
Junio C Hamano <gitster@pobox.com> writes:
> "Linus Arver via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
>> diff --git a/builtin/interpret-trailers.c b/builtin/interpret-trailers.c
>> index 9f0ba39b317..9a902012912 100644
>> --- a/builtin/interpret-trailers.c
>> +++ b/builtin/interpret-trailers.c
>> @@ -45,23 +45,17 @@ static int option_parse_if_missing(const struct option *opt,
>> return trailer_set_if_missing(opt->value, arg);
>> }
>>
>> -static void free_new_trailers(struct list_head *trailers)
>> -{
>> - struct list_head *pos, *tmp;
>> - struct new_trailer_item *item;
>> -
>> - list_for_each_safe(pos, tmp, trailers) {
>> - item = list_entry(pos, struct new_trailer_item, list);
>> - list_del(pos);
>> - free(item);
>> - }
>> -}
>> +static char *cl_separators;
>
> It somehow feels ugly and goes backward to depend on a new global,
> especially when you are aiming to libify more things.
Removed in the next reroll (we can just use a regular stack variable,
not a global).
> Even if you plan to later make this function callable from outside
> the context of parse_options() callback, and if you do not want "CLI
> allows '=' as well" for such new callers, we should be able to have
> a shared helper function that is the bulk of this function but takes
> additional parameter to tweak its behaviour slightly depending on
> the needs of the callers?
I'll refrain from exploring this possibility in this series (and just
make do with the non-global variable as described above) because of the
additional pending changes around trailer configuration handling in the
larger series. FWIW even in the larger series I don't think there's a
need to have such a helper function.
>> + trailer_conf_set(where, if_exists, if_missing, conf_current);
>
> I am getting an impression that the use of and the introduction of
> the new helper, mixed with movement of the responsibility, makes
> reviewing the change unnecessarily more cumbersome. An equivalent
> series with a few more steps, each of them focusing on a small
> change (like, "updating the conf_current members is done with
> assignments that appear as a pattern very often---introduce a helper
> to reduce boilerplate") would have been more enticing to reviewers.
I've broken this patch down into smaller steps (along with the other
notable patches in this series).
>> + trailer_add_arg_item(strbuf_detach(&tok, NULL),
>> + strbuf_detach(&val, NULL),
>> + conf_current,
>> + trailers);
>> + } else {
>> + struct strbuf sb = STRBUF_INIT;
>> + strbuf_addstr(&sb, arg);
>> + strbuf_trim(&sb);
>> + error(_("empty trailer token in trailer '%.*s'"),
>> + (int) sb.len, sb.buf);
>> + strbuf_release(&sb);
>> + }
>> +
>> + free(conf_current);
>> +
>> return 0;
>> }
>>
>> @@ -135,7 +148,7 @@ static void read_input_file(struct strbuf *sb, const char *file)
>> }
>>
>> static void interpret_trailers(const struct process_trailer_options *opts,
>> - struct list_head *new_trailer_head,
>> + struct list_head *arg_trailers,
>> const char *file)
>> {
>> LIST_HEAD(head);
>> @@ -144,8 +157,6 @@ static void interpret_trailers(const struct process_trailer_options *opts,
>> struct trailer_block *trailer_block;
>> FILE *outfile = stdout;
>>
>> - trailer_config_init();
>> -
>> read_input_file(&sb, file);
>>
>> if (opts->in_place)
>> @@ -162,12 +173,7 @@ static void interpret_trailers(const struct process_trailer_options *opts,
>>
>>
>> if (!opts->only_input) {
>> - LIST_HEAD(config_head);
>> - LIST_HEAD(arg_head);
>> - parse_trailers_from_config(&config_head);
>> - parse_trailers_from_command_line_args(&arg_head, new_trailer_head);
>> - list_splice(&config_head, &arg_head);
>> - process_trailers_lists(&head, &arg_head);
>> + process_trailers_lists(&head, arg_trailers);
>> }
>
> So, the bulk of the parsing is no longer responsibility of this
> function. Instead, the caller (e.g., cmd_interpret_trailers()) is
> expected to do that before they call us.
Yup. This movement of "parsing responsibility" logic should be easy to
see in the next reroll (as it will be broken out separately).
>> ...
>> git_config(git_default_config, NULL);
>> + trailer_config_init();
>> +
>> + if (!opts.only_input) {
>> + parse_trailers_from_config(&configured_trailers);
>> + }
>
> By the way, until we add more statements in this block, lose the
> unnecessary {} around it.
Oops. Done.
>> @@ -424,6 +429,25 @@ int trailer_set_if_missing(enum trailer_if_missing *item, const char *value)
>> return 0;
>> }
>>
>> +void trailer_conf_set(enum trailer_where where,
>> + enum trailer_if_exists if_exists,
>> + enum trailer_if_missing if_missing,
>> + struct trailer_conf *conf)
>> +{
>> + if (where != WHERE_DEFAULT)
>> + conf->where = where;
>> + if (if_exists != EXISTS_DEFAULT)
>> + conf->if_exists = if_exists;
>> + if (if_missing != MISSING_DEFAULT)
>> + conf->if_missing = if_missing;
>> +}
>
> So, this is such a helper function. It is curious that it
> deliberately lacks the ability to reset what has already been
> configured back to the default.
>
> For example, we earlier saw this original code ...
>
>> - item = xmalloc(sizeof(*item));
>> - item->text = arg;
>> - item->where = where;
>> - item->if_exists = if_exists;
>> - item->if_missing = if_missing;
>
> ... gets replaced with this call.
>
>> + struct trailer_conf *conf_current = new_trailer_conf();
>> ...
>> + trailer_conf_set(where, if_exists, if_missing, conf_current);
>
> For this conversion to be correct, we require that the value of the
> *_DEFAULT macro to be 0 and/or these three values getting assigned
> are not *_DEFAULT values. Maybe that may happen to be the case in
> today's code, but such an unwritten assumption makes me feel nervous.
>
> I am not sure if it is worth the confusion factor to make a function
> whose name is $anything_set() to be making a conditional assignment.
> If such a conditional assignment *also* happens often and deserves
> to have its own helper, it is fine to have such a helper for
> conditional assignment, but calling it $anything_set() is probably a
> mistake.
Agreed. I've split this out into separate helpers that do not have
conditionals built into them.
> Other than that, the main thrust of this step seems sensible.
Ack.
Overall, for the next reroll I have the commits broken down and it is
mostly the same. I just need to wait for CI to pass and update the cover
letter, before sending it (as v4). Should be ready later tonight.
Thanks.
^ permalink raw reply
* Re: [PATCH] Always check the return value of `repo_read_object_file()`
From: Kyle Lippincott @ 2024-02-06 1:13 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin
In-Reply-To: <pull.1650.git.1707143753726.gitgitgadget@gmail.com>
On Mon, Feb 5, 2024 at 6:36 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> There are a couple of places in Git's source code where the return value
> is not checked. As a consequence, they are susceptible to segmentation
> faults.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> Always check the return value of repo_read_object_file()
>
> I ran into this today, when I had tried git am -3 to import changes from
> a repository into a different repository that has the first repository's
> code vendored in. To make this work, I set
> GIT_ALTERNATE_OBJECT_DIRECTORIES accordingly for the git am -3 call, but
> forgot to set it for a subsequent git diff call, which then segfaulted.
>
> There are still a couple of places left where there are checks but they
> look dubious to me, as they simply continue as if an empty blob had been
> read, for example in builtin/tag.c. However, there are checks that avoid
> segfaults, so I left them alone.
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1650%2Fdscho%2Fsafer-object-reads-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1650/dscho/safer-object-reads-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1650
>
> bisect.c | 3 +++
> builtin/cat-file.c | 10 ++++++++--
> builtin/grep.c | 2 ++
> builtin/notes.c | 6 ++++--
> combine-diff.c | 2 ++
> rerere.c | 3 +++
> 6 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/builtin/notes.c b/builtin/notes.c
> index e65cae0bcf7..caf20fd5bdd 100644
> --- a/builtin/notes.c
> +++ b/builtin/notes.c
> @@ -716,9 +716,11 @@ static int append_edit(int argc, const char **argv, const char *prefix)
> struct strbuf buf = STRBUF_INIT;
> char *prev_buf = repo_read_object_file(the_repository, note, &type, &size);
>
> - if (prev_buf && size)
> + if (!prev_buf)
> + die(_("unable to read %s"), oid_to_hex(note));
This changes the behavior of this function. Previously, it would not
add prev_buf output, but still succeed. This now dies.
> + if (size)
> strbuf_add(&buf, prev_buf, size);
> - if (d.buf.len && prev_buf && size)
> + if (d.buf.len && size)
> append_separator(&buf);
> strbuf_insert(&d.buf, 0, buf.buf, buf.len);
>
> diff --git a/combine-diff.c b/combine-diff.c
> index db94581f724..d6d6fa16894 100644
> --- a/combine-diff.c
> +++ b/combine-diff.c
> @@ -337,6 +337,8 @@ static char *grab_blob(struct repository *r,
> free_filespec(df);
> } else {
> blob = repo_read_object_file(r, oid, &type, size);
> + if (!blob)
> + die(_("unable to read %s"), oid_to_hex(oid));
Technically this is changing the output, but I think that's good - I
believe that previously the behavior was undefined, because `type`
wouldn't be modified if the blob didn't exist, and `type` wasn't
assigned a value earlier in the function.
> if (type != OBJ_BLOB)
> die("object '%s' is not a blob!", oid_to_hex(oid));
> }
>
> base-commit: 2a540e432fe5dff3cfa9d3bf7ca56db2ad12ebb9
> --
> gitgitgadget
>
^ permalink raw reply
* Re: [PATCH] commit.c: ensure strchrnul() doesn't scan beyond range
From: Kyle Lippincott @ 2024-02-06 1:41 UTC (permalink / raw)
To: Chandra Pratap via GitGitGadget; +Cc: git, Chandra Pratap, Chandra Pratap
In-Reply-To: <pull.1652.git.1707153705840.gitgitgadget@gmail.com>
On Mon, Feb 5, 2024 at 9:23 AM Chandra Pratap via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Chandra Pratap <chandrapratap3519@gmail.com>
>
> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com>
> ---
> commit.c: ensure strchrnul() doesn't scan beyond range
>
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1652%2FChand-ra%2Fstrchrnul-v1
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1652/Chand-ra/strchrnul-v1
> Pull-Request: https://github.com/gitgitgadget/git/pull/1652
>
> commit.c | 8 +-------
> 1 file changed, 1 insertion(+), 7 deletions(-)
>
> diff --git a/commit.c b/commit.c
> index ef679a0b939..a65b8e92e94 100644
> --- a/commit.c
> +++ b/commit.c
> @@ -1743,15 +1743,9 @@ const char *find_header_mem(const char *msg, size_t len,
> int key_len = strlen(key);
> const char *line = msg;
>
> - /*
> - * NEEDSWORK: It's possible for strchrnul() to scan beyond the range
> - * given by len. However, current callers are safe because they compute
> - * len by scanning a NUL-terminated block of memory starting at msg.
> - * Nonetheless, it would be better to ensure the function does not look
> - * at msg beyond the len provided by the caller.
> - */
> while (line && line < msg + len) {
> const char *eol = strchrnul(line, '\n');
> + assert(eol - line <= len);
I don't think this is sufficient to address the NEEDSWORK. `assert` is
only active in debug builds, and strchrnul would have already
potentially exceeded the bounds of its memory by the time this check
is happening. We'd need a safe version of strchrnul that took the
maximum length and never exceeded it.
>
> if (line == eol)
> return NULL;
>
> base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
> --
> gitgitgadget
>
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox