* [PATCH v2 2/3] advice: fix an unexpected leading space
From: Rubén Justo @ 2024-01-11 12:40 UTC (permalink / raw)
To: Git List, Junio C Hamano
In-Reply-To: <4aedc15c-4b3f-4f5e-abea-581b501600f8@gmail.com>
This space was introduced, presumably unintentionally, in b3b18d1621
(advice: revamp advise API, 2020-03-02)
I notice this space due to confuse diff outputs while doing some
changes to enum advice_type.
As a reference, a recent change we have to that enum is:
$ git show 35f0383
[ ... ]
diff --git a/advice.h b/advice.h
index 0f584163f5..2affbe1426 100644
--- a/advice.h
+++ b/advice.h
@@ -49,6 +49,7 @@ struct string_list;
ADVICE_UPDATE_SPARSE_PATH,
ADVICE_WAITING_FOR_EDITOR,
ADVICE_SKIPPED_CHERRY_PICKS,
+ ADVICE_WORKTREE_ADD_ORPHAN,
};
Note the hunk header, instead of a much more expected:
@@ -49,6 +49,7 @@ enum advice_type
Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
advice.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/advice.h b/advice.h
index 9396bcdcf1..74d44d1156 100644
--- a/advice.h
+++ b/advice.h
@@ -10,7 +10,7 @@ struct string_list;
* Add the new config variable to Documentation/config/advice.txt.
* Call advise_if_enabled to print your advice.
*/
- enum advice_type {
+enum advice_type {
ADVICE_ADD_EMBEDDED_REPO,
ADVICE_ADD_EMPTY_PATHSPEC,
ADVICE_ADD_IGNORED_FILE,
--
2.42.0
^ permalink raw reply related
* [PATCH v2 1/3] advice: sort the advice related lists
From: Rubén Justo @ 2024-01-11 12:40 UTC (permalink / raw)
To: Git List, Junio C Hamano
In-Reply-To: <4aedc15c-4b3f-4f5e-abea-581b501600f8@gmail.com>
Let's keep the advice related lists sorted to make them more
digestible.
A multi-line comment has also been changed; that produces the unexpected
'insertion != deletion' in this supposedly 'only sort lines' commit.
Signed-off-by: Rubén Justo <rjusto@gmail.com>
---
Documentation/config/advice.txt | 154 ++++++++++++++++----------------
advice.c | 13 ++-
advice.h | 12 +--
3 files changed, 88 insertions(+), 91 deletions(-)
diff --git a/Documentation/config/advice.txt b/Documentation/config/advice.txt
index 4d7e5d8759..e0deaf3144 100644
--- a/Documentation/config/advice.txt
+++ b/Documentation/config/advice.txt
@@ -4,27 +4,56 @@ advice.*::
can tell Git that you do not need help by setting these to 'false':
+
--
+ addEmbeddedRepo::
+ Advice on what to do when you've accidentally added one
+ git repo inside of another.
+ addEmptyPathspec::
+ Advice shown if a user runs the add command without providing
+ the pathspec parameter.
+ addIgnoredFile::
+ Advice shown if a user attempts to add an ignored file to
+ the index.
+ amWorkDir::
+ Advice that shows the location of the patch file when
+ linkgit:git-am[1] fails to apply it.
ambiguousFetchRefspec::
Advice shown when a fetch refspec for multiple remotes maps to
the same remote-tracking branch namespace and causes branch
tracking set-up to fail.
+ checkoutAmbiguousRemoteBranchName::
+ Advice shown when the argument to
+ linkgit:git-checkout[1] and linkgit:git-switch[1]
+ ambiguously resolves to a
+ remote tracking branch on more than one remote in
+ situations where an unambiguous argument would have
+ otherwise caused a remote-tracking branch to be
+ checked out. See the `checkout.defaultRemote`
+ configuration variable for how to set a given remote
+ to be used by default in some situations where this
+ advice would be printed.
+ commitBeforeMerge::
+ Advice shown when linkgit:git-merge[1] refuses to
+ merge to avoid overwriting local changes.
+ detachedHead::
+ Advice shown when you used
+ linkgit:git-switch[1] or linkgit:git-checkout[1]
+ to move to the detached HEAD state, to instruct how to
+ create a local branch after the fact.
+ diverging::
+ Advice shown when a fast-forward is not possible.
fetchShowForcedUpdates::
Advice shown when linkgit:git-fetch[1] takes a long time
to calculate forced updates after ref updates, or to warn
that the check is disabled.
- pushUpdateRejected::
- Set this variable to 'false' if you want to disable
- 'pushNonFFCurrent', 'pushNonFFMatching', 'pushAlreadyExists',
- 'pushFetchFirst', 'pushNeedsForce', and 'pushRefNeedsUpdate'
- simultaneously.
- pushNonFFCurrent::
- Advice shown when linkgit:git-push[1] fails due to a
- non-fast-forward update to the current branch.
- pushNonFFMatching::
- Advice shown when you ran linkgit:git-push[1] and pushed
- 'matching refs' explicitly (i.e. you used ':', or
- specified a refspec that isn't your current branch) and
- it resulted in a non-fast-forward error.
+ ignoredHook::
+ Advice shown if a hook is ignored because the hook is not
+ set as executable.
+ implicitIdentity::
+ Advice on how to set your identity configuration when
+ your information is guessed from the system username and
+ domain name.
+ nestedTag::
+ Advice shown if a user attempts to recursively tag a tag object.
pushAlreadyExists::
Shown when linkgit:git-push[1] rejects an update that
does not qualify for fast-forwarding (e.g., a tag.)
@@ -37,6 +66,18 @@ advice.*::
tries to overwrite a remote ref that points at an
object that is not a commit-ish, or make the remote
ref point at an object that is not a commit-ish.
+ pushNonFFCurrent::
+ Advice shown when linkgit:git-push[1] fails due to a
+ non-fast-forward update to the current branch.
+ pushNonFFMatching::
+ Advice shown when you ran linkgit:git-push[1] and pushed
+ 'matching refs' explicitly (i.e. you used ':', or
+ specified a refspec that isn't your current branch) and
+ it resulted in a non-fast-forward error.
+ pushRefNeedsUpdate::
+ Shown when linkgit:git-push[1] rejects a forced update of
+ a branch when its remote-tracking ref has updates that we
+ do not have locally.
pushUnqualifiedRefname::
Shown when linkgit:git-push[1] gives up trying to
guess based on the source and destination refs what
@@ -44,10 +85,23 @@ advice.*::
we can still suggest that the user push to either
refs/heads/* or refs/tags/* based on the type of the
source object.
- pushRefNeedsUpdate::
- Shown when linkgit:git-push[1] rejects a forced update of
- a branch when its remote-tracking ref has updates that we
- do not have locally.
+ pushUpdateRejected::
+ Set this variable to 'false' if you want to disable
+ 'pushNonFFCurrent', 'pushNonFFMatching', 'pushAlreadyExists',
+ 'pushFetchFirst', 'pushNeedsForce', and 'pushRefNeedsUpdate'
+ simultaneously.
+ resetNoRefresh::
+ Advice to consider using the `--no-refresh` option to
+ linkgit:git-reset[1] when the command takes more than 2 seconds
+ to refresh the index after reset.
+ resolveConflict::
+ Advice shown by various commands when conflicts
+ prevent the operation from being performed.
+ rmHints::
+ In case of failure in the output of linkgit:git-rm[1],
+ show directions on how to proceed from the current state.
+ sequencerInUse::
+ Advice shown when a sequencer command is already in progress.
skippedCherryPicks::
Shown when linkgit:git-rebase[1] skips a commit that has already
been cherry-picked onto the upstream branch.
@@ -68,76 +122,22 @@ advice.*::
Advise to consider using the `-u` option to linkgit:git-status[1]
when the command takes more than 2 seconds to enumerate untracked
files.
- commitBeforeMerge::
- Advice shown when linkgit:git-merge[1] refuses to
- merge to avoid overwriting local changes.
- resetNoRefresh::
- Advice to consider using the `--no-refresh` option to
- linkgit:git-reset[1] when the command takes more than 2 seconds
- to refresh the index after reset.
- resolveConflict::
- Advice shown by various commands when conflicts
- prevent the operation from being performed.
- sequencerInUse::
- Advice shown when a sequencer command is already in progress.
- implicitIdentity::
- Advice on how to set your identity configuration when
- your information is guessed from the system username and
- domain name.
- detachedHead::
- Advice shown when you used
- linkgit:git-switch[1] or linkgit:git-checkout[1]
- to move to the detached HEAD state, to instruct how to
- create a local branch after the fact.
- suggestDetachingHead::
- Advice shown when linkgit:git-switch[1] refuses to detach HEAD
- without the explicit `--detach` option.
- checkoutAmbiguousRemoteBranchName::
- Advice shown when the argument to
- linkgit:git-checkout[1] and linkgit:git-switch[1]
- ambiguously resolves to a
- remote tracking branch on more than one remote in
- situations where an unambiguous argument would have
- otherwise caused a remote-tracking branch to be
- checked out. See the `checkout.defaultRemote`
- configuration variable for how to set a given remote
- to be used by default in some situations where this
- advice would be printed.
- amWorkDir::
- Advice that shows the location of the patch file when
- linkgit:git-am[1] fails to apply it.
- rmHints::
- In case of failure in the output of linkgit:git-rm[1],
- show directions on how to proceed from the current state.
- addEmbeddedRepo::
- Advice on what to do when you've accidentally added one
- git repo inside of another.
- ignoredHook::
- Advice shown if a hook is ignored because the hook is not
- set as executable.
- waitingForEditor::
- Print a message to the terminal whenever Git is waiting for
- editor input from the user.
- nestedTag::
- Advice shown if a user attempts to recursively tag a tag object.
submoduleAlternateErrorStrategyDie::
Advice shown when a submodule.alternateErrorStrategy option
configured to "die" causes a fatal error.
submodulesNotUpdated::
Advice shown when a user runs a submodule command that fails
because `git submodule update --init` was not run.
- addIgnoredFile::
- Advice shown if a user attempts to add an ignored file to
- the index.
- addEmptyPathspec::
- Advice shown if a user runs the add command without providing
- the pathspec parameter.
+ suggestDetachingHead::
+ Advice shown when linkgit:git-switch[1] refuses to detach HEAD
+ without the explicit `--detach` option.
updateSparsePath::
Advice shown when either linkgit:git-add[1] or linkgit:git-rm[1]
is asked to update index entries outside the current sparse
checkout.
- diverging::
- Advice shown when a fast-forward is not possible.
+ waitingForEditor::
+ Print a message to the terminal whenever Git is waiting for
+ editor input from the user.
worktreeAddOrphan::
Advice shown when a user tries to create a worktree from an
invalid reference, to instruct how to create a new unborn
diff --git a/advice.c b/advice.c
index 50c79443ba..03322caba0 100644
--- a/advice.c
+++ b/advice.c
@@ -40,12 +40,11 @@ static struct {
[ADVICE_ADD_EMBEDDED_REPO] = { "addEmbeddedRepo", 1 },
[ADVICE_ADD_EMPTY_PATHSPEC] = { "addEmptyPathspec", 1 },
[ADVICE_ADD_IGNORED_FILE] = { "addIgnoredFile", 1 },
- [ADVICE_AM_WORK_DIR] = { "amWorkDir", 1 },
[ADVICE_AMBIGUOUS_FETCH_REFSPEC] = { "ambiguousFetchRefspec", 1 },
+ [ADVICE_AM_WORK_DIR] = { "amWorkDir", 1 },
[ADVICE_CHECKOUT_AMBIGUOUS_REMOTE_BRANCH_NAME] = { "checkoutAmbiguousRemoteBranchName", 1 },
[ADVICE_COMMIT_BEFORE_MERGE] = { "commitBeforeMerge", 1 },
[ADVICE_DETACHED_HEAD] = { "detachedHead", 1 },
- [ADVICE_SUGGEST_DETACHING_HEAD] = { "suggestDetachingHead", 1 },
[ADVICE_DIVERGING] = { "diverging", 1 },
[ADVICE_FETCH_SHOW_FORCED_UPDATES] = { "fetchShowForcedUpdates", 1 },
[ADVICE_GRAFT_FILE_DEPRECATED] = { "graftFileDeprecated", 1 },
@@ -56,15 +55,12 @@ static struct {
[ADVICE_PUSH_ALREADY_EXISTS] = { "pushAlreadyExists", 1 },
[ADVICE_PUSH_FETCH_FIRST] = { "pushFetchFirst", 1 },
[ADVICE_PUSH_NEEDS_FORCE] = { "pushNeedsForce", 1 },
- [ADVICE_PUSH_REF_NEEDS_UPDATE] = { "pushRefNeedsUpdate", 1 },
-
- /* make this an alias for backward compatibility */
- [ADVICE_PUSH_UPDATE_REJECTED_ALIAS] = { "pushNonFastForward", 1 },
-
[ADVICE_PUSH_NON_FF_CURRENT] = { "pushNonFFCurrent", 1 },
[ADVICE_PUSH_NON_FF_MATCHING] = { "pushNonFFMatching", 1 },
+ [ADVICE_PUSH_REF_NEEDS_UPDATE] = { "pushRefNeedsUpdate", 1 },
[ADVICE_PUSH_UNQUALIFIED_REF_NAME] = { "pushUnqualifiedRefName", 1 },
[ADVICE_PUSH_UPDATE_REJECTED] = { "pushUpdateRejected", 1 },
+ [ADVICE_PUSH_UPDATE_REJECTED_ALIAS] = { "pushNonFastForward", 1 }, /* backwards compatibility */
[ADVICE_RESET_NO_REFRESH_WARNING] = { "resetNoRefresh", 1 },
[ADVICE_RESOLVE_CONFLICT] = { "resolveConflict", 1 },
[ADVICE_RM_HINTS] = { "rmHints", 1 },
@@ -74,8 +70,9 @@ static struct {
[ADVICE_STATUS_AHEAD_BEHIND_WARNING] = { "statusAheadBehindWarning", 1 },
[ADVICE_STATUS_HINTS] = { "statusHints", 1 },
[ADVICE_STATUS_U_OPTION] = { "statusUoption", 1 },
- [ADVICE_SUBMODULE_ALTERNATE_ERROR_STRATEGY_DIE] = { "submoduleAlternateErrorStrategyDie", 1 },
[ADVICE_SUBMODULES_NOT_UPDATED] = { "submodulesNotUpdated", 1 },
+ [ADVICE_SUBMODULE_ALTERNATE_ERROR_STRATEGY_DIE] = { "submoduleAlternateErrorStrategyDie", 1 },
+ [ADVICE_SUGGEST_DETACHING_HEAD] = { "suggestDetachingHead", 1 },
[ADVICE_UPDATE_SPARSE_PATH] = { "updateSparsePath", 1 },
[ADVICE_WAITING_FOR_EDITOR] = { "waitingForEditor", 1 },
[ADVICE_WORKTREE_ADD_ORPHAN] = { "worktreeAddOrphan", 1 },
diff --git a/advice.h b/advice.h
index 2affbe1426..9396bcdcf1 100644
--- a/advice.h
+++ b/advice.h
@@ -14,13 +14,12 @@ struct string_list;
ADVICE_ADD_EMBEDDED_REPO,
ADVICE_ADD_EMPTY_PATHSPEC,
ADVICE_ADD_IGNORED_FILE,
- ADVICE_AM_WORK_DIR,
ADVICE_AMBIGUOUS_FETCH_REFSPEC,
+ ADVICE_AM_WORK_DIR,
ADVICE_CHECKOUT_AMBIGUOUS_REMOTE_BRANCH_NAME,
ADVICE_COMMIT_BEFORE_MERGE,
ADVICE_DETACHED_HEAD,
ADVICE_DIVERGING,
- ADVICE_SUGGEST_DETACHING_HEAD,
ADVICE_FETCH_SHOW_FORCED_UPDATES,
ADVICE_GRAFT_FILE_DEPRECATED,
ADVICE_IGNORED_HOOK,
@@ -32,23 +31,24 @@ struct string_list;
ADVICE_PUSH_NEEDS_FORCE,
ADVICE_PUSH_NON_FF_CURRENT,
ADVICE_PUSH_NON_FF_MATCHING,
+ ADVICE_PUSH_REF_NEEDS_UPDATE,
ADVICE_PUSH_UNQUALIFIED_REF_NAME,
- ADVICE_PUSH_UPDATE_REJECTED_ALIAS,
ADVICE_PUSH_UPDATE_REJECTED,
- ADVICE_PUSH_REF_NEEDS_UPDATE,
+ ADVICE_PUSH_UPDATE_REJECTED_ALIAS,
ADVICE_RESET_NO_REFRESH_WARNING,
ADVICE_RESOLVE_CONFLICT,
ADVICE_RM_HINTS,
ADVICE_SEQUENCER_IN_USE,
ADVICE_SET_UPSTREAM_FAILURE,
+ ADVICE_SKIPPED_CHERRY_PICKS,
ADVICE_STATUS_AHEAD_BEHIND_WARNING,
ADVICE_STATUS_HINTS,
ADVICE_STATUS_U_OPTION,
- ADVICE_SUBMODULE_ALTERNATE_ERROR_STRATEGY_DIE,
ADVICE_SUBMODULES_NOT_UPDATED,
+ ADVICE_SUBMODULE_ALTERNATE_ERROR_STRATEGY_DIE,
+ ADVICE_SUGGEST_DETACHING_HEAD,
ADVICE_UPDATE_SPARSE_PATH,
ADVICE_WAITING_FOR_EDITOR,
- ADVICE_SKIPPED_CHERRY_PICKS,
ADVICE_WORKTREE_ADD_ORPHAN,
};
--
2.42.0
^ permalink raw reply related
* [PATCH v2 0/3] branch: error description when deleting a not fully merged branch
From: Rubén Justo @ 2024-01-11 12:20 UTC (permalink / raw)
To: Git List, Junio C Hamano
In-Reply-To: <04c3556f-0242-4ac3-b94a-be824cd2004a@gmail.com>
The error message we show when the user tries to delete a not fully
merged branch describes the error and gives a hint to the user:
error: the branch 'foo' is not fully merged.
If you are sure you want to delete it, run 'git branch -D foo'.
Let's move the hint part so that it is displayed using the advice
machinery:
error: the branch 'foo' is not fully merged
hint: If you are sure you want to delete it, run 'git branch -D foo'
hint: Disable this message with "git config advice.forceDeleteBranch false"
Rubén Justo (3):
advice: sort the advice related lists
advice: fix an unexpected leading space
branch: make the advice to force-deleting a conditional one
Documentation/config/advice.txt | 157 ++++++++++++++++----------------
advice.c | 14 ++-
advice.h | 15 +--
builtin/branch.c | 8 +-
4 files changed, 99 insertions(+), 95 deletions(-)
--
2.42.0
^ permalink raw reply
* Re: [DISCUSS] Introducing Rust into the Git project
From: Sam James @ 2024-01-11 11:45 UTC (permalink / raw)
To: me; +Cc: git
In-Reply-To: <ZZ77NQkSuiRxRDwt@nand.local>
Something I'm a bit concerned about is that right now, neither
rustc_codegen_gcc nor gccrs are ready for use here.
We've had trouble getting things wired up for rustc_codegen_gcc
- which is not to speak against their wonderful efforts - because
the Rust community hasn't yet figured out how to handle things which
pure rustc supports yet. See
e.g. https://github.com/rust-lang/libc/pull/3032.
I think care should be taken in citing rustc_codegen_gcc and gccrs
as options for alternative platforms for now. They will hopefully
be great options in the future, but they aren't today, and they probably
won't be in the next 6 months at the least.
We also do use git heavily on platforms which rustc isn't supported
yet.
thanks,
sam
^ permalink raw reply
* Re: [PATCH 1/2] t1401: generalize reference locking
From: Patrick Steinhardt @ 2024-01-11 11:08 UTC (permalink / raw)
To: Jeff King; +Cc: Justin Tobler via GitGitGadget, git, Justin Tobler
In-Reply-To: <20240111071329.GC48154@coredump.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 2054 bytes --]
On Thu, Jan 11, 2024 at 02:13:29AM -0500, Jeff King wrote:
> On Wed, Jan 10, 2024 at 06:52:29PM +0000, Justin Tobler via GitGitGadget wrote:
>
> > From: Justin Tobler <jltobler@gmail.com>
> >
> > Some tests set up reference locks by directly creating the lockfile.
> > While this works for the files reference backend, reftable reference
> > locks operate differently and are incompatible with this approach.
> > Refactor the test to use git-update-ref(1) to lock refs instead so that
> > the test does not need to be aware of how the ref backend locks refs.
>
> It looks like you re-create this situation in a backend-agnostic way by
> having two simultaneous updates that conflict on the lock (but don't
> care how that lock is implemented).
>
> That works, but I think we could keep it simple. This test doesn't care
> about the exact error condition we create. The point was just to die in
> create_symref() and make sure the exit code was propagated. So something
> like this would work:
>
> $ git symbolic-ref refs/heads refs/heads/foo
> error: unable to write symref for refs/heads: Is a directory
>
> (note that you get a different error message if the refs are packed,
> since there we can notice the d/f conflict manually).
If all we care for is the exit code then this would work for the
reftable backend, too:
```
$ git init --ref-format=reftable repo
Initialized empty Git repository in /tmp/repo/.git/
$ cd repo/
$ git commit --allow-empty --message message
[main (root-commit) c2512d3] x
$ git symbolic-ref refs/heads refs/heads/foo
$ echo $?
1
```
A bit unfortunate that there is no proper error message in that case,
but that is a different topic.
Patrick
> There may be other ways to stimulate a failure. I thought "symbolic-ref
> HEAD refs/heads/.invalid" might work, but sadly the refname format check
> happens earlier.
>
> I think it is worth avoiding the fifo magic if we can. It's complicated,
> and it means that not all platforms run the test.
>
> -Peff
>
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* [PATCH 2/2] completion: silence pseudoref existence check
From: Patrick Steinhardt @ 2024-01-11 10:41 UTC (permalink / raw)
To: git; +Cc: Stan Hu
In-Reply-To: <cover.1704969119.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 2092 bytes --]
In 44dbb3bf29 (completion: support pseudoref existence checks for
reftables, 2023-12-19), we have extended the Bash completion script to
support future ref backends better by using git-rev-parse(1) to check
for pseudo-ref existence. This conversion has introduced a bug, because
even though we pass `--quiet` to git-rev-parse(1) it would still output
the resolved object ID of the ref in question if it exists.
Fix this by redirecting its stdout to `/dev/null` and add a test that
catches this behaviour. Note that the test passes even without the fix
for the "files" backend because we parse pseudo refs via the filesystem
directly in that case. But the test will fail with the "reftable"
backend.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
contrib/completion/git-completion.bash | 2 +-
t/t9902-completion.sh | 8 ++++++++
2 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 8c40ade494..8872192e2b 100644
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -146,7 +146,7 @@ __git_pseudoref_exists ()
if __git_eread "$__git_repo_path/HEAD" head; then
b="${head#ref: }"
if [ "$b" == "refs/heads/.invalid" ]; then
- __git -C "$__git_repo_path" rev-parse --verify --quiet "$ref" 2>/dev/null
+ __git -C "$__git_repo_path" rev-parse --verify --quiet "$ref" >/dev/null 2>&1
return $?
fi
fi
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index 78cb93bea7..b14ae4de14 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -1927,6 +1927,14 @@ test_expect_success 'git checkout - --orphan with branch already provided comple
EOF
'
+test_expect_success 'git reset completes modified files' '
+ test_commit A a.file &&
+ echo B >a.file &&
+ test_completion "git restore a." <<-\EOF
+ a.file
+ EOF
+'
+
test_expect_success 'teardown after ref completion' '
git branch -d matching-branch &&
git tag -d matching-tag &&
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH 1/2] t9902: verify that completion does not print anything
From: Patrick Steinhardt @ 2024-01-11 10:41 UTC (permalink / raw)
To: git; +Cc: Stan Hu
In-Reply-To: <cover.1704969119.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 908 bytes --]
The Bash completion script must not print anything to either stdout or
stderr. Instead, it is only expected to populate certain variables.
Tighten our `test_completion ()` test helper to verify this requirement.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
t/t9902-completion.sh | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/t/t9902-completion.sh b/t/t9902-completion.sh
index aa9a614de3..78cb93bea7 100755
--- a/t/t9902-completion.sh
+++ b/t/t9902-completion.sh
@@ -87,9 +87,11 @@ test_completion ()
else
sed -e 's/Z$//' |sort >expected
fi &&
- run_completion "$1" &&
+ run_completion "$1" >"$TRASH_DIRECTORY"/output 2>&1 &&
sort out >out_sorted &&
- test_cmp expected out_sorted
+ test_cmp expected out_sorted &&
+ test_must_be_empty "$TRASH_DIRECTORY"/output &&
+ rm "$TRASH_DIRECTORY"/output
}
# Test __gitcomp.
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH 0/2] completion: silence pseudo-ref existence check
From: Patrick Steinhardt @ 2024-01-11 10:41 UTC (permalink / raw)
To: git; +Cc: Stan Hu
[-- Attachment #1: Type: text/plain, Size: 536 bytes --]
Hi,
I recently noticed that the Bash completion script started to output
object IDs in repositories which use the "reftable" backend when
completing certain commands. This patch series fixes this issue.
Patrick
Patrick Steinhardt (2):
t9902: verify that completion does not print anything
completion: silence pseudoref existence check
contrib/completion/git-completion.bash | 2 +-
t/t9902-completion.sh | 14 ++++++++++++--
2 files changed, 13 insertions(+), 3 deletions(-)
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* [PATCH v2 5/5] reftable/blocksource: use mmap to read tables
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704966670.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 6238 bytes --]
The blocksource interface provides an interface to read blocks from a
reftable table. This interface is implemented using read(3P) calls on
the underlying file descriptor. While this works alright, this pattern
is very inefficient when repeatedly querying the reftable stack for one
or more refs. This inefficiency can mostly be attributed to the fact
that we often need to re-read the same blocks over and over again, and
every single time we need to call read(3P) again.
A natural fit in this context is to use mmap(3P) instead of read(3P),
which has a bunch of benefits:
- We do not need to come up with a caching strategy for some of the
blocks as this will be handled by the kernel already.
- We can avoid the overhead of having to call into the read(3P)
syscall repeatedly.
- We do not need to allocate returned blocks repeatedly, but can
instead hand out pointers into the mmapped region directly.
Using mmap comes with a significant drawback on Windows though, because
mmapped files cannot be deleted and neither is it possible to rename
files onto an mmapped file. But for one, the reftable library gracefully
handles the case where auto-compaction cannot delete a still-open stack
already and ignores any such errors. Also, `reftable_stack_clean()` will
prune stale tables which are not referenced by "tables.list" anymore so
that those files can eventually be pruned. And second, we never rewrite
already-written stacks, so it does not matter that we cannot rename a
file over an mmaped file, either.
Another unfortunate property of mmap is that it is not supported by all
systems. But given that the size of reftables should typically be rather
limited (megabytes at most in the vast majority of repositories), we can
use the fallback implementation provided by `git_mmap()` which reads the
whole file into memory instead. This is the same strategy that the
"packed" backend uses.
While this change doesn't significantly improve performance in the case
where we're seeking through stacks once (like e.g. git-for-each-ref(1)
would). But it does speed up usecases where there is lots of random
access to refs, e.g. when writing. The following benchmark demonstrates
these savings with git-update-ref(1) creating N refs in an otherwise
empty repository:
Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.5 ms, System: 2.5 ms]
Range (min … max): 4.8 ms … 7.1 ms 111 runs
Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
Time (mean ± σ): 14.8 ms ± 0.5 ms [User: 7.1 ms, System: 7.5 ms]
Range (min … max): 14.1 ms … 18.7 ms 84 runs
Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
Time (mean ± σ): 926.4 ms ± 5.6 ms [User: 448.5 ms, System: 477.7 ms]
Range (min … max): 920.0 ms … 936.1 ms 10 runs
Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
Time (mean ± σ): 5.0 ms ± 0.2 ms [User: 2.4 ms, System: 2.5 ms]
Range (min … max): 4.7 ms … 5.4 ms 111 runs
Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
Time (mean ± σ): 10.5 ms ± 0.2 ms [User: 5.0 ms, System: 5.3 ms]
Range (min … max): 10.0 ms … 10.9 ms 93 runs
Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
Time (mean ± σ): 529.6 ms ± 9.1 ms [User: 268.0 ms, System: 261.4 ms]
Range (min … max): 522.4 ms … 547.1 ms 10 runs
Summary
update-ref: create many refs (refcount = 1, revision = HEAD) ran
1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
2.08 ± 0.07 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
2.95 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
105.33 ± 3.76 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
184.24 ± 5.89 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
Theoretically, we could also replicate the strategy of the "packed"
backend where small tables are read into memory instead of using mmap.
Benchmarks did not confirm that this has a performance benefit though.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/blocksource.c | 22 +++++++---------------
1 file changed, 7 insertions(+), 15 deletions(-)
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index 1e2fb5e9fd..8c41e3c70f 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -76,8 +76,8 @@ struct reftable_block_source malloc_block_source(void)
}
struct file_block_source {
- int fd;
uint64_t size;
+ unsigned char *data;
};
static uint64_t file_size(void *b)
@@ -87,19 +87,12 @@ static uint64_t file_size(void *b)
static void file_return_block(void *b, struct reftable_block *dest)
{
- if (dest->len)
- memset(dest->data, 0xff, dest->len);
- reftable_free(dest->data);
}
-static void file_close(void *b)
+static void file_close(void *v)
{
- int fd = ((struct file_block_source *)b)->fd;
- if (fd > 0) {
- close(fd);
- ((struct file_block_source *)b)->fd = 0;
- }
-
+ struct file_block_source *b = v;
+ munmap(b->data, b->size);
reftable_free(b);
}
@@ -108,9 +101,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
{
struct file_block_source *b = v;
assert(off + size <= b->size);
- dest->data = reftable_malloc(size);
- if (pread_in_full(b->fd, dest->data, size, off) != size)
- return -1;
+ dest->data = b->data + off;
dest->len = size;
return size;
}
@@ -143,7 +134,8 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
p = reftable_calloc(sizeof(*p));
p->size = st.st_size;
- p->fd = fd;
+ p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
+ close(fd);
assert(!bs->ops);
bs->ops = &file_vtable;
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH v2 4/5] reftable/blocksource: refactor code to match our coding style
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704966670.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 1206 bytes --]
Refactor `reftable_block_source_from_file()` to match our coding style
better.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/blocksource.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/reftable/blocksource.c b/reftable/blocksource.c
index a1ea304429..1e2fb5e9fd 100644
--- a/reftable/blocksource.c
+++ b/reftable/blocksource.c
@@ -125,24 +125,23 @@ static struct reftable_block_source_vtable file_vtable = {
int reftable_block_source_from_file(struct reftable_block_source *bs,
const char *name)
{
- struct stat st = { 0 };
- int err = 0;
- int fd = open(name, O_RDONLY);
- struct file_block_source *p = NULL;
+ struct file_block_source *p;
+ struct stat st;
+ int fd;
+
+ fd = open(name, O_RDONLY);
if (fd < 0) {
- if (errno == ENOENT) {
+ if (errno == ENOENT)
return REFTABLE_NOT_EXIST_ERROR;
- }
return -1;
}
- err = fstat(fd, &st);
- if (err < 0) {
+ if (fstat(fd, &st) < 0) {
close(fd);
return REFTABLE_IO_ERROR;
}
- p = reftable_calloc(sizeof(struct file_block_source));
+ p = reftable_calloc(sizeof(*p));
p->size = st.st_size;
p->fd = fd;
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH v2 3/5] reftable/stack: use stat info to avoid re-reading stack list
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704966670.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 5676 bytes --]
Whenever we call into the refs interfaces we potentially have to reload
refs in case they have been concurrently modified, either in-process or
externally. While this happens somewhat automatically for loose refs
because we simply try to re-read the files, the "packed" backend will
reload its snapshot of the packed-refs file in case its stat info has
changed since last reading it.
In the reftable backend we have a similar mechanism that is provided by
`reftable_stack_reload()`. This function will read the list of stacks
from "tables.list" and, if they have changed from the currently stored
list, reload the stacks. This is heavily inefficient though, as we have
to check whether the stack is up-to-date on basically every read and
thus keep on re-reading the file all the time even if it didn't change
at all.
We can do better and use the same stat(3P)-based mechanism that the
"packed" backend uses. Instead of reading the file, we will only open
the file descriptor, fstat(3P) it, and then compare the info against the
cached value from the last time we have updated the stack. This should
always work alright because "tables.list" is updated atomically via a
rename, so even if the ctime or mtime wasn't granular enough to identify
a change, at least the inode number or file size should have changed.
This change significantly speeds up operations where many refs are read,
like when using git-update-ref(1). The following benchmark creates N
refs in an otherwise-empty repository via `git update-ref --stdin`:
Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.4 ms, System: 2.6 ms]
Range (min … max): 4.8 ms … 7.2 ms 109 runs
Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
Time (mean ± σ): 19.1 ms ± 0.9 ms [User: 8.9 ms, System: 9.9 ms]
Range (min … max): 18.4 ms … 26.7 ms 72 runs
Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
Time (mean ± σ): 1.336 s ± 0.018 s [User: 0.590 s, System: 0.724 s]
Range (min … max): 1.314 s … 1.373 s 10 runs
Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.4 ms, System: 2.6 ms]
Range (min … max): 4.8 ms … 7.2 ms 109 runs
Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
Time (mean ± σ): 14.8 ms ± 0.2 ms [User: 7.1 ms, System: 7.5 ms]
Range (min … max): 14.2 ms … 15.2 ms 82 runs
Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
Time (mean ± σ): 927.6 ms ± 5.3 ms [User: 437.8 ms, System: 489.5 ms]
Range (min … max): 919.4 ms … 936.4 ms 10 runs
Summary
update-ref: create many refs (refcount = 1, revision = HEAD) ran
1.00 ± 0.07 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
2.89 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
3.74 ± 0.25 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
181.26 ± 8.30 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
261.01 ± 12.35 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/stack.c | 12 +++++++++++-
reftable/stack.h | 1 +
reftable/system.h | 1 +
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/reftable/stack.c b/reftable/stack.c
index b1ee247601..c28d82299d 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -175,6 +175,7 @@ void reftable_stack_destroy(struct reftable_stack *st)
st->readers_len = 0;
FREE_AND_NULL(st->readers);
}
+ stat_validity_clear(&st->list_validity);
FREE_AND_NULL(st->list_file);
FREE_AND_NULL(st->reftable_dir);
reftable_free(st);
@@ -374,7 +375,11 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
sleep_millisec(delay);
}
+ stat_validity_update(&st->list_validity, fd);
+
out:
+ if (err)
+ stat_validity_clear(&st->list_validity);
if (fd >= 0)
close(fd);
free_names(names);
@@ -388,8 +393,13 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
static int stack_uptodate(struct reftable_stack *st)
{
char **names = NULL;
- int err = read_lines(st->list_file, &names);
+ int err;
int i = 0;
+
+ if (stat_validity_check(&st->list_validity, st->list_file))
+ return 0;
+
+ err = read_lines(st->list_file, &names);
if (err < 0)
return err;
diff --git a/reftable/stack.h b/reftable/stack.h
index f57005846e..3f80cc598a 100644
--- a/reftable/stack.h
+++ b/reftable/stack.h
@@ -14,6 +14,7 @@ license that can be found in the LICENSE file or at
#include "reftable-stack.h"
struct reftable_stack {
+ struct stat_validity list_validity;
char *list_file;
char *reftable_dir;
int disable_auto_compact;
diff --git a/reftable/system.h b/reftable/system.h
index 6b74a81514..2cc7adf271 100644
--- a/reftable/system.h
+++ b/reftable/system.h
@@ -12,6 +12,7 @@ license that can be found in the LICENSE file or at
/* This header glues the reftable library to the rest of Git */
#include "git-compat-util.h"
+#include "statinfo.h"
#include "strbuf.h"
#include "hash-ll.h" /* hash ID, sizes.*/
#include "dir.h" /* remove_dir_recursively, for tests.*/
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH v2 2/5] reftable/stack: refactor reloading to use file descriptor
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704966670.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 1900 bytes --]
We're about to introduce a stat(3P)-based caching mechanism to reload
the list of stacks only when it has changed. In order to avoid race
conditions this requires us to have a file descriptor available that we
can use to call fstat(3P) on.
Prepare for this by converting the code to use `fd_read_lines()` so that
we have the file descriptor readily available.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/stack.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/reftable/stack.c b/reftable/stack.c
index bf869a6772..b1ee247601 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -308,6 +308,7 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
struct timeval deadline;
int64_t delay = 0;
int tries = 0, err;
+ int fd = -1;
err = gettimeofday(&deadline, NULL);
if (err < 0)
@@ -329,9 +330,19 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
goto out;
- err = read_lines(st->list_file, &names);
- if (err < 0)
- goto out;
+ fd = open(st->list_file, O_RDONLY);
+ if (fd < 0) {
+ if (errno != ENOENT) {
+ err = REFTABLE_IO_ERROR;
+ goto out;
+ }
+
+ names = reftable_calloc(sizeof(char *));
+ } else {
+ err = fd_read_lines(fd, &names);
+ if (err < 0)
+ goto out;
+ }
err = reftable_stack_reload_once(st, names, reuse_open);
if (!err)
@@ -356,12 +367,16 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
names = NULL;
free_names(names_after);
names_after = NULL;
+ close(fd);
+ fd = -1;
delay = delay + (delay * rand()) / RAND_MAX + 1;
sleep_millisec(delay);
}
out:
+ if (fd >= 0)
+ close(fd);
free_names(names);
free_names(names_after);
return err;
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH v2 1/5] reftable/stack: refactor stack reloading to have common exit path
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704966670.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 3371 bytes --]
The `reftable_stack_reload_maybe_reuse()` function is responsible for
reloading the reftable list from disk. The function is quite hard to
follow though because it has a bunch of different exit paths, many of
which have to free the same set of resources.
Refactor the function to have a common exit path. While at it, touch up
the style of this function a bit to match our usual coding style better.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
1 file changed, 42 insertions(+), 44 deletions(-)
diff --git a/reftable/stack.c b/reftable/stack.c
index 16bab82063..bf869a6772 100644
--- a/reftable/stack.c
+++ b/reftable/stack.c
@@ -304,69 +304,67 @@ static int tv_cmp(struct timeval *a, struct timeval *b)
static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
int reuse_open)
{
- struct timeval deadline = { 0 };
- int err = gettimeofday(&deadline, NULL);
+ char **names = NULL, **names_after = NULL;
+ struct timeval deadline;
int64_t delay = 0;
- int tries = 0;
- if (err < 0)
- return err;
+ int tries = 0, err;
+ err = gettimeofday(&deadline, NULL);
+ if (err < 0)
+ goto out;
deadline.tv_sec += 3;
+
while (1) {
- char **names = NULL;
- char **names_after = NULL;
- struct timeval now = { 0 };
- int err = gettimeofday(&now, NULL);
- int err2 = 0;
- if (err < 0) {
- return err;
- }
+ struct timeval now;
- /* Only look at deadlines after the first few times. This
- simplifies debugging in GDB */
+ err = gettimeofday(&now, NULL);
+ if (err < 0)
+ goto out;
+
+ /*
+ * Only look at deadlines after the first few times. This
+ * simplifies debugging in GDB.
+ */
tries++;
- if (tries > 3 && tv_cmp(&now, &deadline) >= 0) {
- break;
- }
+ if (tries > 3 && tv_cmp(&now, &deadline) >= 0)
+ goto out;
err = read_lines(st->list_file, &names);
- if (err < 0) {
- free_names(names);
- return err;
- }
+ if (err < 0)
+ goto out;
+
err = reftable_stack_reload_once(st, names, reuse_open);
- if (err == 0) {
- free_names(names);
+ if (!err)
break;
- }
- if (err != REFTABLE_NOT_EXIST_ERROR) {
- free_names(names);
- return err;
- }
-
- /* err == REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
- writer. Check if there was one by checking if the name list
- changed.
- */
- err2 = read_lines(st->list_file, &names_after);
- if (err2 < 0) {
- free_names(names);
- return err2;
- }
-
+ if (err != REFTABLE_NOT_EXIST_ERROR)
+ goto out;
+
+ /*
+ * REFTABLE_NOT_EXIST_ERROR can be caused by a concurrent
+ * writer. Check if there was one by checking if the name list
+ * changed.
+ */
+ err = read_lines(st->list_file, &names_after);
+ if (err < 0)
+ goto out;
if (names_equal(names_after, names)) {
- free_names(names);
- free_names(names_after);
- return err;
+ err = REFTABLE_NOT_EXIST_ERROR;
+ goto out;
}
+
free_names(names);
+ names = NULL;
free_names(names_after);
+ names_after = NULL;
delay = delay + (delay * rand()) / RAND_MAX + 1;
sleep_millisec(delay);
}
- return 0;
+out:
+ free_names(names);
+ free_names(names_after);
+ return err;
}
/* -1 = error
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply related
* [PATCH v2 0/5] reftable: optimize I/O patterns
From: Patrick Steinhardt @ 2024-01-11 10:06 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Junio C Hamano
In-Reply-To: <cover.1704714575.git.ps@pks.im>
[-- Attachment #1: Type: text/plain, Size: 15504 bytes --]
Hi,
this is the second version of my patch series that optimizes some I/O
patterns that the reftable library uses. Changes compared to v1:
- Added missing signoffs.
- The validity cache is now cleared when reloading the reftable stack
fails.
- Style fixes for `reftable_block_source_from_file()` are now in a
separate commit.
- We now use `xmmap()` exclusively, relying on its fallback code on
NO_MMAP platforms.
- The file descriptor is closed immediately after mmapping now.
Thanks for your review, Junio!
Patrick
Patrick Steinhardt (5):
reftable/stack: refactor stack reloading to have common exit path
reftable/stack: refactor reloading to use file descriptor
reftable/stack: use stat info to avoid re-reading stack list
reftable/blocksource: refactor code to match our coding style
reftable/blocksource: use mmap to read tables
reftable/blocksource.c | 39 ++++++--------
reftable/stack.c | 113 +++++++++++++++++++++++++----------------
reftable/stack.h | 1 +
reftable/system.h | 1 +
4 files changed, 85 insertions(+), 69 deletions(-)
Range-diff against v1:
1: 01ece2626d ! 1: 4b7f52c415 reftable/stack: refactor stack reloading to have common exit path
@@ Commit message
Refactor the function to have a common exit path. While at it, touch up
the style of this function a bit to match our usual coding style better.
+ Signed-off-by: Patrick Steinhardt <ps@pks.im>
+
## reftable/stack.c ##
@@ reftable/stack.c: static int tv_cmp(struct timeval *a, struct timeval *b)
static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
2: 726d302d7b ! 2: 36b9f6b624 reftable/stack: refactor reloading to use file descriptor
@@ Commit message
Prepare for this by converting the code to use `fd_read_lines()` so that
we have the file descriptor readily available.
+ Signed-off-by: Patrick Steinhardt <ps@pks.im>
+
## reftable/stack.c ##
@@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
struct timeval deadline;
3: 4fabdc3d80 ! 3: c0f7cec95b reftable/stack: use stat info to avoid re-reading stack list
@@ Commit message
cached value from the last time we have updated the stack. This should
always work alright because "tables.list" is updated atomically via a
rename, so even if the ctime or mtime wasn't granular enough to identify
- a change, at least the inode number should have changed.
+ a change, at least the inode number or file size should have changed.
This change significantly speeds up operations where many refs are read,
like when using git-update-ref(1). The following benchmark creates N
refs in an otherwise-empty repository via `git update-ref --stdin`:
Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
- Time (mean ± σ): 5.6 ms ± 0.2 ms [User: 2.5 ms, System: 3.0 ms]
- Range (min … max): 5.2 ms … 6.0 ms 89 runs
+ Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.4 ms, System: 2.6 ms]
+ Range (min … max): 4.8 ms … 7.2 ms 109 runs
Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
- Time (mean ± σ): 19.2 ms ± 0.3 ms [User: 8.6 ms, System: 10.4 ms]
- Range (min … max): 18.4 ms … 19.8 ms 63 runs
+ Time (mean ± σ): 19.1 ms ± 0.9 ms [User: 8.9 ms, System: 9.9 ms]
+ Range (min … max): 18.4 ms … 26.7 ms 72 runs
Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
- Time (mean ± σ): 1.310 s ± 0.014 s [User: 0.566 s, System: 0.724 s]
- Range (min … max): 1.291 s … 1.342 s 10 runs
+ Time (mean ± σ): 1.336 s ± 0.018 s [User: 0.590 s, System: 0.724 s]
+ Range (min … max): 1.314 s … 1.373 s 10 runs
Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
- Time (mean ± σ): 5.7 ms ± 0.4 ms [User: 2.6 ms, System: 3.1 ms]
- Range (min … max): 5.1 ms … 9.5 ms 91 runs
+ Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.4 ms, System: 2.6 ms]
+ Range (min … max): 4.8 ms … 7.2 ms 109 runs
Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
- Time (mean ± σ): 15.2 ms ± 0.3 ms [User: 7.0 ms, System: 8.1 ms]
- Range (min … max): 14.3 ms … 17.1 ms 71 runs
+ Time (mean ± σ): 14.8 ms ± 0.2 ms [User: 7.1 ms, System: 7.5 ms]
+ Range (min … max): 14.2 ms … 15.2 ms 82 runs
Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
- Time (mean ± σ): 916.1 ms ± 11.0 ms [User: 420.8 ms, System: 495.0 ms]
- Range (min … max): 906.9 ms … 943.8 ms 10 runs
+ Time (mean ± σ): 927.6 ms ± 5.3 ms [User: 437.8 ms, System: 489.5 ms]
+ Range (min … max): 919.4 ms … 936.4 ms 10 runs
Summary
- update-ref: create many refs (refcount = 1, revision = HEAD~) ran
- 1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
- 2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
- 3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
- 163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
- 233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
+ update-ref: create many refs (refcount = 1, revision = HEAD) ran
+ 1.00 ± 0.07 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
+ 2.89 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
+ 3.74 ± 0.25 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
+ 181.26 ± 8.30 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
+ 261.01 ± 12.35 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
+
+ Signed-off-by: Patrick Steinhardt <ps@pks.im>
## reftable/stack.c ##
@@ reftable/stack.c: void reftable_stack_destroy(struct reftable_stack *st)
@@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_s
+ stat_validity_update(&st->list_validity, fd);
+
out:
++ if (err)
++ stat_validity_clear(&st->list_validity);
if (fd >= 0)
close(fd);
+ free_names(names);
@@ reftable/stack.c: static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
static int stack_uptodate(struct reftable_stack *st)
{
-: ---------- > 4: 96e79dc3ba reftable/blocksource: refactor code to match our coding style
4: a23f38a806 ! 5: e53a20c8e1 reftable/blocksource: use mmap to read tables
@@ Commit message
already and ignores any such errors. Also, `reftable_stack_clean()` will
prune stale tables which are not referenced by "tables.list" anymore so
that those files can eventually be pruned. And second, we never rewrite
- already-rewritten stacks, so it does not matter that we cannot rename a
+ already-written stacks, so it does not matter that we cannot rename a
file over an mmaped file, either.
Another unfortunate property of mmap is that it is not supported by all
systems. But given that the size of reftables should typically be rather
limited (megabytes at most in the vast majority of repositories), we can
- provide an easy fallback by just reading the complete table into memory
- on such platforms. This is the same strategy that the "packed" backend
- uses in case the platform does not provide mmap.
+ use the fallback implementation provided by `git_mmap()` which reads the
+ whole file into memory instead. This is the same strategy that the
+ "packed" backend uses.
While this change doesn't significantly improve performance in the case
where we're seeking through stacks once (like e.g. git-for-each-ref(1)
@@ Commit message
empty repository:
Benchmark 1: update-ref: create many refs (refcount = 1, revision = HEAD~)
- Time (mean ± σ): 5.6 ms ± 0.2 ms [User: 2.7 ms, System: 2.8 ms]
- Range (min … max): 5.1 ms … 6.0 ms 90 runs
+ Time (mean ± σ): 5.1 ms ± 0.2 ms [User: 2.5 ms, System: 2.5 ms]
+ Range (min … max): 4.8 ms … 7.1 ms 111 runs
Benchmark 2: update-ref: create many refs (refcount = 100, revision = HEAD~)
- Time (mean ± σ): 15.1 ms ± 0.4 ms [User: 7.1 ms, System: 8.0 ms]
- Range (min … max): 14.2 ms … 16.5 ms 71 runs
+ Time (mean ± σ): 14.8 ms ± 0.5 ms [User: 7.1 ms, System: 7.5 ms]
+ Range (min … max): 14.1 ms … 18.7 ms 84 runs
Benchmark 3: update-ref: create many refs (refcount = 10000, revision = HEAD~)
- Time (mean ± σ): 919.4 ms ± 11.2 ms [User: 427.5 ms, System: 491.7 ms]
- Range (min … max): 908.1 ms … 936.6 ms 10 runs
+ Time (mean ± σ): 926.4 ms ± 5.6 ms [User: 448.5 ms, System: 477.7 ms]
+ Range (min … max): 920.0 ms … 936.1 ms 10 runs
Benchmark 4: update-ref: create many refs (refcount = 1, revision = HEAD)
- Time (mean ± σ): 5.5 ms ± 0.3 ms [User: 2.4 ms, System: 3.1 ms]
- Range (min … max): 5.0 ms … 7.3 ms 89 runs
+ Time (mean ± σ): 5.0 ms ± 0.2 ms [User: 2.4 ms, System: 2.5 ms]
+ Range (min … max): 4.7 ms … 5.4 ms 111 runs
Benchmark 5: update-ref: create many refs (refcount = 100, revision = HEAD)
- Time (mean ± σ): 10.4 ms ± 0.3 ms [User: 5.1 ms, System: 5.3 ms]
- Range (min … max): 9.7 ms … 11.0 ms 78 runs
+ Time (mean ± σ): 10.5 ms ± 0.2 ms [User: 5.0 ms, System: 5.3 ms]
+ Range (min … max): 10.0 ms … 10.9 ms 93 runs
Benchmark 6: update-ref: create many refs (refcount = 10000, revision = HEAD)
- Time (mean ± σ): 483.5 ms ± 6.3 ms [User: 227.8 ms, System: 255.6 ms]
- Range (min … max): 477.5 ms … 498.8 ms 10 runs
+ Time (mean ± σ): 529.6 ms ± 9.1 ms [User: 268.0 ms, System: 261.4 ms]
+ Range (min … max): 522.4 ms … 547.1 ms 10 runs
Summary
update-ref: create many refs (refcount = 1, revision = HEAD) ran
1.01 ± 0.06 times faster than update-ref: create many refs (refcount = 1, revision = HEAD~)
- 1.89 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
- 2.73 ± 0.16 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
- 87.55 ± 4.65 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
- 166.48 ± 8.80 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
+ 2.08 ± 0.07 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
+ 2.95 ± 0.14 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
+ 105.33 ± 3.76 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
+ 184.24 ± 5.89 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
Theoretically, we could also replicate the strategy of the "packed"
backend where small tables are read into memory instead of using mmap.
Benchmarks did not confirm that this has a performance benefit though.
+ Signed-off-by: Patrick Steinhardt <ps@pks.im>
+
## reftable/blocksource.c ##
-@@ reftable/blocksource.c: license that can be found in the LICENSE file or at
- #include "reftable-blocksource.h"
- #include "reftable-error.h"
-
-+#if defined(NO_MMAP)
-+static int use_mmap = 0;
-+#else
-+static int use_mmap = 1;
-+#endif
-+
- static void strbuf_return_block(void *b, struct reftable_block *dest)
- {
- if (dest->len)
@@ reftable/blocksource.c: struct reftable_block_source malloc_block_source(void)
+ }
+
struct file_block_source {
- int fd;
+- int fd;
uint64_t size;
+ unsigned char *data;
};
@@ reftable/blocksource.c: static uint64_t file_size(void *b)
- if (fd > 0) {
- close(fd);
- ((struct file_block_source *)b)->fd = 0;
+- }
+-
+ struct file_block_source *b = v;
-+
-+ if (b->fd >= 0) {
-+ close(b->fd);
-+ b->fd = -1;
- }
-
-+ if (use_mmap)
-+ munmap(b->data, b->size);
-+ else
-+ reftable_free(b->data);
-+ b->data = NULL;
-+
++ munmap(b->data, b->size);
reftable_free(b);
}
@@ reftable/blocksource.c: static int file_read_block(void *v, struct reftable_bloc
return size;
}
@@ reftable/blocksource.c: int reftable_block_source_from_file(struct reftable_block_source *bs,
- {
- struct stat st = { 0 };
- int err = 0;
-- int fd = open(name, O_RDONLY);
-+ int fd;
- struct file_block_source *p = NULL;
-+
-+ fd = open(name, O_RDONLY);
- if (fd < 0) {
- if (errno == ENOENT) {
- return REFTABLE_NOT_EXIST_ERROR;
-@@ reftable/blocksource.c: int reftable_block_source_from_file(struct reftable_block_source *bs,
- p = reftable_calloc(sizeof(struct file_block_source));
+ p = reftable_calloc(sizeof(*p));
p->size = st.st_size;
- p->fd = fd;
-+ if (use_mmap) {
-+ p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
-+ p->fd = fd;
-+ } else {
-+ p->data = xmalloc(st.st_size);
-+ if (read_in_full(fd, p->data, st.st_size) != st.st_size) {
-+ close(fd);
-+ return -1;
-+ }
-+ close(fd);
-+ p->fd = -1;
-+ }
++ p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
++ close(fd);
assert(!bs->ops);
bs->ops = &file_vtable;
base-commit: a54a84b333adbecf7bc4483c0e36ed5878cac17b
--
2.43.GIT
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH 4/4] reftable/blocksource: use mmap to read tables
From: Patrick Steinhardt @ 2024-01-11 9:21 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys
In-Reply-To: <xmqq5y00anlj.fsf@gitster.g>
[-- Attachment #1: Type: text/plain, Size: 5948 bytes --]
On Wed, Jan 10, 2024 at 01:24:24PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > Using mmap comes with a significant drawback on Windows though, because
> > mmapped files cannot be deleted and neither is it possible to rename
> > files onto an mmapped file. But for one, the reftable library gracefully
> > handles the case where auto-compaction cannot delete a still-open stack
> > already and ignores any such errors. Also, `reftable_stack_clean()` will
> > prune stale tables which are not referenced by "tables.list" anymore so
> > that those files can eventually be pruned. And second, we never rewrite
> > already-rewritten stacks, so it does not matter that we cannot rename a
> > file over an mmaped file, either.
>
> I somehow thought that we use "read into an allocated memory and
> pretend as if we mapped" emulation on Windows, so all of that may be
> moot.
Ah, you're right in fact. I missed the definition of `git_mmap()` and
`git_munmap()`.
> > diff --git a/reftable/blocksource.c b/reftable/blocksource.c
> > index a1ea304429..5d3f3d264c 100644
> > --- a/reftable/blocksource.c
> > +++ b/reftable/blocksource.c
> > @@ -13,6 +13,12 @@ license that can be found in the LICENSE file or at
> > #include "reftable-blocksource.h"
> > #include "reftable-error.h"
> >
> > +#if defined(NO_MMAP)
> > +static int use_mmap = 0;
> > +#else
> > +static int use_mmap = 1;
> > +#endif
>
> Is there (do you need) some externally controllable knob that the
> user can use to turn this off on a system that is compiled without
> NO_MMAP? Otherwise it is misleading to have this as a variable.
No. The benefit of using a variable instead of using ifdefs is that we
know that both code paths continue to compile just fine. We do the same
thing in "refs/packed-backend.c".
But this code is not needed at all anyway, as you pointed out, because
we can simply use the emulated mmap(3P) code.
> > +static void file_close(void *v)
> > {
> > - int fd = ((struct file_block_source *)b)->fd;
> > - if (fd > 0) {
> > - close(fd);
> > - ((struct file_block_source *)b)->fd = 0;
> > + struct file_block_source *b = v;
> > +
> > + if (b->fd >= 0) {
> > + close(b->fd);
> > + b->fd = -1;
> > }
> >
> > + if (use_mmap)
> > + munmap(b->data, b->size);
> > + else
> > + reftable_free(b->data);
> > + b->data = NULL;
> > +
> > reftable_free(b);
> > }
>
> It would have been nicer to do this kind of "a void pointer is taken
> and we immediately cast it to a concrete and useful type upon entry"
> clean-up as a separate step that is purely clean-up, if there were
> many instances. It is the first such one in the series as far as I
> remember, so it is not a huge deal.
>
> > @@ -108,9 +119,7 @@ static int file_read_block(void *v, struct reftable_block *dest, uint64_t off,
> > {
> > struct file_block_source *b = v;
> > assert(off + size <= b->size);
> > - dest->data = reftable_malloc(size);
> > - if (pread_in_full(b->fd, dest->data, size, off) != size)
> > - return -1;
> > + dest->data = b->data + off;
> > dest->len = size;
> > return size;
> > }
>
> So, we used to delay the allocation and reading of a block until
> this function gets called. Now, by the time the control reaches
> the function, we are expected to have the data handy at b->data.
> That is ensured by reftable_block_source_from_file(), I presume?
>
> > @@ -127,8 +136,10 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
> > {
> > struct stat st = { 0 };
> > int err = 0;
> > - int fd = open(name, O_RDONLY);
> > + int fd;
> > struct file_block_source *p = NULL;
> > +
> > + fd = open(name, O_RDONLY);
> > if (fd < 0) {
> > if (errno == ENOENT) {
> > return REFTABLE_NOT_EXIST_ERROR;
>
> This is a no-op clean-up that would have been better in a separate
> clean-up step. Again, not a huge deal.
Yeah, fair enough. One problem I have with the reftable library is that
its coding style doesn't quite blend in with the rest of the Git code
base, so I want to refactor code that I'm touching to match our coding
style better. This means that there will be a lot of such smallish
refactorings, and I'm often not sure whether to evict them into their
own commit or whether to do the refactoring in the same step.
For small changes like this one here I think it's reasonable to do so in
the same commit. But larger refactorings like e.g. the introduction of
the common exit path in the first patch of this series I of course put
into their own commit.
> > @@ -144,7 +155,18 @@ int reftable_block_source_from_file(struct reftable_block_source *bs,
> >
> > p = reftable_calloc(sizeof(struct file_block_source));
> > p->size = st.st_size;
> > - p->fd = fd;
> > + if (use_mmap) {
> > + p->data = xmmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
> > + p->fd = fd;
>
> This is a bit unusual for our use of mmap() where the norm is to
> close the file descriptor once we mapped (we only need the memory
> region itself and not the originating file descriptor to unmap it).
>
> Is there a reason why we need to keep p->fd? Now the other side of
> this if/else preallocates the whole thing and slurps everything in
> core to allow the remainder of the code to mimic what happens on the
> mmaped block memory (e.g., we saw how file_read_block() assumes that
> b->data[0..b->size] are valid) and does not obviously need p->fd,
> can we just remove .fd member from the file_block_source structure?
>
> That way, file_close() can also lose the conditional close() call.
>
> For that matter, do we need any codepath in this file that is
> enabled by !use_mmap? Can't we just use xmmap() and let its
> "instead, we allocate, read into it and return" emulation?
Not really, I'll update the code accordingly.
Thanks!
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* gitweb: Exiting subroutine via next at /usr/lib/cgi-bin/gitweb.cgi line 3142
From: martin f krafft @ 2024-01-11 8:28 UTC (permalink / raw)
To: git
Hi team,
I get tens of thousands of messages like
```
[Thu Jan 11 09:01:11 2024] gitweb.cgi: Exiting subroutine via next at /usr/lib/cgi-bin/gitweb.cgi line 3142.
```
in my logs, presumably from crawlers hitting the CGI wrongly. This
is using gitweb 2.39.2.
I know no Perl whatsoever, but it's my impression that the following
patch would solve this:
```
--- /tmp/gitweb.cgi 2024-01-11 09:07:47.283764508 +0100
+++ /usr/lib/cgi-bin/gitweb.cgi 2024-01-11 09:01:43.459218235 +0100
@@ -3139,7 +3139,7 @@
my $path = substr($File::Find::name, $pfxlen + 1);
# paranoidly only filter here
if ($paranoid && $filter && $path !~ m!^\Q$filter\E/!) {
- next;
+ return;
}
# we check related file in $projectroot
if (check_export_ok("$projectroot/$path")) {
```
This is within the wanted sub for `File::Find::find`, and it returns
next so as to skip the currently visited file during the tree walk.
Returning a false value, which `return` without an argument
presumably does, should have the same effect.
Cheers,
--
martin krafft | https://matrix.to/#/#madduck:madduck.net
"it usually takes more than three weeks
to prepare a good impromptu speech.
-- mark twain
{: .blockquote }
spamtraps: madduck.bogus@madduck.net
{: .hidden }
^ permalink raw reply
* [PATCH] builtin/revert.c: refactor using an enum for cmd-action
From: Michael Lohmann @ 2024-01-11 8:04 UTC (permalink / raw)
To: git; +Cc: Michael Lohmann, Wanja Henze
This is done to avoid having to keep the char values in sync in
different places and also to get compiler warnings on non-exhaustive
switches.
The newly introduced `revert_action`-enum aligns with the
builtin/rebase.c `action`-enum though the name `revert_action` is chosen
to better differentiate it from `replay_opts->action` with a different
function. For rebase the equivalent of the latter is
`rebase_options->type` and `rebase_options->action` corresponds to the
`cmd` variable in revert.c.
In the rebase `action` enum there is the enumeration constant
`ACTION_NONE` which is not particularly descriptive, since it seems to
imply that no action should be taken. Instead it signals a start of a
revert/cherry-pick process, so here `REVERT_ACTION_START` was chosen.
Co-authored-by: Wanja Henze <wanja.hentze@bevuta.com>
Signed-off-by: Michael Lohmann <mi.al.lohmann@gmail.com>
---
Hello!
When I was working on `revert/cherry-pick --show-current-patch` (still
needs a little bit more discussion, if actually wanted, see thread [1])
I found the construct with the `cmd` as an int a bit irritating. I hope
this patch makes it more obvious what is actually going on.
Is there a reason why `ACTION_NONE` was chosen as a name in
builtin/rebase.c? My best guess is that it came along because it is the
implied action when no other specific action is passed in, but I don't
find that particularly descriptive on what its actual function is...
(Yes, naming things is hard... :D)
An alternative to prefixing the enum name with "revert_" would be to
rename `replay_opts->action` to `replay_opts->type` so it aligns with
rebase. Would you prefer that instead?
Cheers
Michael
[1] https://lore.kernel.org/git/20231218121048.68290-1-mi.al.lohmann@gmail.com/
builtin/revert.c | 80 +++++++++++++++++++++++++++++-------------------
1 file changed, 49 insertions(+), 31 deletions(-)
diff --git a/builtin/revert.c b/builtin/revert.c
index 89821bab95..b5278b7a3b 100644
--- a/builtin/revert.c
+++ b/builtin/revert.c
@@ -20,6 +20,14 @@
* Copyright (c) 2005 Junio C Hamano
*/
+enum revert_action {
+ REVERT_ACTION_START = 0,
+ REVERT_ACTION_CONTINUE,
+ REVERT_ACTION_SKIP,
+ REVERT_ACTION_ABORT,
+ REVERT_ACTION_QUIT,
+};
+
static const char * const revert_usage[] = {
N_("git revert [--[no-]edit] [-n] [-m <parent-number>] [-s] [-S[<keyid>]] <commit>..."),
N_("git revert (--continue | --skip | --abort | --quit)"),
@@ -85,12 +93,12 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
const char * const * usage_str = revert_or_cherry_pick_usage(opts);
const char *me = action_name(opts);
const char *cleanup_arg = NULL;
- int cmd = 0;
+ enum revert_action cmd = REVERT_ACTION_START;
struct option base_options[] = {
- OPT_CMDMODE(0, "quit", &cmd, N_("end revert or cherry-pick sequence"), 'q'),
- OPT_CMDMODE(0, "continue", &cmd, N_("resume revert or cherry-pick sequence"), 'c'),
- OPT_CMDMODE(0, "abort", &cmd, N_("cancel revert or cherry-pick sequence"), 'a'),
- OPT_CMDMODE(0, "skip", &cmd, N_("skip current commit and continue"), 's'),
+ OPT_CMDMODE(0, "quit", &cmd, N_("end revert or cherry-pick sequence"), REVERT_ACTION_QUIT),
+ OPT_CMDMODE(0, "continue", &cmd, N_("resume revert or cherry-pick sequence"), REVERT_ACTION_CONTINUE),
+ OPT_CMDMODE(0, "abort", &cmd, N_("cancel revert or cherry-pick sequence"), REVERT_ACTION_ABORT),
+ OPT_CMDMODE(0, "skip", &cmd, N_("skip current commit and continue"), REVERT_ACTION_SKIP),
OPT_CLEANUP(&cleanup_arg),
OPT_BOOL('n', "no-commit", &opts->no_commit, N_("don't automatically commit")),
OPT_BOOL('e', "edit", &opts->edit, N_("edit the commit message")),
@@ -144,30 +152,37 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
}
/* Check for incompatible command line arguments */
- if (cmd) {
- char *this_operation;
- if (cmd == 'q')
+ {
+ char *this_operation = 0;
+ switch (cmd) {
+ case REVERT_ACTION_START:
+ break;
+ case REVERT_ACTION_QUIT:
this_operation = "--quit";
- else if (cmd == 'c')
+ break;
+ case REVERT_ACTION_CONTINUE:
this_operation = "--continue";
- else if (cmd == 's')
+ break;
+ case REVERT_ACTION_SKIP:
this_operation = "--skip";
- else {
- assert(cmd == 'a');
+ break;
+ case REVERT_ACTION_ABORT:
this_operation = "--abort";
+ break;
}
- verify_opt_compatible(me, this_operation,
- "--no-commit", opts->no_commit,
- "--signoff", opts->signoff,
- "--mainline", opts->mainline,
- "--strategy", opts->strategy ? 1 : 0,
- "--strategy-option", opts->xopts.nr ? 1 : 0,
- "-x", opts->record_origin,
- "--ff", opts->allow_ff,
- "--rerere-autoupdate", opts->allow_rerere_auto == RERERE_AUTOUPDATE,
- "--no-rerere-autoupdate", opts->allow_rerere_auto == RERERE_NOAUTOUPDATE,
- NULL);
+ if (this_operation)
+ verify_opt_compatible(me, this_operation,
+ "--no-commit", opts->no_commit,
+ "--signoff", opts->signoff,
+ "--mainline", opts->mainline,
+ "--strategy", opts->strategy ? 1 : 0,
+ "--strategy-option", opts->xopts.nr ? 1 : 0,
+ "-x", opts->record_origin,
+ "--ff", opts->allow_ff,
+ "--rerere-autoupdate", opts->allow_rerere_auto == RERERE_AUTOUPDATE,
+ "--no-rerere-autoupdate", opts->allow_rerere_auto == RERERE_NOAUTOUPDATE,
+ NULL);
}
if (!opts->strategy && opts->default_strategy) {
@@ -183,9 +198,7 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
"--edit", opts->edit > 0,
NULL);
- if (cmd) {
- opts->revs = NULL;
- } else {
+ if (cmd == REVERT_ACTION_START) {
struct setup_revision_opt s_r_opt;
opts->revs = xmalloc(sizeof(*opts->revs));
repo_init_revisions(the_repository, opts->revs, NULL);
@@ -198,6 +211,8 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
memset(&s_r_opt, 0, sizeof(s_r_opt));
s_r_opt.assume_dashdash = 1;
argc = setup_revisions(argc, argv, opts->revs, &s_r_opt);
+ } else {
+ opts->revs = NULL;
}
if (argc > 1)
@@ -210,19 +225,22 @@ static int run_sequencer(int argc, const char **argv, const char *prefix,
opts->strategy = xstrdup(getenv("GIT_TEST_MERGE_ALGORITHM"));
free(options);
- if (cmd == 'q') {
+ switch (cmd) {
+ case REVERT_ACTION_QUIT: {
int ret = sequencer_remove_state(opts);
if (!ret)
remove_branch_state(the_repository, 0);
return ret;
}
- if (cmd == 'c')
+ case REVERT_ACTION_CONTINUE:
return sequencer_continue(the_repository, opts);
- if (cmd == 'a')
+ case REVERT_ACTION_ABORT:
return sequencer_rollback(the_repository, opts);
- if (cmd == 's')
+ case REVERT_ACTION_SKIP:
return sequencer_skip(the_repository, opts);
- return sequencer_pick_revisions(the_repository, opts);
+ case REVERT_ACTION_START:
+ return sequencer_pick_revisions(the_repository, opts);
+ }
}
int cmd_revert(int argc, const char **argv, const char *prefix)
--
2.42.0
^ permalink raw reply related
* Re: [PATCH 3/3] advice: allow disabling the automatic hint in advise_if_enabled()
From: Jeff King @ 2024-01-11 8:04 UTC (permalink / raw)
To: Dragan Simic; +Cc: Junio C Hamano, Rubén Justo, Git List
In-Reply-To: <d6d72ec5431ad1ca775e6e4e9921f94c@manjaro.org>
On Wed, Jan 10, 2024 at 06:45:49PM +0100, Dragan Simic wrote:
> 4) As a careful git user that remembers important things, you go back
> to your git configuration file and set core.verboseAdvice to true, and
> the additional advices are back, telling you how to disable the
> unwanted advice.
>
> 5) After you disable the unwanted advice, you set core.verboseAdvice
> back to false and keep it that way until the next redundant advice
> pops up.
>
> However, I do see that this approach has its downsides, for example
> the need for the unwanted advice to be displayed again together with
> the additional advice, by executing the appropriate git commands,
> after the above-described point #4.
Right, the need to re-trigger the advice is the problematic part here, I
think. In some cases it is easy. But in others, especially commands
which mutate the repo state (like the empty-commit rebase that started
this thread), you may need to do a lot of work to recreate the
situation.
> Let's see what it would look like with the granular, per-advice on/off
> knobs:
>
> 1) You use git and find some advice useful, so you decide to keep it
> displayed. However, the additional advice about turning the advice
> off becomes annoying a bit, or better said, it becomes redundant
> because the main advice stays.
>
> 2) As a result, you follow the additional advice and set the specific
> knob to false, and voila, the redundant additional advice disappears.
> Of course, it once again isn't perfect, as the next point will clearly
> show.
>
> 3) You keep using git, and one of the advices that you previously used
> to find useful becomes no longer needed. But, what do you do, where's
> that helpful additional advice telling you how to turn the advice off?
>
> 4) Now you need to dig though the manual pages, or to re-enable the
> additional advices in your git configuration file, perhaps all of them
> at once, while keeping a backup of your original settings, to restore
> it later. Then, you again need to wait until the original advice gets
> displayed.
These steps (3) and (4) seem unlikely to me. These are by definition
messages you have seen before and decided to configure specifically (to
"always", and not just "off"). So you probably only have a handful (if
even more than one) of them in your config file.
Whereas for the case I am worried about, you are exposed to a _new_
message that you haven't seen before (and is maybe even new to Git!),
from the much-larger pool of "all advice messages Git knows about".
It's possible we could implement both mechanisms and let people choose
which one they want, depending on their preferences. It's not very much
code. I just hesitate to make things more complex than they need to be.
-Peff
^ permalink raw reply
* Re: [PATCH] commit-graph: retain commit slab when closing NULL commit_graph
From: Jeff King @ 2024-01-11 7:53 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Taylor Blau, git, Scott Leggett
In-Reply-To: <xmqq34v5dtz9.fsf@gitster.g>
On Wed, Jan 10, 2024 at 08:38:18AM -0800, Junio C Hamano wrote:
> > It should be easy-ish to iterate through the slab and look at the
> > commits that are mentioned in it. Though maybe not? Each commit knows
> > its slab-id, but I'm not sure if we have a master list of commits to go
> > the other way.
>
> We have table of all in-core objects, don't we?
Oh, duh. Yes, we could iterate over obj_hash. I do think the "on demand"
version I showed later in the message is better, though, as the work
both scales with the number of affected commits (rather than the total
number of objects) and can be done lazily (so callers that are not buggy
pay no price at all).
> > + * This will throw away the parents list, which is potentially sketchy.
> > + * A better version of this would just unset commit->object.parsed
> > + * and then do a minimal version of parse_commit() that _just_ loads
> > + * the tree data (and/or graph position if available).
>
> Yeah, it is a concern that we may be working with an in-core commit
> object whose parent list has already been rewritten during revision
> traversal. Well thought out.
Hmm. Looking at my "a better version" sentence, I guess we know that
either:
1. The object really is corrupt (i.e., we could not load the tree).
2. It came from a graph in the first place.
So what if we just tried harder to look it up in the graph file (rather
than the slab) when we see COMMIT_NOT_FROM_GRAPH? And indeed, we even
have a function to do this already!
So something like this almost works:
diff --git a/commit.c b/commit.c
index b3223478bc..096a3d18d0 100644
--- a/commit.c
+++ b/commit.c
@@ -418,10 +418,12 @@ static inline void set_commit_tree(struct commit *c, struct tree *t)
struct tree *repo_get_commit_tree(struct repository *r,
const struct commit *commit)
{
+ uint32_t pos;
+
if (commit->maybe_tree || !commit->object.parsed)
return commit->maybe_tree;
- if (commit_graph_position(commit) != COMMIT_NOT_FROM_GRAPH)
+ if (repo_find_commit_pos_in_graph(r, commit, &pos))
return get_commit_tree_in_graph(r, commit);
return NULL;
but:
1. It doesn't update the slab, so get_commit_tree_in_graph() then
complains immediately, because it uses only
commit_graph_position(). I guess we'd need to do:
commit_graph_data_at(commit)->graph_pos = pos;
somewhere. That does make the recently-found segfault go away.
But...
2. I'm not sure if other spots would want similar treatment. We store
generation numbers in the slab, too. I think we'll figure things
out when they're not available, so there's no segfault problem. But
it's maybe a potential performance issue. Likewise, it is probably
a bug that the graph-writing code is looking at this commit at all
(since we know it is already in a graph file). So we might be
papering over that bug (that is, even though the segfault is gone,
we are perhaps still writing a duplicate graph entry).
> > I dunno. I do feel like this is a lurking maintenance headache, and
> > might even be a triggerable bug. But without knowing of a way that it
> > happens in the current code base, it feels like it would be easy to make
> > things worse rather than better.
>
> Unfortunately I share the feeling X-<.
I somehow sniped myself into thinking about it more, but that has only
reinforced my feeling that I'm afraid to touch it. ;)
-Peff
^ permalink raw reply related
* Re: [PATCH 3/4] reftable/stack: use stat info to avoid re-reading stack list
From: Patrick Steinhardt @ 2024-01-11 7:41 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys
In-Reply-To: <xmqqo7dt9ckn.fsf@gitster.g>
[-- Attachment #1: Type: text/plain, Size: 2339 bytes --]
On Wed, Jan 10, 2024 at 12:07:52PM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > We can do better and use the same stat(3P)-based mechanism that the
> > "packed" backend uses. Instead of reading the file, we will only open
> > the file descriptor, fstat(3P) it, and then compare the info against the
> > cached value from the last time we have updated the stack. This should
> > always work alright because "tables.list" is updated atomically via a
> > rename, so even if the ctime or mtime wasn't granular enough to identify
> > a change, at least the inode number should have changed.
>
> Or the file size. Let's keep in mind that many users get useless
> inum from their filesystem X-<.
>
> > Summary
> > update-ref: create many refs (refcount = 1, revision = HEAD~) ran
> > 1.01 ± 0.09 times faster than update-ref: create many refs (refcount = 1, revision = HEAD)
> > 2.72 ± 0.11 times faster than update-ref: create many refs (refcount = 100, revision = HEAD)
> > 3.42 ± 0.13 times faster than update-ref: create many refs (refcount = 100, revision = HEAD~)
> > 163.59 ± 5.62 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD)
> > 233.91 ± 7.92 times faster than update-ref: create many refs (refcount = 10000, revision = HEAD~)
> > ---
>
> Nice.
>
> > @@ -374,6 +375,8 @@ static int reftable_stack_reload_maybe_reuse(struct reftable_stack *st,
> > sleep_millisec(delay);
> > }
> >
> > + stat_validity_update(&st->list_validity, fd);
> > +
> > out:
> > if (fd >= 0)
> > close(fd);
>
> The stat_validity_update() does not happen in the error codepath.
>
> Should we be clearing the validity of the list when somebody jumps
> to "out:" due to an error? Or by the time this function gets
> called, the caller would already have cleared the validity and an
> error that jumps to "out:" keeps the list invalid?
It likely does not matter much. This function only gets called when we
have previously determined the stack to be out of date already. So
unless we update the validity cache, we know that the next validity
check would continue to treat the stack as outdated.
That being said I think it's good hygiene to clear the validity cache
regardless of that.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH 1/4] reftable/stack: refactor stack reloading to have common exit path
From: Patrick Steinhardt @ 2024-01-11 7:33 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Han-Wen Nienhuys
In-Reply-To: <xmqqmstdccz8.fsf@gitster.g>
[-- Attachment #1: Type: text/plain, Size: 796 bytes --]
On Wed, Jan 10, 2024 at 09:30:51AM -0800, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> > The `reftable_stack_reload_maybe_reuse()` function is responsible for
> > reloading the reftable list from disk. The function is quite hard to
> > follow though because it has a bunch of different exit paths, many of
> > which have to free the same set of resources.
> >
> > Refactor the function to have a common exit path. While at it, touch up
> > the style of this function a bit to match our usual coding style better.
> > ---
> > reftable/stack.c | 86 +++++++++++++++++++++++-------------------------
> > 1 file changed, 42 insertions(+), 44 deletions(-)
>
> Missing sign-off.
Ah, sorry, I forgot my usual `git rebase --signoff`. Will fix in v2.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: Limited operations in unsafe repositories
From: Jeff King @ 2024-01-11 7:30 UTC (permalink / raw)
To: Patrick Steinhardt; +Cc: brian m. carlson, git
In-Reply-To: <ZZ-V-vwnm2hOkrMC@tanuki>
On Thu, Jan 11, 2024 at 08:17:14AM +0100, Patrick Steinhardt wrote:
> > Right, it's not config itself that's unsafe; it's that many options are.
> > We could try to annotate them to say "it is OK to parse core.eol but not
> > core.pager", presumably with an allow-known-good approach (since so many
> > ard bad!). But that feels like an ongoing maintenance headache, and an
> > easy way to make a mistake (your mention of terminal sequences makes me
> > assume you're thinking of "color.diff.*", etc). A rule like "we do not
> > read repo-level config at all" seems easier to explain (to me, anyway).
>
> With the exemption of the repository format, I assume? We have to parse
> things like `core.repositoryFormatVersion` and extensions in order to
> figure out how a repository has to be accessed. So I agree that we
> should not partition config based on safeness, which is going to be a
> headache as you rightly point out. But we can partition based on whether
> or not config is required in order to access the repository, where the
> set of relevant config keys is a whole lot smaller.
Right. See the pseudo-patch I posted earlier. I think we just want to
touch do_git_config_sequence(), which leaves repo discovery and stuff
like "git config --local" working.
-Peff
^ permalink raw reply
* Re: [PATCH 2/2] t5541: generalize reference locking
From: Jeff King @ 2024-01-11 7:28 UTC (permalink / raw)
To: Justin Tobler via GitGitGadget; +Cc: git, Justin Tobler
In-Reply-To: <11fd5091d61b54d8862ab2e316bbd25fff63ce0f.1704912750.git.gitgitgadget@gmail.com>
On Wed, Jan 10, 2024 at 06:52:30PM +0000, Justin Tobler via GitGitGadget wrote:
> From: Justin Tobler <jltobler@gmail.com>
>
> Some tests set up reference locks by directly creating the lockfile.
> While this works for the files reference backend, reftable reference
> locks operate differently and are incompatible with this approach.
> Generalize reference locking by preparing a reference transaction.
As with the first patch, I think we could use d/f conflicts to get the
same effect. Perhaps something like this:
diff --git a/t/t5541-http-push-smart.sh b/t/t5541-http-push-smart.sh
index df758e187d..7eb0e887e1 100755
--- a/t/t5541-http-push-smart.sh
+++ b/t/t5541-http-push-smart.sh
@@ -233,7 +233,8 @@ test_expect_success 'push --atomic fails on server-side errors' '
up="$HTTPD_URL"/smart/atomic-branches.git &&
# break ref updates for other on the remote site
- mkdir "$d/refs/heads/other.lock" &&
+ git -C "$d" update-ref -d refs/heads/other &&
+ git -C "$d" update-ref refs/heads/other/block-me HEAD &&
# add the new commit to other
git branch -f other collateral &&
@@ -244,12 +245,8 @@ test_expect_success 'push --atomic fails on server-side errors' '
# the new branch should not have been created upstream
test_must_fail git -C "$d" show-ref --verify refs/heads/atomic &&
- # upstream should still reflect atomic2, the last thing we pushed
- # successfully
- git rev-parse atomic2 >expected &&
- # ...to other.
- git -C "$d" rev-parse refs/heads/other >actual &&
- test_cmp expected actual &&
+ # upstream should not have updated, as it cannot be written at all.
+ test_must_fail git -C "$d" rev-parse --verify refs/heads/other &&
# the new branch should not have been created upstream
test_must_fail git -C "$d" show-ref --verify refs/heads/atomic &&
I do think that the original was slightly more interesting (since we
could check that "other" still existed but was not updated), but I think
the main point of the test is that "atomic" was not pushed at all.
-Peff
^ permalink raw reply related
* Re: Limited operations in unsafe repositories
From: Patrick Steinhardt @ 2024-01-11 7:17 UTC (permalink / raw)
To: Jeff King; +Cc: brian m. carlson, git
In-Reply-To: <20240111070114.GB48154@coredump.intra.peff.net>
[-- Attachment #1: Type: text/plain, Size: 2283 bytes --]
On Thu, Jan 11, 2024 at 02:01:14AM -0500, Jeff King wrote:
> On Wed, Jan 10, 2024 at 11:34:04PM +0000, brian m. carlson wrote:
>
> > On 2024-01-10 at 12:05:31, Jeff King wrote:
> > > My thinking is to flip that around: run all code, but put protection in
> > > the spots that do unsafe things, like loading config or examining
> > > hooks. I.e., a patch like this:
> >
> > I think that's much what I had intended to do with not invoking binaries
> > at all, except that it was limited to rev-parse. I wonder if perhaps we
> > could do something similar if we had the `--assume-unsafe` argument you
> > proposed, except that we would only allow the `git` binary and always
> > pass that argument to it in such a case.
>
> I'm not sure what you mean by "invoking binaries". I had assumed that
> meant just running other Git sub-processes. But if "--assume-unsafe" is
> just setting an environment variable, they'd automatically be protected.
>
> > I don't think reading config is intrinsically unsafe; it's more of what
> > we do with it, which is spawning external processes, that's the problem.
> > I suppose an argument could be made for injecting terminal sequences or
> > such, though. Hooks, obviously, are definitely unsafe.
>
> Right, it's not config itself that's unsafe; it's that many options are.
> We could try to annotate them to say "it is OK to parse core.eol but not
> core.pager", presumably with an allow-known-good approach (since so many
> ard bad!). But that feels like an ongoing maintenance headache, and an
> easy way to make a mistake (your mention of terminal sequences makes me
> assume you're thinking of "color.diff.*", etc). A rule like "we do not
> read repo-level config at all" seems easier to explain (to me, anyway).
With the exemption of the repository format, I assume? We have to parse
things like `core.repositoryFormatVersion` and extensions in order to
figure out how a repository has to be accessed. So I agree that we
should not partition config based on safeness, which is going to be a
headache as you rightly point out. But we can partition based on whether
or not config is required in order to access the repository, where the
set of relevant config keys is a whole lot smaller.
Patrick
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply
* Re: [PATCH 1/2] t1401: generalize reference locking
From: Jeff King @ 2024-01-11 7:13 UTC (permalink / raw)
To: Justin Tobler via GitGitGadget; +Cc: git, Justin Tobler
In-Reply-To: <cb78b549e5e826ffef39c55bd726164e6b7bb755.1704912750.git.gitgitgadget@gmail.com>
On Wed, Jan 10, 2024 at 06:52:29PM +0000, Justin Tobler via GitGitGadget wrote:
> From: Justin Tobler <jltobler@gmail.com>
>
> Some tests set up reference locks by directly creating the lockfile.
> While this works for the files reference backend, reftable reference
> locks operate differently and are incompatible with this approach.
> Refactor the test to use git-update-ref(1) to lock refs instead so that
> the test does not need to be aware of how the ref backend locks refs.
It looks like you re-create this situation in a backend-agnostic way by
having two simultaneous updates that conflict on the lock (but don't
care how that lock is implemented).
That works, but I think we could keep it simple. This test doesn't care
about the exact error condition we create. The point was just to die in
create_symref() and make sure the exit code was propagated. So something
like this would work:
$ git symbolic-ref refs/heads refs/heads/foo
error: unable to write symref for refs/heads: Is a directory
(note that you get a different error message if the refs are packed,
since there we can notice the d/f conflict manually).
There may be other ways to stimulate a failure. I thought "symbolic-ref
HEAD refs/heads/.invalid" might work, but sadly the refname format check
happens earlier.
I think it is worth avoiding the fifo magic if we can. It's complicated,
and it means that not all platforms run the test.
-Peff
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox