* [PATCHv2 3/5] unpack-trees: factor progress setup out of check_updates
From: Stefan Beller @ 2017-01-09 19:46 UTC (permalink / raw)
To: gitster, peff, l.s.r; +Cc: git, Stefan Beller
In-Reply-To: <20170109194621.17013-1-sbeller@google.com>
This makes check_updates shorter and easier to understand.
Signed-off-by: Stefan Beller <sbeller@google.com>
---
unpack-trees.c | 32 ++++++++++++++++++++------------
1 file changed, 20 insertions(+), 12 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index d443decb23..b564024472 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -218,9 +218,27 @@ static void unlink_entry(const struct cache_entry *ce)
schedule_dir_for_removal(ce->name, ce_namelen(ce));
}
-static int check_updates(struct unpack_trees_options *o)
+static struct progress *get_progress(struct unpack_trees_options *o)
{
unsigned cnt = 0, total = 0;
+ struct index_state *index = &o->result;
+
+ if (!o->update || !o->verbose_update)
+ return NULL;
+
+ for (; cnt < index->cache_nr; cnt++) {
+ const struct cache_entry *ce = index->cache[cnt];
+ if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
+ total++;
+ }
+
+ return start_progress_delay(_("Checking out files"),
+ total, 50, 1);
+}
+
+static int check_updates(struct unpack_trees_options *o)
+{
+ unsigned cnt = 0;
int i, errs = 0;
struct progress *progress = NULL;
@@ -232,17 +250,7 @@ static int check_updates(struct unpack_trees_options *o)
state.refresh_cache = 1;
state.istate = index;
- if (o->update && o->verbose_update) {
- for (total = cnt = 0; cnt < index->cache_nr; cnt++) {
- const struct cache_entry *ce = index->cache[cnt];
- if (ce->ce_flags & (CE_UPDATE | CE_WT_REMOVE))
- total++;
- }
-
- progress = start_progress_delay(_("Checking out files"),
- total, 50, 1);
- cnt = 0;
- }
+ progress = get_progress(o);
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKOUT, index);
--
2.11.0.rc2.30.g7c4be45.dirty
^ permalink raw reply related
* [PATCHv2 4/5] unpack-trees: factor file removal out of check_updates
From: Stefan Beller @ 2017-01-09 19:46 UTC (permalink / raw)
To: gitster, peff, l.s.r; +Cc: git, Stefan Beller
In-Reply-To: <20170109194621.17013-1-sbeller@google.com>
This makes check_updates shorter and easier to understand.
Signed-off-by: Stefan Beller <sbeller@google.com>
---
unpack-trees.c | 29 +++++++++++++++++++++--------
1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index b564024472..ac59510251 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -218,6 +218,26 @@ static void unlink_entry(const struct cache_entry *ce)
schedule_dir_for_removal(ce->name, ce_namelen(ce));
}
+static unsigned remove_workingtree_files(struct unpack_trees_options *o,
+ struct progress *progress)
+{
+ int i;
+ unsigned cnt = 0;
+ struct index_state *index = &o->result;
+
+ for (i = 0; i < index->cache_nr; i++) {
+ const struct cache_entry *ce = index->cache[i];
+
+ if (ce->ce_flags & CE_WT_REMOVE) {
+ display_progress(progress, ++cnt);
+ if (o->update && !o->dry_run)
+ unlink_entry(ce);
+ }
+ }
+
+ return cnt;
+}
+
static struct progress *get_progress(struct unpack_trees_options *o)
{
unsigned cnt = 0, total = 0;
@@ -254,15 +274,8 @@ static int check_updates(struct unpack_trees_options *o)
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKOUT, index);
- for (i = 0; i < index->cache_nr; i++) {
- const struct cache_entry *ce = index->cache[i];
- if (ce->ce_flags & CE_WT_REMOVE) {
- display_progress(progress, ++cnt);
- if (o->update && !o->dry_run)
- unlink_entry(ce);
- }
- }
+ cnt = remove_workingtree_files(o, progress);
remove_marked_cache_entries(index);
remove_scheduled_dirs();
--
2.11.0.rc2.30.g7c4be45.dirty
^ permalink raw reply related
* [PATCHv2 0/5] refactor unpack-trees
From: Stefan Beller @ 2017-01-09 19:46 UTC (permalink / raw)
To: gitster, peff, l.s.r; +Cc: git, Stefan Beller
v2:
* Fixed the return type to be unsigned in patch 4.
This replaces origin/rs/unpack-trees-reduce-file-scope-global,
the first patch has the following diff to that branch
diff --git a/unpack-trees.c b/unpack-trees.c
index edf9fa2f6c..8e6768f283 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -221,10 +221,11 @@ static void unlink_entry(const struct cache_entry *ce)
static int check_updates(struct unpack_trees_options *o)
{
unsigned cnt = 0, total = 0;
+ int i, errs = 0;
+
struct progress *progress = NULL;
struct index_state *index = &o->result;
struct checkout state = CHECKOUT_INIT;
- int i, errs = 0;
state.force = 1;
state.quiet = 1;
Thanks,
Stefan
v1:
unpack-trees is a central file needed for the understanding
of working tree manipulation. To help with the understanding
refactor the code to be more readable.
The first patch was a standalone patch 8 days ago;
now incorporated into this series as a v3,
reducing the scope of the checkout state.
The second patch removes a single continue statement;
it needed some digging to explain, but looks trivial.
The last 3 patches shorten the check_updates function by adding more
functions. If we ever want to parallelize file IO then these smaller
functions would be the scope to do it, keeping the check_updates as
a high level function guiding through the steps what is happening during
a working tree update.
Thanks,
Stefan
Stefan Beller (5):
unpack-trees: move checkout state into check_updates
unpack-trees: remove unneeded continue
unpack-trees: factor progress setup out of check_updates
unpack-trees: factor file removal out of check_updates
unpack-trees: factor working tree update out of check_updates
unpack-trees.c | 96 ++++++++++++++++++++++++++++++++++++++--------------------
1 file changed, 64 insertions(+), 32 deletions(-)
--
2.11.0.31.g919a8d0.dirty
^ permalink raw reply related
* [PATCHv2 5/5] unpack-trees: factor working tree update out of check_updates
From: Stefan Beller @ 2017-01-09 19:46 UTC (permalink / raw)
To: gitster, peff, l.s.r; +Cc: git, Stefan Beller
In-Reply-To: <20170109194621.17013-1-sbeller@google.com>
This makes check_updates shorter and easier to understand.
Signed-off-by: Stefan Beller <sbeller@google.com>
---
unpack-trees.c | 40 ++++++++++++++++++++++++++--------------
1 file changed, 26 insertions(+), 14 deletions(-)
diff --git a/unpack-trees.c b/unpack-trees.c
index ac59510251..ddb3e3dcff 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -256,12 +256,13 @@ static struct progress *get_progress(struct unpack_trees_options *o)
total, 50, 1);
}
-static int check_updates(struct unpack_trees_options *o)
+static int update_working_tree_files(struct unpack_trees_options *o,
+ struct progress *progress,
+ unsigned start_cnt)
{
- unsigned cnt = 0;
+ unsigned cnt = start_cnt;
int i, errs = 0;
- struct progress *progress = NULL;
struct index_state *index = &o->result;
struct checkout state = CHECKOUT_INIT;
@@ -270,15 +271,6 @@ static int check_updates(struct unpack_trees_options *o)
state.refresh_cache = 1;
state.istate = index;
- progress = get_progress(o);
-
- if (o->update)
- git_attr_set_direction(GIT_ATTR_CHECKOUT, index);
-
- cnt = remove_workingtree_files(o, progress);
- remove_marked_cache_entries(index);
- remove_scheduled_dirs();
-
for (i = 0; i < index->cache_nr; i++) {
struct cache_entry *ce = index->cache[i];
@@ -288,11 +280,31 @@ static int check_updates(struct unpack_trees_options *o)
ce->name);
display_progress(progress, ++cnt);
ce->ce_flags &= ~CE_UPDATE;
- if (o->update && !o->dry_run) {
+ if (o->update && !o->dry_run)
errs |= checkout_entry(ce, &state, NULL);
- }
}
}
+
+ return errs;
+}
+
+static int check_updates(struct unpack_trees_options *o)
+{
+ struct progress *progress = NULL;
+ struct index_state *index = &o->result;
+ int errs;
+ unsigned total_removed;
+
+ progress = get_progress(o);
+
+ if (o->update)
+ git_attr_set_direction(GIT_ATTR_CHECKOUT, index);
+
+ total_removed = remove_workingtree_files(o, progress);
+ remove_marked_cache_entries(index);
+ remove_scheduled_dirs();
+ errs = update_working_tree_files(o, progress, total_removed);
+
stop_progress(&progress);
if (o->update)
git_attr_set_direction(GIT_ATTR_CHECKIN, NULL);
--
2.11.0.rc2.30.g7c4be45.dirty
^ permalink raw reply related
* Re: [PATCH 4/5] unpack-trees: factor file removal out of check_updates
From: Stefan Beller @ 2017-01-09 19:30 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, René Scharfe, git@vger.kernel.org
In-Reply-To: <20170107013602.act7aouuzgzpwb54@sigill.intra.peff.net>
On Fri, Jan 6, 2017 at 5:36 PM, Jeff King <peff@peff.net> wrote:
>> +static int remove_workingtree_files(struct unpack_trees_options *o,
>> + unsigned cnt = 0;
>
> "cnt" is unsigned here, as it is in the caller. Should the return value
> match?
Yes, obviously. :/
Thanks for catching,
Stefan
^ permalink raw reply
* Re: [PATCH v5 0/5] road to reentrant real_path
From: Junio C Hamano @ 2017-01-09 19:26 UTC (permalink / raw)
To: Brandon Williams
Cc: git, sbeller, peff, jacob.keller, ramsay, tboegi, j6t, pclouds,
larsxschneider
In-Reply-To: <20170109182433.GC62878@google.com>
Brandon Williams <bmwill@google.com> writes:
> On 01/09, Junio C Hamano wrote:
>> Brandon Williams <bmwill@google.com> writes:
>>
>> >> How does this relate to the 5-patch real_path: series that has been
>> >> on 'next' since last year?
>> >
>> > The only difference should be in the first patch of the series which
>> > handles the #define a bit differently due to the discussion that
>> > happened last week.
>> >
>> > Here is the interdiff between v5 and 'origin/bw/realpath-wo-chdir':
>>
>> Then can you make that an incremental patch (or two) with its own
>> log message instead? It (or they) would look and smell like a
>> bugfix patch that follows up a change that has already landed. As
>> you know, we won't eject and replace patches that are already in
>> 'next' during a development cycle.
>>
>> Thanks.
>
> Yes I'll get right on that.
Thanks. Will queue.
^ permalink raw reply
* Re: [PATCH v4 1/4] Avoid Coverity warning about unfree()d git_exec_path()
From: Stefan Beller @ 2017-01-09 19:21 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin, git@vger.kernel.org, David Aguilar,
Dennis Kaarsemaker, Paul Sbarra
In-Reply-To: <xmqqy3ylyqhf.fsf@gitster.mtv.corp.google.com>
On Sun, Jan 8, 2017 at 5:25 PM, Junio C Hamano <gitster@pobox.com> wrote:
> So with the above, are you saying "Dscho said 'hopefully', and I
> confirm that this change does squelch misdiagnosis by Coverity"?
I could not find the coverity issue any more.
(It really misses easy access to "recently fixed" problems)
> commit c9bb5d101ca657fa466afa8c4368c43ea7b7aca8
> Author: Johannes Schindelin <johannes.schindelin@gmx.de>
> Date: Mon Jan 2 17:22:33 2017 +0100
>
> git_exec_path: avoid Coverity warning about unfree()d result
>
> Technically, it is correct that git_exec_path() returns a possibly
> malloc()ed string returned from system_path(), and it is sometimes
> not allocated. Cache the result in a static variable and make sure
> that we call system_path() only once, which plugs a potential leak.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sounds good to me,
Thanks,
Stefan
^ permalink raw reply
* Re: [PATCH v3 13/13] mergetool: fix running in subdir when rerere enabled
From: Junio C Hamano @ 2017-01-09 19:05 UTC (permalink / raw)
To: Richard Hansen; +Cc: git, davvid, j6t, sbeller, simon
In-Reply-To: <xmqq4m18ump1.fsf@gitster.mtv.corp.google.com>
Junio C Hamano <gitster@pobox.com> writes:
> I wonder if it makes more sense to always move to toplevel upfront
> and consistently use path from the toplevel, perhaps like the patch
s/the patch/the attached patch/ I meant.
> does. The first hunk is what you wrote but only inside MERGE_RR
> block, and the second hunk deals with converting end-user supplied
> paths that are relative to the original relative to the top-level.
>
> The tweaking of $orderfile you have in the first hunk may have to be
> tightened mimicking the way how "eval ... --sq ... ; shift" is used
> in the second hunk to avoid confusion in case orderfile specified by
> the end user happens to be the same as a valid revname
> (e.g. "master").
And here is a squash-able patch to illustrate what I mean.
I removed both of the comment blocks as the code always works with
the worktree-relative pathname after this patch while adjusting
end-user supplied paths from relative to original cwd. As that is
how the core parts of the system (including the parts written in C)
work, even though an explanation you did in the log message is
needed to explain why the change was needed and what the change
intended to do to readers of "git log", it is not necessary to
explain it to the readers of the latest code, which is what the
in-code comment is about.
The single-liner addition to the test creates a branch whose name is
the same as the specified orderfile to deliberately create a
confusing situation. I haven't tried, but I am fairly sure that the
test will demonstrate how broken the orderfile=$(...) in the
original is, if you apply the test part of the attached patch,
without the changes to git-mergetool.sh, to your version.
diff --git a/git-mergetool.sh b/git-mergetool.sh
index 22f56c25a2..21f82d5b58 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -454,53 +454,34 @@ main () {
merge_keep_backup="$(git config --bool mergetool.keepBackup || echo true)"
merge_keep_temporaries="$(git config --bool mergetool.keepTemporaries || echo false)"
- if test $# -eq 0 && test -e "$GIT_DIR/MERGE_RR"
+ prefix=$(git rev-parse --show-prefix) || exit 1
+ cd_to_toplevel
+
+ if test -n "$orderfile"
then
- # The pathnames output by the 'git rerere remaining'
- # command below are relative to the top-level
- # directory but the 'git diff --name-only' command
- # further below expects the pathnames to be relative
- # to the current working directory. Thus, we cd to
- # the top-level directory before running 'git diff
- # --name-only'. We change directories even earlier
- # (before running 'git rerere remaining') in case 'git
- # rerere remaining' is ever changed to output
- # pathnames relative to the current working directory.
- #
- # Changing directories breaks a relative $orderfile
- # pathname argument, so fix it up to be relative to
- # the top-level directory.
-
- prefix=$(git rev-parse --show-prefix) || exit 1
- cd_to_toplevel
- if test -n "$orderfile"
- then
- orderfile=$(git rev-parse --prefix "$prefix" "$orderfile") || exit 1
- fi
+ orderfile=$(
+ git rev-parse --prefix "$prefix" -- "$orderfile" |
+ sed -e 1d
+ )
+ fi
+ if test $# -eq 0 && test -e "$GIT_DIR/MERGE_RR"
+ then
set -- $(git rerere remaining)
if test $# -eq 0
then
print_noop_and_exit
fi
+ elif test $# -ge 0
+ then
+ eval "set -- $(git rev-parse --sq --prefix "$prefix" -- "$@")"
+ shift
fi
- # Note: The pathnames output by 'git diff --name-only' are
- # relative to the top-level directory, but it expects input
- # pathnames to be relative to the current working directory.
- # Thus:
- # * Either cd_to_toplevel must not be run before this or all
- # relative input pathnames must be converted to be
- # relative to the top-level directory (or absolute).
- # * Either cd_to_toplevel must be run after this or all
- # relative output pathnames must be converted to be
- # relative to the current working directory (or absolute).
files=$(git -c core.quotePath=false \
diff --name-only --diff-filter=U \
${orderfile:+"-O$orderfile"} -- "$@")
- cd_to_toplevel
-
if test -z "$files"
then
print_noop_and_exit
diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index dfd641d34b..180dd7057a 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -678,6 +678,11 @@ test_expect_success 'diff.orderFile configuration is honored' '
b
a
EOF
+
+ # make sure "order-file" that is ambiguous between
+ # rev and path is understood correctly.
+ git branch order-file HEAD &&
+
git mergetool --no-prompt --tool myecho >output &&
git grep --no-index -h -A2 Merging: output >actual &&
test_cmp expect actual
^ permalink raw reply related
* Re: Rebasing a branch with merges
From: Robert Dailey @ 2017-01-09 18:52 UTC (permalink / raw)
To: Philip Oakley; +Cc: Git, Johannes Schindelin
In-Reply-To: <49C0981FE7D04AE2AC0BBB011FD74B90@PhilipOakley>
On Fri, Jan 6, 2017 at 3:28 PM, Philip Oakley <philipoakley@iee.org> wrote:
> From: "Robert Dailey" <rcdailey.lists@gmail.com>
>>
>> Here's the scenario:
>>
>> I create a topic branch so one other developer and myself can work on
>> a feature that takes 2 weeks to complete. During that 2 week period,
>> changes are occurring on master that I need in my topic branch. Since
>> I have a collaborator on the branch, I opt for merges instead of
>> rebase.
>>
>> Each day I merge from master to the topic branch, which changes code
>> I'm actively working in and requires semantic changes (functions
>> renamed, moved, etc).
>>
>> Once I'm ready to merge the topic branch back into master, I have two
>> options (bearing in mind the goal is to keep history as clean as
>> possible. Furthermore this implies that the constant merging into
>> topic from master has made the topic branch look unwieldy and
>> difficult to audit):
>
>
> a broader question zero;
> 0. Is the merge always clean? Do you always do a preparatory fixup! to
> ensure that the merge will be clean?
>
> Ensuring that the merge will be clean should greatly simplify your decision
> about process.
I don't understand what you're asking. How would I do a fixup with
merges? Can you explain a bit? Normally the only time I use fixup! or
squash! is for local changes prior to pushing.
>> 1. Do a squash merge, which keeps history clean but we lose context
>> for the important bits (the commits representing units of work that
>> contribute to the topic itself).
>>
>> 2. Do a final rebase prior to merging.
>>
>> #2 doesn't seem to be possible due to patch ordering. For example, if
>> I have real commits after merge commits that depend on those changes
>> from master being present as a base at that point in time, the rebase
>> will cause the patch before it to no longer include those changes from
>> master.
>
>
> How much of the historic fixups to cover changes on master do you want to
> keep visible? i.e. how many fork-points are truly needed (a. by you, b. by
> the project - personal knowledge vs corporate knowledge).?
Again, I do not understand. Maybe the first question you asked needs
to be understood before I can answer this one. Sorry for the trouble.
^ permalink raw reply
* [PATCH 2/2] real_path: set errno when max number of symlinks is exceeded
From: Brandon Williams @ 2017-01-09 18:50 UTC (permalink / raw)
To: git; +Cc: Brandon Williams, peff, gitster, larsxschneider
In-Reply-To: <20170109185024.73006-1-bmwill@google.com>
Set errno to ELOOP when the maximum number of symlinks is exceeded, as
would be done by other symlink-resolving functions.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
abspath.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/abspath.c b/abspath.c
index 0393213e5..fce40fddc 100644
--- a/abspath.c
+++ b/abspath.c
@@ -141,6 +141,8 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
strbuf_reset(&symlink);
if (num_symlinks++ > MAXSYMLINKS) {
+ errno = ELOOP;
+
if (die_on_error)
die("More than %d nested symlinks "
"on path '%s'", MAXSYMLINKS, path);
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply related
* [PATCH 1/2] real_path: prevent redefinition of MAXSYMLINKS
From: Brandon Williams @ 2017-01-09 18:50 UTC (permalink / raw)
To: git; +Cc: Brandon Williams, peff, gitster, larsxschneider
In-Reply-To: <xmqqzij0t7uh.fsf@gitster.mtv.corp.google.com>
The macro 'MAXSYMLINKS' is already defined on macOS and Linux in
<sys/param.h>. If 'MAXSYMLINKS' has already been defined, use the value
defined by the OS otherwise default to a value of 32 which is more
inline with what is allowed by many systems.
Signed-off-by: Brandon Williams <bmwill@google.com>
---
abspath.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/abspath.c b/abspath.c
index 1d56f5ed9..0393213e5 100644
--- a/abspath.c
+++ b/abspath.c
@@ -62,7 +62,9 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
}
/* We allow "recursive" symbolic links. Only within reason, though. */
-#define MAXSYMLINKS 5
+#ifndef MAXSYMLINKS
+#define MAXSYMLINKS 32
+#endif
/*
* Return the real path (i.e., absolute path, with symlinks resolved
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply related
* Re: [PATCH v3 00/13] fix mergetool+rerere+subdir regression
From: Stefan Beller @ 2017-01-09 18:49 UTC (permalink / raw)
To: Richard Hansen
Cc: git@vger.kernel.org, David Aguilar, Johannes Sixt, Simon Ruderich
In-Reply-To: <20170109054238.42599-1-hansenr@google.com>
On Sun, Jan 8, 2017 at 9:42 PM, Richard Hansen <hansenr@google.com> wrote:
> If rerere is enabled, no pathnames are given, and mergetool is run
> from a subdirectory, mergetool always prints "No files need merging".
> Fix the bug.
>
> This regression was introduced in
> 57937f70a09c12ef484c290865dac4066d207c9c (v2.11.0).
>
> Changes since v2:
> * Added entries to .mailmap.
> * Patch 2/4 was split into multiple separate commits.
> * Moved '-O' out of $orderfile.
> * $orderfile is now adjusted after running cd_to_toplevel.
> * Added lots of comments.
> * Other minor tweaks to address review feedback.
>
The changes to tests look all good to me.
Thanks,
Stefan
^ permalink raw reply
* Re: [PATCH v5 0/5] road to reentrant real_path
From: Brandon Williams @ 2017-01-09 18:24 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, sbeller, peff, jacob.keller, ramsay, tboegi, j6t, pclouds,
larsxschneider
In-Reply-To: <xmqqzij0t7uh.fsf@gitster.mtv.corp.google.com>
On 01/09, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
>
> >> How does this relate to the 5-patch real_path: series that has been
> >> on 'next' since last year?
> >
> > The only difference should be in the first patch of the series which
> > handles the #define a bit differently due to the discussion that
> > happened last week.
> >
> > Here is the interdiff between v5 and 'origin/bw/realpath-wo-chdir':
>
> Then can you make that an incremental patch (or two) with its own
> log message instead? It (or they) would look and smell like a
> bugfix patch that follows up a change that has already landed. As
> you know, we won't eject and replace patches that are already in
> 'next' during a development cycle.
>
> Thanks.
Yes I'll get right on that.
--
Brandon Williams
^ permalink raw reply
* Re: [PATCH v5 0/5] road to reentrant real_path
From: Junio C Hamano @ 2017-01-09 18:18 UTC (permalink / raw)
To: Brandon Williams
Cc: git, sbeller, peff, jacob.keller, ramsay, tboegi, j6t, pclouds,
larsxschneider
In-Reply-To: <20170109180418.GB62878@google.com>
Brandon Williams <bmwill@google.com> writes:
>> How does this relate to the 5-patch real_path: series that has been
>> on 'next' since last year?
>
> The only difference should be in the first patch of the series which
> handles the #define a bit differently due to the discussion that
> happened last week.
>
> Here is the interdiff between v5 and 'origin/bw/realpath-wo-chdir':
Then can you make that an incremental patch (or two) with its own
log message instead? It (or they) would look and smell like a
bugfix patch that follows up a change that has already landed. As
you know, we won't eject and replace patches that are already in
'next' during a development cycle.
Thanks.
>
> diff --git a/abspath.c b/abspath.c
> index 1d56f5ed9..fce40fddc 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -62,7 +62,9 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
> }
>
> /* We allow "recursive" symbolic links. Only within reason, though. */
> -#define MAXSYMLINKS 5
> +#ifndef MAXSYMLINKS
> +#define MAXSYMLINKS 32
> +#endif
>
> /*
> * Return the real path (i.e., absolute path, with symlinks resolved
> @@ -139,6 +141,8 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
> strbuf_reset(&symlink);
>
> if (num_symlinks++ > MAXSYMLINKS) {
> + errno = ELOOP;
> +
> if (die_on_error)
> die("More than %d nested symlinks "
> "on path '%s'", MAXSYMLINKS, path);
^ permalink raw reply
* Re: [PATCH v3 13/13] mergetool: fix running in subdir when rerere enabled
From: Junio C Hamano @ 2017-01-09 18:12 UTC (permalink / raw)
To: Richard Hansen; +Cc: git, davvid, j6t, sbeller, simon
In-Reply-To: <20170109054238.42599-14-hansenr@google.com>
Richard Hansen <hansenr@google.com> writes:
> If rerere is enabled and no pathnames are given, run cd_to_toplevel
> before running 'git diff --name-only' so that 'git diff --name-only'
> sees the files named by 'git rerere remaining', which outputs
> pathnames relative to the top-level directory.
>
> The cd_to_toplevel command could be run after 'git rerere remaining',
> but it is run before just in case 'git rerere remaining' is ever
> changed to print pathnames relative to the current working directory
> rather than relative to the top-level directory.
>
> An alternative approach would be to unconditionally convert all
> relative pathnames (including the orderfile pathname) to be relative
> to the top-level directory and then run cd_to_toplevel before 'git
> diff --name-only', but unfortunately 'git rev-parse --prefix' requires
> valid pathnames, which would break some valid use cases.
>
> This fixes a regression introduced in
> 57937f70a09c12ef484c290865dac4066d207c9c (v2.11.0).
>
> Signed-off-by: Richard Hansen <hansenr@google.com>
> ---
> git-mergetool.sh | 32 ++++++++++++++++++++++++++++++++
> t/t7610-mergetool.sh | 2 +-
> 2 files changed, 33 insertions(+), 1 deletion(-)
>
> diff --git a/git-mergetool.sh b/git-mergetool.sh
> index b506896dc..22f56c25a 100755
> --- a/git-mergetool.sh
> +++ b/git-mergetool.sh
> @@ -456,6 +456,28 @@ main () {
>
While doing the extra cd_to_toplevel added by this patch may not
break anything that comes after the existing cd_to_toplevel, I find
the result of applying this patch unnecessarily confusing. As "if
the user didn't give any pathnames from the command line, ask rerere
what paths it thinks are necessary to be handled" is merely a
laziness fallback, it feels conceptually wrong to use different
invocations of -O$orderfile when rerere is and is not in effect.
I wonder if it makes more sense to always move to toplevel upfront
and consistently use path from the toplevel, perhaps like the patch
does. The first hunk is what you wrote but only inside MERGE_RR
block, and the second hunk deals with converting end-user supplied
paths that are relative to the original relative to the top-level.
The tweaking of $orderfile you have in the first hunk may have to be
tightened mimicking the way how "eval ... --sq ... ; shift" is used
in the second hunk to avoid confusion in case orderfile specified by
the end user happens to be the same as a valid revname
(e.g. "master").
diff --git a/git-mergetool.sh b/git-mergetool.sh
index b506896dc1..adbbeceb47 100755
--- a/git-mergetool.sh
+++ b/git-mergetool.sh
@@ -454,6 +454,14 @@ main () {
merge_keep_backup="$(git config --bool mergetool.keepBackup || echo true)"
merge_keep_temporaries="$(git config --bool mergetool.keepTemporaries || echo false)"
+ prefix=$(git rev-parse --show-prefix) || exit 1
+ cd_to_toplevel
+
+ if test -n "$orderfile"
+ then
+ orderfile=$(git rev-parse --prefix "$prefix" "$orderfile") || exit 1
+ fi
+
if test $# -eq 0 && test -e "$GIT_DIR/MERGE_RR"
then
set -- $(git rerere remaining)
@@ -461,14 +469,16 @@ main () {
then
print_noop_and_exit
fi
+ elif test $# -ge 0
+ then
+ eval "set -- $(git rev-parse --sq --prefix "$prefix" -- "$@")"
+ shift
fi
files=$(git -c core.quotePath=false \
diff --name-only --diff-filter=U \
${orderfile:+"-O$orderfile"} -- "$@")
- cd_to_toplevel
-
if test -z "$files"
then
print_noop_and_exit
^ permalink raw reply related
* Re: [PATCH v5 0/5] road to reentrant real_path
From: Brandon Williams @ 2017-01-09 18:04 UTC (permalink / raw)
To: Junio C Hamano
Cc: git, sbeller, peff, jacob.keller, ramsay, tboegi, j6t, pclouds,
larsxschneider
In-Reply-To: <xmqqpojy1c49.fsf@gitster.mtv.corp.google.com>
On 01/07, Junio C Hamano wrote:
> Brandon Williams <bmwill@google.com> writes:
>
> > changes in v5:
> > * set errno to ELOOP when MAXSYMLINKS is exceded.
> > * revert to use MAXSYMLINKS instead of MAXDEPTH.
> > * If the OS hasn't defined MAXSYMLINKS, use a fallback value of 32.
> >
> > Brandon Williams (4):
> > real_path: resolve symlinks by hand
> > real_path: convert real_path_internal to strbuf_realpath
> > real_path: create real_pathdup
> > real_path: have callers use real_pathdup and strbuf_realpath
> >
> > Johannes Sixt (1):
> > real_path: canonicalize directory separators in root parts
> >
>
> How does this relate to the 5-patch real_path: series that has been
> on 'next' since last year?
The only difference should be in the first patch of the series which
handles the #define a bit differently due to the discussion that
happened last week.
Here is the interdiff between v5 and 'origin/bw/realpath-wo-chdir':
diff --git a/abspath.c b/abspath.c
index 1d56f5ed9..fce40fddc 100644
--- a/abspath.c
+++ b/abspath.c
@@ -62,7 +62,9 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
}
/* We allow "recursive" symbolic links. Only within reason, though. */
-#define MAXSYMLINKS 5
+#ifndef MAXSYMLINKS
+#define MAXSYMLINKS 32
+#endif
/*
* Return the real path (i.e., absolute path, with symlinks resolved
@@ -139,6 +141,8 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
strbuf_reset(&symlink);
if (num_symlinks++ > MAXSYMLINKS) {
+ errno = ELOOP;
+
if (die_on_error)
die("More than %d nested symlinks "
"on path '%s'", MAXSYMLINKS, path);
--
Brandon Williams
^ permalink raw reply related
* Re: [PATCH v5 00/16] pathspec cleanup
From: Brandon Williams @ 2017-01-09 17:57 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Git Mailing List, Stefan Beller, Junio C Hamano
In-Reply-To: <CACsJy8CM6FfHpVuqby=hjPmiYAxvJjzr1W6LdO5B82KQnTmmog@mail.gmail.com>
On 01/07, Duy Nguyen wrote:
> On Thu, Jan 5, 2017 at 1:03 AM, Brandon Williams <bmwill@google.com> wrote:
> > Changes in v5:
> > * Move GUARD_PATHSPEC to prevent checking if pathspec is null twice.
> > * Mark a string containing 'mnemonic' for translation.
>
> Argh.. I've run out of things to complain about! Ack!
haha! Well thanks again for your help in reviewing this series!
--
Brandon Williams
^ permalink raw reply
* Re: [PATCH 0/6] git worktree move/remove
From: Junio C Hamano @ 2017-01-09 17:34 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git
In-Reply-To: <20170108094003.637-1-pclouds@gmail.com>
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> This version is the same as nd/worktree-move but with the recursive
> directory copy code removed. We rely on rename() to move directories.
Much simpler ;-)
Will replace; thanks.
> Nguyễn Thái Ngọc Duy (6):
> worktree.c: add validate_worktree()
> worktree.c: add update_worktree_location()
> worktree move: new command
> worktree move: accept destination as directory
> worktree move: refuse to move worktrees with submodules
> worktree remove: new command
>
> Documentation/git-worktree.txt | 28 ++++--
> builtin/worktree.c | 162 +++++++++++++++++++++++++++++++++
> contrib/completion/git-completion.bash | 5 +-
> t/t2028-worktree-move.sh | 56 ++++++++++++
> worktree.c | 84 +++++++++++++++++
> worktree.h | 11 +++
> 6 files changed, 335 insertions(+), 11 deletions(-)
^ permalink raw reply
* Re: [PATCH v4] log --graph: customize the graph lines with config log.graphColors
From: Junio C Hamano @ 2017-01-09 17:29 UTC (permalink / raw)
To: Nguyễn Thái Ngọc Duy; +Cc: git, Jeff King
In-Reply-To: <20170109103258.25341-1-pclouds@gmail.com>
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> + end = string + strlen(string);
> + while (start < end) {
> + const char *comma = strchrnul(start, ',');
> + char color[COLOR_MAXLEN];
> +
> + while (start < comma && isspace(*start))
> + start++;
> + if (start == comma) {
> + start = comma + 1;
> + continue;
> + }
> +
> + if (!color_parse_mem(start, comma - start, color))
So you skip the leading blanks but let color_parse_mem() trim the
trailing blanks? It would work once the control reaches the loop,
but wouldn't that miss
git -c log.graphColors=' reset , blue, red' log --graph
as "reset" is not understood by parse_color() and is understood by
color_parse_mem() only when the length is given correctly?
I am undecided, but leaning towards declaring that it is a bug in
color_parse_mem() not in this caller.
> + argv_array_push(&colors, color);
> + else
> + warning(_("ignore invalid color '%.*s' in log.graphColors"),
> + (int)(comma - start), start);
> + start = comma + 1;
> + }
> + free(string);
> + argv_array_push(&colors, GIT_COLOR_RESET);
> + /* graph_set_column_colors takes a max-index, not a count */
> + graph_set_column_colors(colors.argv, colors.argc - 1);
> +}
> +
> void graph_set_column_colors(const char **colors, unsigned short colors_max)
> {
> column_colors = colors;
> @@ -208,8 +248,7 @@ struct git_graph *graph_init(struct rev_info *opt)
> struct git_graph *graph = xmalloc(sizeof(struct git_graph));
>
> if (!column_colors)
> - graph_set_column_colors(column_colors_ansi,
> - column_colors_ansi_max);
> + read_graph_colors_config();
Now that the helper is renamed to be about "reading the
configuration to figure out the graph colors", the division of labor
between the caller and the callee we see here is suboptimal for
readability, I would think.
We would want to see the caller to either be
if (!column_colors) {
if (read_graph_colors_config())
; /* ok */
else
graph_set_column_colors(ansi, ansi_max);
}
or better yet, something like:
if (!column_colors) {
const char **colors = column_colors_ansi;
int colors_max = column_colors_ansi_max;
read_graph_colors_config(&colors, &colors_max);
graph_set_collumn_colors(colors, colors_max);
}
The last one would make it clear that by default we use ansi but
override that default by calling that function whose purpose is to
read the values that override the default from the configuration.
I haven't thought things thru regarding memory leakages etc., so
there may be valid reasons why the last one is infeasible.
^ permalink raw reply
* Re: Preserve/Prune Old Pack Files
From: Martin Fick @ 2017-01-09 16:20 UTC (permalink / raw)
To: Jeff King; +Cc: Mike Hommey, git
In-Reply-To: <20170109105545.gchfklcuzwhlfbtf@sigill.intra.peff.net>
On Monday, January 09, 2017 05:55:45 AM Jeff King wrote:
> On Mon, Jan 09, 2017 at 04:01:19PM +0900, Mike Hommey
wrote:
> > > That's _way_ more complicated than your problem, and
> > > as I said, I do not have a finished solution. But it
> > > seems like they touch on a similar concept (a
> > > post-delete holding area for objects). So I thought
> > > I'd mention it in case if spurs any brilliance.
> >
> > Something that is kind-of in the same family of problems
> > is the "loosening" or objects on repacks, before they
> > can be pruned.
...
> Yes, this can be a problem. The repack is smart enough not
> to write out objects which would just get pruned
> immediately, but since the grace period is 2 weeks, that
> can include a lot of objects (especially with history
> rewriting as you note). It would be possible to write
> those loose objects to a "cruft" pack, but there are some
> management issues around the cruft pack. You do not want
> to keep repacking them into a new cruft pack at each
> repack, since then they would never expire. So you need
> some way of marking the pack as cruft, letting it age
> out, and then deleting it after the grace period expires.
>
> I don't think it would be _that_ hard, but AFAIK nobody
> has ever made patches.
FYI, jgit does this,
-Martin
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation
^ permalink raw reply
* Re: Preserve/Prune Old Pack Files
From: Martin Fick @ 2017-01-09 16:17 UTC (permalink / raw)
To: Jeff King; +Cc: repo-discuss, jmelvin, jgit-dev, git
In-Reply-To: <20170109062137.zghmurndlbts5x44@sigill.intra.peff.net>
On Monday, January 09, 2017 01:21:37 AM Jeff King wrote:
> On Wed, Jan 04, 2017 at 09:11:55AM -0700, Martin Fick
wrote:
> > I am replying to this email across lists because I
> > wanted to highlight to the git community this jgit
> > change to repacking that we have up for review
> >
> > https://git.eclipse.org/r/#/c/87969/
> >
> > This change introduces a new convention for how to
> > preserve old pack files in a staging area
> > (.git/objects/packs/preserved) before deleting them. I
> > wanted to ensure that the new proposed convention would
> > be done in a way that would be satisfactory to the git
> > community as a whole so that it would be more easy to
> > provide the same behavior in git eventually. The
> > preserved pack files (and accompanying index and bitmap
> > files), are not only moved, but they are also renamed
> > so that they no longer will match recursive finds
> > looking for pack files.
> It looks like objects/pack/pack-123.pack becomes
> objects/pack/preserved/pack-123.old-pack,
Yes, that's the idea.
> and so forth. Which seems reasonable, and I'm happy that:
>
> find objects/pack -name '*.pack'
>
> would not find it. :)
Cool.
> I suspect the name-change will break a few tools that you
> might want to use to look at a preserved pack (like
> verify-pack). I know that's not your primary use case,
> but it seems plausible that somebody may one day want to
> use a preserved pack to try to recover from corruption. I
> think "git index-pack --stdin
> <objects/packs/preserved/pack-123.old-pack" could always
> be a last-resort for re-admitting the objects to the
> repository.
or even a simple manual rename/move back to its orginal
place?
> I notice this doesn't do anything for loose objects. I
> think they technically suffer the same issue, though the
> race window is much shorter (we mmap them and zlib
> inflate immediately, whereas packfiles may stay mapped
> across many object requests).
Hmm, yeah that's the next change, didn't you see it? :) No,
actually I forgot about those. Our server tends to not have
too many of those (loose objects), and I don't think we have
seen any exceptions yet for them. But, of course, you are
right, they should get fixed too. I will work on a followup
change to do that.
Where would you suggest we store those? Maybe under
".git/objects/preserved/<xx>/<sha1>"? Do they need to be
renamed also somehow to avoid a find?
...
> I've wondered if we could make object pruning more atomic
> by speculatively moving items to be deleted into some
> kind of "outgoing" object area.
...
> I don't have a solution here. I don't think we want to
> solve it by locking the repository for updates during a
> repack. I have a vague sense that a solution could be
> crafted around moving the old pack into a holding area
> instead of deleting (during which time nobody else would
> see the objects, and thus not reference them), while the
> repacking process checks to see if the actual deletion
> would break any references (and rolls back the deletion
> if it would).
>
> That's _way_ more complicated than your problem, and as I
> said, I do not have a finished solution. But it seems
> like they touch on a similar concept (a post-delete
> holding area for objects). So I thought I'd mention it in
> case if spurs any brilliance.
I agree, this is a problem I have wanted to solve also. I
think having a "preserved" directory does open the door to
such "recovery" solutions, although I think you would
actually want to modify the many read code paths to fall
back to looking at the preserved area and performing
immediate "recovery" of the pack file if it ends up being
needed. That's a lot of work, but having the packs (and
eventually the loose objects) preserved into a location
where no new references will be built to depend on them is
likely the first step. Does the name "preserved" do well for
that use case also, or would there be some better name, what
would a transactional system call them?
Thanks for the review Peff!
-Martin
--
The Qualcomm Innovation Center, Inc. is a member of Code
Aurora Forum, hosted by The Linux Foundation
^ permalink raw reply
* Re: Refreshing index timestamps without reading content
From: Quentin Casasnovas @ 2017-01-09 15:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Duy Nguyen, Quentin Casasnovas, Git Mailing List
In-Reply-To: <xmqqr34cuvjj.fsf@gitster.mtv.corp.google.com>
[-- Attachment #1: Type: text/plain, Size: 6095 bytes --]
On Mon, Jan 09, 2017 at 07:01:36AM -0800, Junio C Hamano wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
> > On Thu, Jan 5, 2017 at 6:23 PM, Quentin Casasnovas
> > <quentin.casasnovas@oracle.com> wrote:
> >> Is there any way to tell git, after the git ls-tree command above, to
> >> refresh its stat cache information and trust us that the file content has
> >> not changed, as to avoid any useless file read (though it will obviously
> >> will have to stat all of them, but that's not something we can really
> >> avoid)
> >
> > I don't think there's any way to do that, unfortunately.
>
> Lose "unfortunately".
>
> >> If not, I am willing to implement a --assume-content-unchanged to the git
> >> update-index if you guys don't see something fundamentally wrong with this
> >> approach.
> >
> > If you do that, I think you should go with either of the following options
> >
> > - Extend git-update-index --index-info to take stat info as well (or
> > maybe make a new option instead). Then you can feed stat info directly
> > to git without a use-case-specific "assume-content-unchanged".
> >
> > - Add "git update-index --touch" that does what "touch" does. In this
> > case, it blindly updates stat info to latest. But like touch, we can
> > also specify mtime from command line if we need to. It's a bit less
> > generic than the above option, but easier to use.
>
> Even if we assume that it is a good idea to let people muck with the
> index like this, either of the above would be a usable addition,
> because the cached stat information does not consist solely of
> mtime.
>
> "git update-index --index-info" was invented for the case where a
> user or a script _knows_ the object ID of the blob that _would_
> result if a contents of a file on the filesystem were run through
> hash-object. So from the interface's point of view, it may make
> sense to teach it to take an extra/optional argument that is the
> path to the file and take the stat info out of the named file when
> the extra/optional argument was given.
>
> But that assumes that it is a good idea to do this in the first
> place. It was deliberate design decision that setting the cached
> stat info for the entry was protected behind actual content
> comparison, and removing that protection will open the index to
> abuse.
>
Hi Junio,
Thanks for your feedback, appreciated :)
I do understand how it would be possible for someone to shoot themselves in
the feet with such option, but it solves real life use cases and improved
build times very signficantly here.
Another use case we have is setting up very lightweight linux work trees,
by reflinking from a base work-tree. This allows for a completely
different work-tree taking up almost no size at first, whereas using a
shared clone or the recent worktree subcommand would "waste" ~500MB*:
# linux-2.6 is a shared clone of a bare clone residing locally
~ $ cp --reflink -a linux-2.6 linux-2.6-reflinked
# At this point, the mtime inside linux-2.6-reflinked are matching the
# mtime of the source linux-2.6 (since we used the '-a' option of 'cp)
~ $ diff -u <(stat linux-2.6/README) <(stat linux-2.6-reflinked/README)
--- /proc/self/fd/11 2017-01-09 16:34:04.523438942 +0100
+++ /proc/self/fd/12 2017-01-09 16:34:04.523438942 +0100
@@ -1,8 +1,8 @@
- File: 'linux-2.6/README'
+ File: 'linux-2.6-reflinked/README'
Size: 18372 Blocks: 40 IO Block: 4096 regular file
-Device: fd00h/64768d Inode: 268467090 Links: 1
+Device: fd00h/64768d Inode: 805970606 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 1000/ quentin) Gid: ( 1000/ quentin)
Access: 2017-01-09 12:04:15.317758718 +0100
Modify: 2017-01-09 12:04:12.566758772 +0100
-Change: 2017-01-09 12:04:12.566758772 +0100
+Change: 2017-01-09 16:29:48.305444003 +0100
Birth:
# Now let's check how long it takes to refresh the index from the source
# and destination..
~/linux-2.6 $ time git update-index --refresh
git update-index --refresh 0.04s user 0.08s system 204% cpu 0.058 total
~~~~~~~~~~~
~/linux-2.6-reflinked $ time git update-index --refresh
git update-index --refresh 2.40s user 1.43s system 38% cpu 10.003 total
~~~~~~~~~~~~
This is quite a high penalty when a power user knows that his lightweight
copy worktree matches the index! That's on a single worktree on a fairly
decent SSD but our build farm would do this with hundreds of work trees in
parallel, all residing on a spinning disks - the penalty would be much
worse than 10 seconds wasted.
I might have a look at adding reflinking awareness to the worktree
subcommand (if it hasn't been implemented already) to avoid this index
force refreshing but that still would not fix our other use case where we
knowingly change the mtime of the files on our worktrees.
* That's a rather low estimate:
git ls-tree -zr --name-only v4.9 | du -csh --files0-from=- | tail -n1
747M total
> The userbase of Git has grown wide enough that it is harder to say "If
> you lie that a file whose contents does not match the index is up to date
> using this mechanism, you will lose data and all bad things happen---you
> can keep both halves". Once we release a version of Git with such a
> "feature", the first bug report will be "I did not want to run
> 'update-index --refresh' because it takes time, and some index entries
> apparently did not match what is on the filesystem, and I got a corrupt
> working file after a merge. Git should make sure that the contents match
> when using the new 'path to the file' argument when updating the cached
> stat info!". I do not have a good answer to such a bug report.
>
The good answer would probably be that this is intended behaviour and would
explicitly be mentionned in the manual for that specific option? :)
How about renaming the option (which is more of a sub-command) to
--i-know-this-can-eat-my-data-but-refresh-stat-only? Then we can print a
big warning saying this is generally a VERY bad idea?
Q
[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]
^ permalink raw reply
* Re: What's cooking in git.git (preview)
From: Junio C Hamano @ 2017-01-09 15:05 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
In-Reply-To: <alpine.DEB.2.20.1701091228460.3469@virtualbox>
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> Hi Junio,
>
> On Sun, 8 Jan 2017, Junio C Hamano wrote:
>
>> * js/difftool-builtin (2017-01-08) 4 commits
>> - t7800: run both builtin and scripted difftool, for now
>> - difftool: implement the functionality in the builtin
>> - difftool: add a skeleton for the upcoming builtin
>> - git_exec_path: avoid Coverity warning about unfree()d result
>>
>> Rewrite a scripted porcelain "git difftool" in C.
>>
>> What's the doneness of this topic?
>
> There was only one bug report since the first iteration, and it has been
> fixed as of v4.
This is more RFH than FYI, but some topics in flight add more tests
to t7800 and with everything merged, the test breaks at the tip of
'pu' as of tonight (eh, technically last night, as I ended up not
sleeping due to jetlag), even though each individual topic passes
it. It _might_ indicate a problem with this topic, or it may merely
be a biproduct of a mismerge.
^ permalink raw reply
* Re: Refreshing index timestamps without reading content
From: Junio C Hamano @ 2017-01-09 15:01 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Quentin Casasnovas, Git Mailing List
In-Reply-To: <CACsJy8BRfJG6L49VyC+qsrQ9Arz0gCGpMATpK9uLq61Lx6_Jtg@mail.gmail.com>
Duy Nguyen <pclouds@gmail.com> writes:
> On Thu, Jan 5, 2017 at 6:23 PM, Quentin Casasnovas
> <quentin.casasnovas@oracle.com> wrote:
>> Is there any way to tell git, after the git ls-tree command above, to
>> refresh its stat cache information and trust us that the file content has
>> not changed, as to avoid any useless file read (though it will obviously
>> will have to stat all of them, but that's not something we can really
>> avoid)
>
> I don't think there's any way to do that, unfortunately.
Lose "unfortunately".
>> If not, I am willing to implement a --assume-content-unchanged to the git
>> update-index if you guys don't see something fundamentally wrong with this
>> approach.
>
> If you do that, I think you should go with either of the following options
>
> - Extend git-update-index --index-info to take stat info as well (or
> maybe make a new option instead). Then you can feed stat info directly
> to git without a use-case-specific "assume-content-unchanged".
>
> - Add "git update-index --touch" that does what "touch" does. In this
> case, it blindly updates stat info to latest. But like touch, we can
> also specify mtime from command line if we need to. It's a bit less
> generic than the above option, but easier to use.
Even if we assume that it is a good idea to let people muck with the
index like this, either of the above would be a usable addition,
because the cached stat information does not consist solely of
mtime.
"git update-index --index-info" was invented for the case where a
user or a script _knows_ the object ID of the blob that _would_
result if a contents of a file on the filesystem were run through
hash-object. So from the interface's point of view, it may make
sense to teach it to take an extra/optional argument that is the
path to the file and take the stat info out of the named file when
the extra/optional argument was given.
But that assumes that it is a good idea to do this in the first
place. It was deliberate design decision that setting the cached
stat info for the entry was protected behind actual content
comparison, and removing that protection will open the index to
abuse.
The userbase of Git has grown wide enough that it is harder to say
"If you lie that a file whose contents does not match the index is
up to date using this mechanism, you will lose data and all bad
things happen---you can keep both halves". Once we release a
version of Git with such a "feature", the first bug report will be
"I did not want to run 'update-index --refresh' because it takes
time, and some index entries apparently did not match what is on the
filesystem, and I got a corrupt working file after a merge. Git
should make sure that the contents match when using the new 'path to
the file' argument when updating the cached stat info!". I do not
have a good answer to such a bug report.
So...
^ permalink raw reply
* Re: [PATCH v3 14/21] read-cache: touch shared index files when used
From: Junio C Hamano @ 2017-01-09 14:34 UTC (permalink / raw)
To: Duy Nguyen
Cc: Christian Couder, git, Ævar Arnfjörð Bjarmason,
Christian Couder
In-Reply-To: <CACsJy8AR6yNr0y+_JZDkW-HO_yHPkUx_6zbLGoviKQBOVcSg5A@mail.gmail.com>
Duy Nguyen <pclouds@gmail.com> writes:
> On Sun, Jan 8, 2017 at 4:46 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Christian Couder <christian.couder@gmail.com> writes:
>>
>>> So what should we do if freshen_file() returns 0 which means that the
>>> freshening failed?
>>
>> You tell me ;-) as you are the one who is proposing this feature.
>
> My answer is, we are not worse than freshening loose objects case
> (especially since I took the idea from there).
I do not think so, unfortunately. Loose object files with stale
timestamps are not removed as long as objects are still reachable.
For the base/shared index file, the timestamp is the only thing that
protects them from pruning, unless it is serving as the base file
for the currently active $GIT_DIR/index that is split.
>> What is the failure mode after such a premature GC happens? What
>> does the end-user see? Can you try to (1) split the index (2)
>> modify bunch of entries (3) remove the base/shared index with /bin/rm
>> and then see how various Git commands fail? Do they fail gracefully?
>>
>> I am trying to gauge the seriousness of ignoring such an error here.
>
> If we fail to refresh it and the file is old enough and gc happens,
> any index file referenced to it are broken. Any commands that read the
> index will die(). The best you could do is delete $GIT_DIR/index and
> read-tree HEAD.
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox