* [PATCH v3 1/4] serve: pass "config context" through to individual commands
From: Ævar Arnfjörð Bjarmason @ 2018-12-17 22:40 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Jeff King, Brandon Williams, Jonathan Tan,
Josh Steadmon, Ævar Arnfjörð Bjarmason
In-Reply-To: <20181213155817.27666-1-avarab@gmail.com>
From: Jeff King <peff@peff.net>
In protocol v2, instead of just running "upload-pack", we have a generic
"serve" loop which runs command requests from the client. What used to
be "upload-pack" is now generally split into two operations: "ls-refs"
and "fetch". The latter knows it must respect uploadpack.* config, but
the former is not actually specific to a fetch operation (we do not yet
do v2 receive-pack, but eventually we may, and ls-refs would support
both operations).
However, ls-refs does need to know which operation we're performing, in
order to read the correct config (e.g., uploadpack.hiderefs versus
receive.hiderefs; we don't read _either_ right now, but this is the
first step to fixing that).
In the generic "git-serve" program, we don't have that information. But
in the protocol as it is actually used by clients, the client still asks
to run "git-upload-pack", and then issues an "ls-refs" from there. So we
_do_ know at that point that "uploadpack" is the right config context.
This patch teaches the protocol v2 "serve" code to pass that context
through to the individual commands (a future patch will teach ls-refs to
actually use it).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
builtin/upload-pack.c | 1 +
ls-refs.c | 4 +++-
ls-refs.h | 3 ++-
serve.c | 9 +++++----
serve.h | 7 +++++++
upload-pack.c | 4 ++--
upload-pack.h | 4 ++--
7 files changed, 22 insertions(+), 10 deletions(-)
diff --git a/builtin/upload-pack.c b/builtin/upload-pack.c
index 42dc4da5a1f..67192cfa9e9 100644
--- a/builtin/upload-pack.c
+++ b/builtin/upload-pack.c
@@ -52,6 +52,7 @@ int cmd_upload_pack(int argc, const char **argv, const char *prefix)
case protocol_v2:
serve_opts.advertise_capabilities = opts.advertise_refs;
serve_opts.stateless_rpc = opts.stateless_rpc;
+ serve_opts.config_context = "uploadpack";
serve(&serve_opts);
break;
case protocol_v1:
diff --git a/ls-refs.c b/ls-refs.c
index a06f12eca8d..e8e31475f06 100644
--- a/ls-refs.c
+++ b/ls-refs.c
@@ -69,7 +69,9 @@ static int send_ref(const char *refname, const struct object_id *oid,
return 0;
}
-int ls_refs(struct repository *r, struct argv_array *keys,
+int ls_refs(struct repository *r,
+ const char *config_section,
+ struct argv_array *keys,
struct packet_reader *request)
{
struct ls_refs_data data;
diff --git a/ls-refs.h b/ls-refs.h
index b62877e8dad..da26fc9824d 100644
--- a/ls-refs.h
+++ b/ls-refs.h
@@ -4,7 +4,8 @@
struct repository;
struct argv_array;
struct packet_reader;
-extern int ls_refs(struct repository *r, struct argv_array *keys,
+extern int ls_refs(struct repository *r, const char *config_context,
+ struct argv_array *keys,
struct packet_reader *request);
#endif /* LS_REFS_H */
diff --git a/serve.c b/serve.c
index bda085f09c8..70f89cf0d98 100644
--- a/serve.c
+++ b/serve.c
@@ -48,6 +48,7 @@ struct protocol_capability {
* This field should be NULL for capabilities which are not commands.
*/
int (*command)(struct repository *r,
+ const char *config_context,
struct argv_array *keys,
struct packet_reader *request);
};
@@ -158,7 +159,7 @@ enum request_state {
PROCESS_REQUEST_DONE,
};
-static int process_request(void)
+static int process_request(struct serve_options *opts)
{
enum request_state state = PROCESS_REQUEST_KEYS;
struct packet_reader reader;
@@ -222,7 +223,7 @@ static int process_request(void)
if (!command)
die("no command requested");
- command->command(the_repository, &keys, &reader);
+ command->command(the_repository, opts->config_context, &keys, &reader);
argv_array_clear(&keys);
return 0;
@@ -249,10 +250,10 @@ void serve(struct serve_options *options)
* a single request/response exchange
*/
if (options->stateless_rpc) {
- process_request();
+ process_request(options);
} else {
for (;;)
- if (process_request())
+ if (process_request(options))
break;
}
}
diff --git a/serve.h b/serve.h
index fe65ba9f469..d527224cbb1 100644
--- a/serve.h
+++ b/serve.h
@@ -8,6 +8,13 @@ extern int has_capability(const struct argv_array *keys, const char *capability,
struct serve_options {
unsigned advertise_capabilities;
unsigned stateless_rpc;
+
+ /*
+ * Some operations may need to know the context when looking up config;
+ * e.g., set this to "uploadpack" to respect "uploadpack.hiderefs" (as
+ * opposed to "receive.hiderefs").
+ */
+ const char *config_context;
};
#define SERVE_OPTIONS_INIT { 0 }
extern void serve(struct serve_options *options);
diff --git a/upload-pack.c b/upload-pack.c
index 5e81f1ff24f..914bbb40bcd 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1413,8 +1413,8 @@ enum fetch_state {
FETCH_DONE,
};
-int upload_pack_v2(struct repository *r, struct argv_array *keys,
- struct packet_reader *request)
+int upload_pack_v2(struct repository *r, const char *config_context,
+ struct argv_array *keys, struct packet_reader *request)
{
enum fetch_state state = FETCH_PROCESS_ARGS;
struct upload_pack_data data;
diff --git a/upload-pack.h b/upload-pack.h
index cab2178796a..2a9f51197c4 100644
--- a/upload-pack.h
+++ b/upload-pack.h
@@ -13,8 +13,8 @@ void upload_pack(struct upload_pack_options *options);
struct repository;
struct argv_array;
struct packet_reader;
-extern int upload_pack_v2(struct repository *r, struct argv_array *keys,
- struct packet_reader *request);
+extern int upload_pack_v2(struct repository *r, const char *config_context,
+ struct argv_array *keys, struct packet_reader *request);
struct strbuf;
extern int upload_pack_advertise(struct repository *r,
--
2.20.0.405.gbc1bbc6f85
^ permalink raw reply related
* [PATCH v3 0/4] protocol v2 fixes
From: Ævar Arnfjörð Bjarmason @ 2018-12-17 22:40 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Jeff King, Brandon Williams, Jonathan Tan,
Josh Steadmon, Ævar Arnfjörð Bjarmason
In-Reply-To: <20181213155817.27666-1-avarab@gmail.com>
The v2 of this series conflicted with Josh Steadmon's work when merged
in "pu". That's still outstanding, see
https://public-inbox.org/git/87h8ff20ol.fsf@evledraar.gmail.com/
Then my just-sent
https://public-inbox.org/git/20181217221625.1523-1-avarab@gmail.com/
conflicts with even more things in it.
So I'm dropping "GIT_TEST_PROTOCOL_VERSION" for now until things
settle down. That can land after all this protocol activity settles.
No changes to Jeff's patches since v2, for Jonathan's no changes to
the C code, but I added a test which I extracted from the
GIT_TEST_PROTOCOL_VERSION=2 work.
Jeff King (3):
serve: pass "config context" through to individual commands
parse_hide_refs_config: handle NULL section
upload-pack: support hidden refs with protocol v2
Jonathan Tan (1):
fetch-pack: support protocol version 2
builtin/fetch-pack.c | 9 ++++++---
builtin/upload-pack.c | 1 +
ls-refs.c | 16 +++++++++++++++-
ls-refs.h | 3 ++-
refs.c | 3 ++-
serve.c | 9 +++++----
serve.h | 7 +++++++
t/t5500-fetch-pack.sh | 22 +++++++++++++++-------
t/t5512-ls-remote.sh | 6 ++++++
upload-pack.c | 4 ++--
upload-pack.h | 4 ++--
11 files changed, 63 insertions(+), 21 deletions(-)
--
2.20.0.405.gbc1bbc6f85
^ permalink raw reply
* [PATCH] upload-pack: turn on uploadpack.allowAnySHA1InWant=true
From: Ævar Arnfjörð Bjarmason @ 2018-12-17 22:16 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Jeff King, Brandon Williams, Jonathan Tan,
Fredrik Medley, David Turner, Matt McCutchen,
Ævar Arnfjörð Bjarmason
In-Reply-To: <20181217195713.GA10673@sigill.intra.peff.net>
Unconditionally turn on uploadpack.allowAnySHA1InWant=true which means
that the lesser uploadpack.allow{Tip,Reachable}SHA1InWant settings are
implicitly "true" as well.
This is because as mentioned in 235ec24352e ("doc: mention transfer
data leaks in more places", 2016-11-14) this has always been leaky,
and thus lulls users into a false sense of security. Better to not
make promises we're not confident we can keep.
Now setting any of uploadpack.allow{Tip,Reachable,Any}SHA1InWant will
warn, to e.g. point users who've explicitly set it to "false" to the
documentation. They're probably relying on behavior that was always
buggy. We may have users who trusted it to be "false" by default, but
warning if it's set seems like a good trade-off.
This change also has the effect of making any operation that needed
uploadpack.allowReachableSHA1InWant=true much faster, since we won't
need any reachability check.
This change doesn't do any of the hard work of extracting the now-dead
ALLOW_* code out of upload-pack.c and fetch-pack.c. Most of that was
changed or added in the following commits:
- 9f9fb57063a ("upload-pack: turn on uploadpack.allowAnySHA1InWant=true", 2018-12-17)
- 6e7b66eebd1 ("fetch: fetch objects by their exact SHA-1 object names", 2013-01-29)
- 7199c093ad4 ("upload-pack: prepare to extend allow-tip-sha1-in-want", 2015-05-21)
- 68ee628932c ("upload-pack: optionally allow fetching reachable sha1", 2015-05-21)
- f8edeaa05d8 ("upload-pack: optionally allow fetching any sha1", 2016-11-11)
That cleanup step can be done later. Starting with a change to switch
the default (and only) value and tweaking the docs makes it easier to
reason about this than fully ripping out the code right away.
See https://public-inbox.org/git/87pnu51kac.fsf@evledraar.gmail.com/
and related messages for further discussion.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
---
Documentation/config/uploadpack.txt | 48 +++++++++++++++----------
Documentation/transfer-data-leaks.txt | 8 +++++
t/t0410-partial-clone.sh | 2 --
t/t1090-sparse-checkout-scope.sh | 1 -
t/t5500-fetch-pack.sh | 10 +++---
t/t5512-ls-remote.sh | 14 ++++++++
t/t5516-fetch-push.sh | 50 +++++++++++++--------------
t/t5551-http-fetch-smart.sh | 4 ---
t/t5601-clone.sh | 3 --
t/t5616-partial-clone.sh | 8 +----
t/t5702-protocol-v2.sh | 1 -
t/t7406-submodule-update.sh | 5 +--
upload-pack.c | 18 +++-------
13 files changed, 87 insertions(+), 85 deletions(-)
diff --git a/Documentation/config/uploadpack.txt b/Documentation/config/uploadpack.txt
index ed1c835695c..854801bd6cc 100644
--- a/Documentation/config/uploadpack.txt
+++ b/Documentation/config/uploadpack.txt
@@ -1,31 +1,41 @@
uploadpack.hideRefs::
This variable is the same as `transfer.hideRefs`, but applies
only to `upload-pack` (and so affects only fetches, not pushes).
- An attempt to fetch a hidden ref by `git fetch` will fail. See
- also `uploadpack.allowTipSHA1InWant`.
++
+This can be used in some setups to reduce the size of the ref
+advertisement, see also `protocol.version` for using a protocol
+version that allows them to be filtered by the client.
++
+In earlier versions of Git attempts to fetch an object pointed to by a
+hidden ref would fail, this is no longer the case. See
+`uploadpack.allowAnySHA1InWant`.
uploadpack.allowTipSHA1InWant::
- When `uploadpack.hideRefs` is in effect, allow `upload-pack`
- to accept a fetch request that asks for an object at the tip
- of a hidden ref (by default, such a request is rejected).
- See also `uploadpack.hideRefs`. Even if this is false, a client
- may be able to steal objects via the techniques described in the
- "SECURITY" section of the linkgit:gitnamespaces[7] man page; it's
- best to keep private data in a separate repository.
+ Depreacted. Used to allow `upload-pack` to accept a fetch
+ request that asked for an object at the tip of a hidden
+ ref. Now always `true`. See `uploadpack.allowAnySHA1InWant`
+ below for details.
uploadpack.allowReachableSHA1InWant::
- Allow `upload-pack` to accept a fetch request that asks for an
- object that is reachable from any ref tip. However, note that
- calculating object reachability is computationally expensive.
- Defaults to `false`. Even if this is false, a client may be able
- to steal objects via the techniques described in the "SECURITY"
- section of the linkgit:gitnamespaces[7] man page; it's best to
- keep private data in a separate repository.
+ Depreacted. Used to allow `upload-pack` to accept a fetch
+ request that asked for an object that was reachable from any
+ ref tip. Now always `true`. See
+ `uploadpack.allowAnySHA1InWant` below for details.
uploadpack.allowAnySHA1InWant::
- Allow `upload-pack` to accept a fetch request that asks for any
- object at all.
- Defaults to `false`.
+ Deprecated. Before Git version 2.21 the
+ `uploadpack.allow{Tip,Any,Reachable}SHA1InWant` settings were
+ `false` by default. Now they're unconditionally `true`, which
+ means fetch request that ask for any object in the object
+ store will be accepted, regardless of whether such an object
+ is at the tip of a ref, reachable by no ref etc.
++
+The reason for the change in default is as noted in the "SECURITY"
+section of linkgit:git-fetch[1] a determined client can get to any
+previously forbidden objects the server has access to by manipulating
+the client/server dialog. If you host objects that should be kept
+secret from some clients it's best to keep them in a separate
+repository.
uploadpack.keepAlive::
When `upload-pack` has started `pack-objects`, there may be a
diff --git a/Documentation/transfer-data-leaks.txt b/Documentation/transfer-data-leaks.txt
index 914bacc39e0..f8beb334b84 100644
--- a/Documentation/transfer-data-leaks.txt
+++ b/Documentation/transfer-data-leaks.txt
@@ -28,3 +28,11 @@ The known attack vectors are as follows:
an object Y that the attacker already has, and the attacker falsely
claims to have X and not Y, so the victim sends Y as a delta against X.
The delta reveals regions of X that are similar to Y to the attacker.
+
+Before Git version 2.21 the
+`uploadpack.allow{Tip,Any,Reachable}SHA1InWant` allowed for tweaking
+linkgit:git-upload-pack[1] behavior to refuse to serve some refs. As
+noted above this served to lull users into a false sense of security,
+and the fetch protocol will now allow clients to fetch any object in
+the ref store. See `uploadpack.allowAnySHA1InWant` in
+linkgit:git-config[1].
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index ba3887f178b..a9d24a94d9f 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -192,7 +192,6 @@ test_expect_success 'fetching of missing blobs works' '
git hash-object repo/foo.t >blobhash &&
rm -rf repo/.git/objects/* &&
- git -C server config uploadpack.allowanysha1inwant 1 &&
git -C server config uploadpack.allowfilter 1 &&
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "origin" &&
@@ -211,7 +210,6 @@ test_expect_success 'fetching of missing trees does not fetch blobs' '
git hash-object repo/foo.t >blobhash &&
rm -rf repo/.git/objects/* &&
- git -C server config uploadpack.allowanysha1inwant 1 &&
git -C server config uploadpack.allowfilter 1 &&
git -C repo config core.repositoryformatversion 1 &&
git -C repo config extensions.partialclone "origin" &&
diff --git a/t/t1090-sparse-checkout-scope.sh b/t/t1090-sparse-checkout-scope.sh
index 090b7fc3d35..4de2e2dd1bb 100755
--- a/t/t1090-sparse-checkout-scope.sh
+++ b/t/t1090-sparse-checkout-scope.sh
@@ -68,7 +68,6 @@ test_expect_success 'in partial clone, sparse checkout only fetches needed blobs
git clone "file://$(pwd)/server" client &&
test_config -C server uploadpack.allowfilter 1 &&
- test_config -C server uploadpack.allowanysha1inwant 1 &&
echo a >server/a &&
echo bb >server/b &&
mkdir server/c &&
diff --git a/t/t5500-fetch-pack.sh b/t/t5500-fetch-pack.sh
index 086f2c40f68..b639629863d 100755
--- a/t/t5500-fetch-pack.sh
+++ b/t/t5500-fetch-pack.sh
@@ -592,8 +592,7 @@ test_expect_success 'fetch-pack can fetch a raw sha1' '
test_commit 1 &&
test_commit 2 &&
git update-ref refs/hidden/one HEAD^ &&
- git config transfer.hiderefs refs/hidden &&
- git config uploadpack.allowtipsha1inwant true
+ git config transfer.hiderefs refs/hidden
) &&
git fetch-pack hidden $(git -C hidden rev-parse refs/hidden/one)
'
@@ -619,7 +618,7 @@ test_expect_success 'fetch-pack can fetch a raw sha1 overlapping a named ref' '
$(git -C server rev-parse refs/tags/1) refs/tags/1
'
-test_expect_success 'fetch-pack cannot fetch a raw sha1 that is not advertised as a ref' '
+test_expect_success 'fetch-pack can fetch a raw sha1 that is not advertised as a ref' '
rm -rf server &&
git init server &&
@@ -628,9 +627,8 @@ test_expect_success 'fetch-pack cannot fetch a raw sha1 that is not advertised a
test_commit -C server 6 &&
git init client &&
- test_must_fail git -C client fetch-pack ../server \
- $(git -C server rev-parse refs/heads/master^) 2>err &&
- test_i18ngrep "Server does not allow request for unadvertised object" err
+ git -C client fetch-pack ../server \
+ $(git -C server rev-parse refs/heads/master^)
'
check_prot_path () {
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 32e722db2ed..0cd1aad9597 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -195,6 +195,20 @@ do
! grep refs/tags/magic/one actual
'
+
+ test_expect_success "fetching a divergent object hidden by $configsection.hiderefs" "
+ test_when_finished 'test_unconfig $configsection.hiderefs' &&
+ git config --add $configsection.hiderefs refs/tags &&
+ test_when_finished 'git tag -d magic/divergent' &&
+ git tag -a -m'a tag' magic/divergent &&
+ SHA1_TAG=\$(git rev-parse magic/divergent) &&
+ echo \$SHA1_TAG >expect &&
+ git init client &&
+ test_when_finished 'rm -r client' &&
+ git -C client fetch ../ \$SHA1_TAG:refs/tags/magic/divergent &&
+ git -C client rev-parse magic/divergent >actual &&
+ test_cmp expect actual
+ "
done
test_expect_success 'overrides work between mixed transfer/upload-pack hideRefs' '
diff --git a/t/t5516-fetch-push.sh b/t/t5516-fetch-push.sh
index 37e8e80893d..8f2b12c17a5 100755
--- a/t/t5516-fetch-push.sh
+++ b/t/t5516-fetch-push.sh
@@ -1147,26 +1147,14 @@ test_expect_success 'fetch exact SHA1' '
git prune &&
test_must_fail git cat-file -t $the_commit &&
- # fetching the hidden object should fail by default
- test_must_fail git fetch -v ../testrepo $the_commit:refs/heads/copy 2>err &&
- test_i18ngrep "Server does not allow request for unadvertised object" err &&
- test_must_fail git rev-parse --verify refs/heads/copy &&
+ # fetching the object should work
+ git fetch -v ../testrepo $the_commit:refs/heads/copy &&
+ git rev-parse --verify refs/heads/copy &&
- # the server side can allow it to succeed
- (
- cd ../testrepo &&
- git config uploadpack.allowtipsha1inwant true
- ) &&
-
- git fetch -v ../testrepo $the_commit:refs/heads/copy master:refs/heads/extra &&
cat >expect <<-EOF &&
$the_commit
- $the_first_commit
EOF
- {
- git rev-parse --verify refs/heads/copy &&
- git rev-parse --verify refs/heads/extra
- } >actual &&
+ git rev-parse --verify refs/heads/copy >actual &&
test_cmp expect actual
)
'
@@ -1190,6 +1178,25 @@ test_expect_success 'fetch exact SHA1 in protocol v2' '
git -C child fetch -v ../testrepo $the_commit:refs/heads/copy
'
+for state in true false
+do
+ test_expect_success "Deprecated uploadpack.*=$state settings warn" "
+ mk_empty deprecated-warnings &&
+ test_commit -C deprecated-warnings A &&
+ test_config -C deprecated-warnings uploadpack.allowTipSHA1InWant $state &&
+ test_config -C deprecated-warnings uploadpack.allowReachableSHA1InWant $state &&
+ test_config -C deprecated-warnings uploadpack.allowAnySHA1InWant $state &&
+ mk_empty target &&
+ (
+ cd target &&
+ git fetch ../deprecated-warnings/.git refs/tags/A:refs/tags/A 2>stderr &&
+ test_i18ngrep 'warning: .*uploadpack.allowTipSHA1InWant .*deprecated' stderr &&
+ test_i18ngrep 'warning: .*uploadpack.allowReachableSHA1InWant.* deprecated' stderr &&
+ test_i18ngrep 'warning: .*uploadpack.allowAnySHA1InWant.* deprecated' stderr
+ )
+ "
+done
+
for configallowtipsha1inwant in true false
do
test_expect_success "shallow fetch reachable SHA1 (but not a ref), allowtipsha1inwant=$configallowtipsha1inwant" '
@@ -1204,8 +1211,6 @@ do
mk_empty shallow &&
(
cd shallow &&
- test_must_fail git fetch --depth=1 ../testrepo/.git $SHA1 &&
- git --git-dir=../testrepo/.git config uploadpack.allowreachablesha1inwant true &&
git fetch --depth=1 ../testrepo/.git $SHA1 &&
git cat-file commit $SHA1
)
@@ -1232,15 +1237,10 @@ do
mk_empty shallow &&
(
cd shallow &&
- test_must_fail ok=sigpipe git fetch ../testrepo/.git $SHA1_3 &&
- test_must_fail ok=sigpipe git fetch ../testrepo/.git $SHA1_1 &&
- git --git-dir=../testrepo/.git config uploadpack.allowreachablesha1inwant true &&
+ git fetch ../testrepo/.git $SHA1_3 &&
git fetch ../testrepo/.git $SHA1_1 &&
git cat-file commit $SHA1_1 &&
- test_must_fail git cat-file commit $SHA1_2 &&
- git fetch ../testrepo/.git $SHA1_2 &&
- git cat-file commit $SHA1_2 &&
- test_must_fail ok=sigpipe git fetch ../testrepo/.git $SHA1_3
+ git cat-file commit $SHA1_2
)
'
done
diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
index 8630b0cc390..b8f7c0b5563 100755
--- a/t/t5551-http-fetch-smart.sh
+++ b/t/t5551-http-fetch-smart.sh
@@ -283,7 +283,6 @@ test_expect_success 'test allowreachablesha1inwant' '
test_when_finished "rm -rf test_reachable.git" &&
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
- git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
@@ -302,7 +301,6 @@ test_expect_success 'test allowreachablesha1inwant with unreachable' '
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
- git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
@@ -321,13 +319,11 @@ test_expect_success 'test allowanysha1inwant with unreachable' '
server="$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
master_sha=$(git -C "$server" rev-parse refs/heads/master) &&
- git -C "$server" config uploadpack.allowreachablesha1inwant 1 &&
git init --bare test_reachable.git &&
git -C test_reachable.git remote add origin "$HTTPD_URL/smart/repo.git" &&
test_must_fail git -C test_reachable.git fetch origin "$(git rev-parse HEAD)" &&
- git -C "$server" config uploadpack.allowanysha1inwant 1 &&
git -C test_reachable.git fetch origin "$(git rev-parse HEAD)"
'
diff --git a/t/t5601-clone.sh b/t/t5601-clone.sh
index 8bbc7068acb..ec7a3c46528 100755
--- a/t/t5601-clone.sh
+++ b/t/t5601-clone.sh
@@ -646,7 +646,6 @@ partial_clone () {
test_commit -C "$SERVER" two &&
HASH2=$(git hash-object "$SERVER/two.t") &&
test_config -C "$SERVER" uploadpack.allowfilter 1 &&
- test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
git clone --filter=blob:limit=0 "$URL" client &&
@@ -689,7 +688,6 @@ test_expect_success 'batch missing blob request during checkout' '
git -C server commit -m x &&
test_config -C server uploadpack.allowfilter 1 &&
- test_config -C server uploadpack.allowanysha1inwant 1 &&
git clone --filter=blob:limit=0 "file://$(pwd)/server" client &&
@@ -720,7 +718,6 @@ test_expect_success 'batch missing blob request does not inadvertently try to fe
git -C server commit -m x &&
test_config -C server uploadpack.allowfilter 1 &&
- test_config -C server uploadpack.allowanysha1inwant 1 &&
# Make sure that it succeeds
git clone --filter=blob:limit=0 "file://$(pwd)/server" client
diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 336f02a41a6..213899a690d 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -25,8 +25,7 @@ test_expect_success 'setup normal src repo' '
# bare clone "src" giving "srv.bare" for use as our server.
test_expect_success 'setup bare clone for server' '
git clone --bare "file://$(pwd)/src" srv.bare &&
- git -C srv.bare config --local uploadpack.allowfilter 1 &&
- git -C srv.bare config --local uploadpack.allowanysha1inwant 1
+ git -C srv.bare config --local uploadpack.allowfilter 1
'
# do basic partial clone from "srv.bare"
@@ -159,7 +158,6 @@ test_expect_success 'partial clone with transfer.fsckobjects=1 uses index-pack -
git init src &&
test_commit -C src x &&
test_config -C src uploadpack.allowfilter 1 &&
- test_config -C src uploadpack.allowanysha1inwant 1 &&
GIT_TRACE="$(pwd)/trace" git -c transfer.fsckobjects=1 \
clone --filter="blob:none" "file://$(pwd)/src" dst &&
@@ -213,7 +211,6 @@ test_expect_success 'partial clone fetches blobs pointed to by refs even if norm
git init src &&
test_commit -C src x &&
test_config -C src uploadpack.allowfilter 1 &&
- test_config -C src uploadpack.allowanysha1inwant 1 &&
# Create a tag pointing to a blob.
BLOB=$(echo blob-contents | git -C src hash-object --stdin -w) &&
@@ -229,7 +226,6 @@ test_expect_success 'fetch what is specified on CLI even if already promised' '
git init src &&
test_commit -C src foo &&
test_config -C src uploadpack.allowfilter 1 &&
- test_config -C src uploadpack.allowanysha1inwant 1 &&
git hash-object --stdin <src/foo.t >blob &&
@@ -257,7 +253,6 @@ test_expect_success 'upon cloning, check that all refs point to objects' '
test_create_repo "$SERVER" &&
test_commit -C "$SERVER" foo &&
test_config -C "$SERVER" uploadpack.allowfilter 1 &&
- test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
# Create a tag pointing to a blob.
BLOB=$(echo blob-contents | git -C "$SERVER" hash-object --stdin -w) &&
@@ -293,7 +288,6 @@ test_expect_success 'when partial cloning, tolerate server not sending target of
test_create_repo "$SERVER" &&
test_commit -C "$SERVER" foo &&
test_config -C "$SERVER" uploadpack.allowfilter 1 &&
- test_config -C "$SERVER" uploadpack.allowanysha1inwant 1 &&
# Create an annotated tag pointing to a blob.
BLOB=$(echo blob-contents | git -C "$SERVER" hash-object --stdin -w) &&
diff --git a/t/t5702-protocol-v2.sh b/t/t5702-protocol-v2.sh
index 0f2b09ebb8d..af16f139a4a 100755
--- a/t/t5702-protocol-v2.sh
+++ b/t/t5702-protocol-v2.sh
@@ -274,7 +274,6 @@ test_expect_success 'setup filter tests' '
test_commit -C server message2 a.txt &&
git -C server config protocol.version 2 &&
git -C server config uploadpack.allowfilter 1 &&
- git -C server config uploadpack.allowanysha1inwant 1 &&
git -C server config protocol.version 2
'
diff --git a/t/t7406-submodule-update.sh b/t/t7406-submodule-update.sh
index e87164aa8ff..cf5bda08690 100755
--- a/t/t7406-submodule-update.sh
+++ b/t/t7406-submodule-update.sh
@@ -943,10 +943,7 @@ test_expect_success 'submodule update clone shallow submodule outside of depth'
cd super3 &&
sed -e "s#url = ../#url = file://$pwd/#" <.gitmodules >.gitmodules.tmp &&
mv -f .gitmodules.tmp .gitmodules &&
- test_must_fail git submodule update --init --depth=1 2>actual &&
- test_i18ngrep "Direct fetching of that commit failed." actual &&
- git -C ../submodule config uploadpack.allowReachableSHA1InWant true &&
- git submodule update --init --depth=1 >actual &&
+ git submodule update --init --depth=1 &&
git -C submodule log --oneline >out &&
test_line_count = 1 out
)
diff --git a/upload-pack.c b/upload-pack.c
index 5e81f1ff24f..18e4c2d2452 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -54,7 +54,8 @@ static int no_progress, daemon_mode;
#define ALLOW_REACHABLE_SHA1 02
/* Allow request of any sha1. Implies ALLOW_TIP_SHA1 and ALLOW_REACHABLE_SHA1. */
#define ALLOW_ANY_SHA1 07
-static unsigned int allow_unadvertised_object_request;
+static unsigned int allow_unadvertised_object_request = (
+ ALLOW_TIP_SHA1 | ALLOW_REACHABLE_SHA1 | ALLOW_ANY_SHA1);
static int shallow_nr;
static struct object_array extra_edge_obj;
static unsigned int timeout;
@@ -1019,20 +1020,11 @@ static int find_symref(const char *refname, const struct object_id *oid,
static int upload_pack_config(const char *var, const char *value, void *unused)
{
if (!strcmp("uploadpack.allowtipsha1inwant", var)) {
- if (git_config_bool(var, value))
- allow_unadvertised_object_request |= ALLOW_TIP_SHA1;
- else
- allow_unadvertised_object_request &= ~ALLOW_TIP_SHA1;
+ warning(_("The uploadpack.allowTipSHA1InWant setting is deprecated, and now always 'true'"));
} else if (!strcmp("uploadpack.allowreachablesha1inwant", var)) {
- if (git_config_bool(var, value))
- allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
- else
- allow_unadvertised_object_request &= ~ALLOW_REACHABLE_SHA1;
+ warning(_("The uploadpack.allowReachableSHA1InWant setting is deprecated, and now always 'true'"));
} else if (!strcmp("uploadpack.allowanysha1inwant", var)) {
- if (git_config_bool(var, value))
- allow_unadvertised_object_request |= ALLOW_ANY_SHA1;
- else
- allow_unadvertised_object_request &= ~ALLOW_ANY_SHA1;
+ warning(_("The uploadpack.allowAnySHA1InWant setting is deprecated, and now always 'true'"));
} else if (!strcmp("uploadpack.keepalive", var)) {
keepalive = git_config_int(var, value);
if (!keepalive)
--
2.20.0.405.gbc1bbc6f85
^ permalink raw reply related
* Re: [PATCH v2 2/7] test-lib: parse some --options earlier
From: Jeff King @ 2018-12-17 21:44 UTC (permalink / raw)
To: SZEDER Gábor; +Cc: git, Junio C Hamano
In-Reply-To: <20181211124245.GT30222@szeder.dev>
On Tue, Dec 11, 2018 at 01:42:45PM +0100, SZEDER Gábor wrote:
> > But looking at what this is replacing:
> >
> > > -case "$GIT_TEST_TEE_STARTED, $* " in
> > > -done,*)
> > > - # do not redirect again
> > > - ;;
> > > -*' --tee '*|*' --va'*|*' -V '*|*' --verbose-log '*)
>
>
> Anyway, I had another crack at turning the current option parsing loop
> into a for loop keeping $@ intact, and the results don't look all that
> bad this time. Note that this diff below only does the while -> for
> conversion, but leaves the loop where it is, so the changes are easily
> visible. The important bits are the conditions at the beginning of
> the loop and after the loop, and the handling of '-r'; the rest is
> mostly s/shift// and sort-of s/$1/$opt/.
>
> Thoughts? Is it better than two loops? I think it's better.
It certainly looks better to me. It also makes sense to me to validate
the options before forking/logging, though I suppose one could argue the
opposite.
I wonder why we didn't do it this way in the beginning (i.e., why the
tee bits were all handled separately before the parsing phase). I guess
just because we have to pass the options down to the sub-process.
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 9a3f7930a3..efdb6be3c8 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -264,58 +264,65 @@ test "x$TERM" != "xdumb" && (
> ) &&
> color=t
>
> -while test "$#" -ne 0
> +store_arg_to=
> +prev_opt=
> +for opt
> do
> - case "$1" in
> + if test -n "$store_arg_to"
> + then
> + eval $store_arg_to=\$opt
> + store_arg_to=
> + prev_opt=
> + continue
> + fi
OK, so this is set for the unstuck options, which then pick up the
option in the next loop iteration. That's perhaps less gross than my
"re-build the options with set --" trick.
A simple variable set is enough for "-r". In theory we could make this:
if test -n "$handle_unstuck_arg"
then
eval "$handle_unstuck_arg \$1"
fi
...
-r)
handle_unstuck_arg=handle_opt_r ;;
and handle_opt_r() could do whatever it wants. But I don't really
foresee us adding a lot of new options (in fact, given that this is just
the internal tests, I am tempted to say that we should just make it
"-r<arg>" for the sake of simplicity and consistency. But maybe somebody
would be annoyed. I have never used "-r" ever myself).
-Peff
^ permalink raw reply
* Re: [PATCH v3 1/4] pack-protocol.txt: accept error packets in any context
From: Jeff King @ 2018-12-17 21:33 UTC (permalink / raw)
To: Masaya Suzuki, git, Junio C Hamano
In-Reply-To: <20181213221826.GE37614@google.com>
On Thu, Dec 13, 2018 at 02:18:26PM -0800, Josh Steadmon wrote:
> On 2018.12.12 17:17, Masaya Suzuki wrote:
> > On Wed, Dec 12, 2018 at 3:02 AM Jeff King <peff@peff.net> wrote:
> > > This ERR handling has been moved to a very low level. What happens if
> > > we're passing arbitrary data via the packet_read() code? Could we
> > > erroneously trigger an error if a packfile happens to have the bytes
> > > "ERR " at a packet boundary?
> > >
> > > For packfiles via upload-pack, I _think_ we're OK, because we only
> > > packetize it when a sideband is in use. In which case this would never
> > > match, because we'd have "\1" in the first byte slot.
> > >
> > > But are there are other cases we need to worry about? Just
> > > brainstorming, I can think of:
> > >
> > > 1. We also pass packetized packfiles between git-remote-https and
> > > the stateless-rpc mode of fetch-pack/send-pack. And I don't think
> > > we use sidebands there.
> > >
> > > 2. The packet code is used for long-lived clean/smudge filters these
> > > days, which also pass arbitrary data.
> > >
> > > So I think it's probably not a good idea to unconditionally have callers
> > > of packet_read_with_status() handle this. We'd need a flag like
> > > PACKET_READ_RESPECT_ERR, and to trigger it from the appropriate callers.
> >
> > This is outside of the Git pack protocol so having a separate parsing
> > mode makes sense to me.
>
> This sounds like it could be a significant refactoring. Should we go
> back to V2 of this series, and then work on the new parsing mode
> separately?
Which one is v2? :)
Just the remote-curl cleanups from me, and then your "die on server-side
errors" patch?
-Peff
^ permalink raw reply
* Re: Git blame performance on files with a lot of history
From: Clement Moyroud @ 2018-12-17 20:59 UTC (permalink / raw)
To: stolee; +Cc: git
In-Reply-To: <3f3e7b11-19ef-cc2f-3bd4-e03d9ba8dc91@gmail.com>
On Fri, Dec 14, 2018 at 1:31 PM Derrick Stolee <stolee@gmail.com> wrote:
>
> Please double-check that you have the 'core.commitGraph' config setting
> enabled, or you will not read the commit-graph at run-time:
>
> git config core.commitGraph true
>
Yeah, this is what happens when trying too many things at once :( I
had removed it to get
with/without scores, and forgot to re-enable it before trying my last
set of experiments.
Here are the results with it enabled:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git rev-list --count --full-history HEAD -- important/file.C
10:32:06.665057 revision.c:483 bloom filter total queries:
286363 definitely not: 234605 maybe: 51758 false positives: 48212 fp
ratio: 0.168360
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y rev-list --count
HEAD - 2.62s user 0.14s system 97% cpu 2.830 total
> time /path/to/git rev-list --count --full-history HEAD -- ic/lv/src/iclv/drc_compiler.C
3576
/path/to/git rev-list 8.86s user 0.15s system 99% cpu 9.031 total
So I'm getting a 3x benefit, not bad! This is on the re-repacked repo,
which is why I ran again
with and without the Bloom filter.
Let's see what this does for blame:
> time GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y /path/to/git blame master -- important/file.C > /tmp/foo
Blaming lines: 100% (33179/33179), done.
12:50:42.703522 revision.c:483 bloom filter total queries: 0
definitely not: 0 maybe: 0 false positives: 0 fp ratio: -nan
GIT_TRACE_BLOOM_FILTER=2 GIT_USE_POC_BLOOM_FILTER=y blame master --
> 132.59s user 2.15s system 99% cpu 2:14.95 total
Seems like it's not implemented for blame operations. I'll be happy to
test any implementation.
Take care,
Clément
^ permalink raw reply
* Re: Git blame performance on files with a lot of history
From: Clement Moyroud @ 2018-12-17 20:43 UTC (permalink / raw)
To: bturner; +Cc: git
In-Reply-To: <CAGyf7-EZaiVBnoBfRJXAK7KdaXzPaDFVocUrB8LQNU_Hi16u7g@mail.gmail.com>
did hOn Fri, Dec 14, 2018 at 11:10 AM Bryan Turner
<bturner@atlassian.com> wrote:
>
> After you converted the repository from CVS to Git, did you run a manual repack?
>
> The process of converting a repository from another SCM often results
> in poor delta chain selections which result in a repository that's
> unnecessarily large on disk, and/or performs quite slowly.
>
Yep I did a repack, using 'git repack -A -d --pack-kept-objects'. On
NFS it'd be even
worse, because of all the small objects Git would have to go through.
> Something like `git repack -Adf --depth=50 --window=200` discards the
> existing delta chains and chooses new ones, and may result in
> significantly improved performance. A smaller depth, like --depth=20,
> might result in even more performance improvement, but may also make
> the repository larger on disk; you'll need to find the balance that
> works for you.
>
I re-ran with 'git repack -Adf --depth=20 --window=200' and that did
help quite a bit:
> time git blame master -- important/file.C > /tmp/foo
Blaming lines: 100% (33179/33179), done.
git blame master -- important/file.C > /tmp/foo 50.70s user 0.55s
system 99% cpu 51.298 total
That's roughly a 3x improvement, great. I just need another 10x one
and we'll be in business :) Based on
experiments with the Bloom filter, maybe that'll help enough to get
our users on board.
Cheers,
Clément
^ permalink raw reply
* Re: Git blame performance on files with a lot of history
From: Clement Moyroud @ 2018-12-17 20:30 UTC (permalink / raw)
To: avarab; +Cc: git
In-Reply-To: <87ftuz208b.fsf@evledraar.gmail.com>
On Fri, Dec 14, 2018 at 2:48 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Fri, Dec 14 2018, Clement Moyroud wrote:
>
> > My group at work is migrating a CVS repo to Git. The biggest issue we
> > face so far is the performance of git blame, especially compared to
> > CVS on the same file. One file especially causes us trouble: it's a
> > 30k lines file with 25 years of history in 3k+ commits. The complete
> > repo has 200k+ commits over that same period of time.
>
> There's a real-world repo with a shape & size very similar to this that
> has good performance, gcc.git: https://github.com/gcc-mirror/gcc
>
> $ wc -l ChangeLog
> 20240 ChangeLog
> $ git log --oneline -- ChangeLog | wc -l
> 2676
> $ git log --oneline | wc -l
> 165309
> $ time git blame ChangeLog >/dev/null
>
> real 0m1.977s
> user 0m1.909s
> sys 0m0.069s
>
> Its history began in 1997, and the changes to the ChangeLog file by its
> nature is fairly evenly spread through that period.
>
> So check out that repo to see if you have similar or worse
> performance. Does your work repo show the same problem with a history
> produced with 'git fast-export --anonymize', and if so is that something
> you'd be OK with sharing?
Hi Ævar,
I see around 3s here on the GCC repo, but I'm on a VM and the repo is
cloned on an NFS disk, so I'd say it matches :) It's around 45x faster
than my repo, on the same NFS share and VM. So there's definitely
something to improve here on my end (see my reply to Bryan re: repack
in a separate e-mail).
The anonymized export won't work in that case: all file contents are
replaced with 'anonymous blob <n>', so there's no per-line history for
blame to follow. Let me see if I can post-process a non-anonymized
version to keep the relevant data available.
Cheers,
Clément
^ permalink raw reply
* Re: [PATCH 0/4] Expose gpgsig in pretty-print
From: Jeff King @ 2018-12-17 20:24 UTC (permalink / raw)
To: John Passaro; +Cc: Michał Górny, git, Junio C Hamano, Alexey Shumkin
In-Reply-To: <CAJdN7KjExd6T+H4-wEupO2dg_mMWzeA22oYaskkfhz+GuFbfRQ@mail.gmail.com>
On Fri, Dec 14, 2018 at 11:07:03AM -0500, John Passaro wrote:
> Then I might rename the other new placeholders too:
>
> %Gs: signed commit signature (blank when unsigned)
> %Gp: signed commit payload (i.e. in practice minus the gpgsig header;
> also blank when unsigned as well)
One complication: the pretty-printing code sees the commit data in the
i18n.logOutputEncoding charset (utf8 by default). But the signature will
be over the raw commit data. That's also utf8 by default, but there may
be an encoding header indicating that it's something else. In that case,
you couldn't actually verify the signature from the "%Gs%Gp" pair.
I don't think that's insurmountable in the code. You'll have to jump
through a few hoops to make sure you have the _original_ payload, but we
obviously do have that data. However, it does feel a little weird to
include content from a different encoding in the middle of the log
output stream which claims to be i18n.logOutputEncoding.
-Peff
^ permalink raw reply
* Re: [PATCH v2 8/8] tests: mark tests broken under GIT_TEST_PROTOCOL_VERSION=2
From: Jeff King @ 2018-12-17 19:59 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: git, Junio C Hamano, Brandon Williams, Jonathan Tan
In-Reply-To: <87mup81i3x.fsf@evledraar.gmail.com>
On Fri, Dec 14, 2018 at 12:08:02PM +0100, Ævar Arnfjörð Bjarmason wrote:
> Also, parts of the v2 code, e.g. this in 685fbd3291 ("fetch-pack:
> perform a fetch using v2", 2018-03-15):
>
> /* v2 supports these by default */
> allow_unadvertised_object_request |= ALLOW_REACHABLE_SHA1;
>
> Seem to have intended to turn on allowReachableSHA1InWant, not
> allowAnySHA1InWant, but apparently that's what we're doing?
That I don't know. Maybe Brandon can answer what the intent was.
I agree that turning on the reachability check would make it behavior
more or less the same as v0 (at some additional cost to walk the refs
again, but v0 smart-http already had to do that).
-Peff
^ permalink raw reply
* Re: [PATCH v2 8/8] tests: mark tests broken under GIT_TEST_PROTOCOL_VERSION=2
From: Jeff King @ 2018-12-17 19:57 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: git, Junio C Hamano, Brandon Williams, Jonathan Tan
In-Reply-To: <87o99o1iot.fsf@evledraar.gmail.com>
On Fri, Dec 14, 2018 at 11:55:30AM +0100, Ævar Arnfjörð Bjarmason wrote:
> > An interesting implication of this at GitHub (and possibly other
> > hosters) is that it exposes objects from shared storage via unexpected
> > repos. If I fork torvalds/linux to peff/linux and push up object X, a v0
> > fetch will (by default) refuse to serve it to me. But v2 will happily
> > hand it over, which may confuse people into thinking that the object is
> > (or at least at some point was) in Linus's repository.
>
> More importantly this bypasses the security guarantee we've had with the
> default of uploadpack.allowAnySHA1InWant=false.
IMHO those security guarantees there are overrated (due to delta
guessing attacks, though things are not quite as bad if the attacker
can't actually push to the repo).
But I agree that people do assume it's the case. I was certainly
surprised by the v2 behavior, and I don't remember that aspect being
discussed.
(I suspect it was mostly because in the stateless world, the "fetch"
operation requires iterating all of the refs again. That's made even
harder by the fact that "ls-refs" takes arguments now, and might have
only advertised a subset of the refs. How can "fetch" know which ones
were advertised?).
> a) Subject to the "SECURITY" section of git-fetch, i.e. maybe this
> doesn't even work in the face of a determined attacker.
Yeah, this is the part I was thinking about above.
> b) Hosting sites like GitHub, GitLab, Gitweb etc. aren't doing
> reachability checks when you view /commit/<id>. If I delete my topic
> branch it'll be viewable until the next GC kicks in (which would
> also be the case with uploadpack.allowAnySHA1InWant=true).
Actually, we don't ever prune unless a user asks us to (and certainly
users do ask us to after accidentally pushing secret stuff). In GitHub's
case, though, we always tell people to consider anything pushed to a
public repo to be non-secret, even if it was exposed for only a few
seconds. My understanding is that there are literally clients watching
the public feeds and sucking down repo data looking for these kinds of
mistakes.
The /commit/<id> thing has been a frequent topic of discussion within
GitHub over the years, and we have looked at actually doing reachability
checks. They're expensive, though bitmaps help.
> So I wonder how much we need to care about this in practice, but for
> some configurations protocol v2 definitely breaks a security promise
> we've been making, now whether that promise was always BS due to "a)"
> above is another matter.
>
> I'm inclined to say that in the face of that "SECURITY" section we
> should just:
>
> * Turn on uploadpack.allowReachableSHA1InWant for v0/v1 by
> default. Make saying uploadpack.allowReachableSHA1InWant=false warn
> with "this won't work, see SECURITY...".
>
> * The uploadpack.allowTipSHA1InWant setting will also be turned on by
> default, and will be much faster, since it'll just degrade to
> uploadpack.allowReachableSHA1InWant=true and we won't need any
> reachability check. We'll also warn saying that setting it is
> useless.
No real argument from me. I have always thought those security
guarantees were BS.
However, I do think that's all orthogonal to the issue I was mentioning,
which is that it looks like repo A wants to serve you object X, even if
that repo never saw the object. That's unique to shared storage schemes,
though.
IMHO this is also a user education issue (it's a question of trust: refs
are the source of trust, and objects don't need to be trusted because
they're immutable). But it's pretty subtle for people who don't know the
inner workings of Git.
-Peff
^ permalink raw reply
* Re: pack file object size question
From: Jeff King @ 2018-12-17 19:39 UTC (permalink / raw)
To: Farhan Khan; +Cc: git
In-Reply-To: <CAFd4kYCHefqRsiFK=K7MHp=MTwOBXB5979WobEm3w1J5q1bZ0w@mail.gmail.com>
On Sun, Dec 16, 2018 at 04:52:13PM -0500, Farhan Khan wrote:
> It seems that there is a 12 byte header (signature, version, number of
> objects), then it immediately jumps into each individual object. The
> object consists of the object header, then the zlib deflated object,
> followed by a SHA1 of the above. Is this accurate?
Others discussed the length confusion, but I wanted to point out one
more thing: the packfile does not contain the sha1 of each object. That
is computed by index-pack (but there is a sha1 of the contents of the
_entire_ packfile).
A bit error on the wire will be detected by the whole-pack sha1. A bit
error on the sender's disk generally be detected by zlib, but not
always. The ultimate check that the receiver does is make sure it has
all of the expected objects by walking the object graph from the
proposed ref updates. Any object which has an undetected bit error will
appear to be be missing (as well as any object that the sender actually
just failed to send).
-Peff
^ permalink raw reply
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Duy Nguyen @ 2018-12-17 19:37 UTC (permalink / raw)
To: Elijah Newren; +Cc: Mark Kharitonov, Git Mailing List
In-Reply-To: <CABPp-BE9+qqVfccwzofD0qFecTGo2EjighNSu0vh-rF_8F5PoA@mail.gmail.com>
On Mon, Dec 17, 2018 at 6:17 PM Elijah Newren <newren@gmail.com> wrote:
>
> On Mon, Dec 17, 2018 at 8:26 AM Duy Nguyen <pclouds@gmail.com> wrote:
> >
> > On Mon, Dec 17, 2018 at 2:11 PM Mark Kharitonov
> > <mark.kharitonov@gmail.com> wrote:
> > >
> > > Hi,
> > > I have asked this question on SO
> > > (https://stackoverflow.com/questions/53679167/can-git-tell-me-which-uncommitted-files-clash-with-the-incoming-changes)
> > > and usually there are tons of responses on Git questions, but not on
> > > this one.
> > >
> > > Allow me to quote it now.
> > >
> > > Please, observe:
> > >
> > > C:\Dayforce\test [master ↓2 +0 ~2 -0 !]> git pull
> > > error: Your local changes to the following files would be
> > > overwritten by merge:
> > > 2.txt
> > > Please commit your changes or stash them before you merge.
> > > Aborting
> > > Updating 2dc8bd0..ea343f8
> > > C:\Dayforce\test [master ↓2 +0 ~2 -0 !]>
> > >
> > > Does git have a command that can tell me which uncommitted files cause
> > > the this error? I can see them displayed by git pull, but I really do
> > > not want to parse git pull output.
> >
> > Assume that you have done "git fetch origin" (or whatever master's
> > upstream is). Do
> >
> > git diff --name-only HEAD origin/master
> >
> > You get the list of files that will need to be updated. Do
> >
> > git diff --name-only
>
> Are you assuming that `git diff --cached --name-only` is empty? If it
> isn't, that alone will trigger a failure (unless using an esoteric
> merge strategy or an older version of git), so this assumption is
> fairly reasonable to make. But it may be worth being explicit about
> for external readers.
Actually I think Jeff's suggestion may be better since he compares
worktree with HEAD and should catch everything.
> > to get the list of files that have local changes. If this list shares
> > some paths with the first list, these paths will very likely cause
> > "git pull" to abort.
> >
> > For a better check, I think you need to do "git read-tree -m" by
> > yourself (to a temporary index file with --index-output) then you can
> > examine that file and determine what file has changed compared to HEAD
> > (and if the same file has local changes, git-pull will be aborted).
> > You may need to read more in read-tree man page.
> >
> > Ideally though, git-read-tree should be able to tell what paths are
> > updated in "--dry-run -u" mode. But I don't think it's supported yet.
>
> merge-recursive currently uses unpack_trees to do this "files would be
> overwritten by merge" checking, so the suggestion of read-tree (which
> also uses unpack_trees) makes sense. BUT ... the error checking in
> unpack_trees has both false positives and false negatives due to not
> understanding renames, and it is somewhat of a nightmarish mess. See
> [1] for details. Further, I think it warns in cases that shouldn't be
> needed (both sides of history modified the same file, with the
> modifications on HEAD's side being a superset of the changes on the
> other side, in such a way that 3-way content merge happens to match
> what is in HEAD already). So, while the suggestions made so far give
> some useful approximations, it's an approximation that will get worse
> over time.
Ah.. dang. I guess we need "git merge --dry-run" then :)
> I don't have a better approximation to provide at this
> time, though.
>
>
> Elijah
>
> [1] https://public-inbox.org/git/20171124195901.2581-1-newren@gmail.com/
> , starting at "Note that unpack_trees() doesn't understand renames"
> and running until "4-way merges simply cause the complexity to
> increase with every new capability."
--
Duy
^ permalink raw reply
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Jeff King @ 2018-12-17 19:35 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason; +Cc: Mark Kharitonov, git
In-Reply-To: <877eg80z1f.fsf@evledraar.gmail.com>
On Mon, Dec 17, 2018 at 07:49:00PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > The answer that gives will be overly broad (e.g., in a case where our
> > local branch had touched file "foo" but other side had not, we'd
> > consider "foo" as a difference the two-point diff-tree, whereas a real
> > 3-way merge would realize that we'd keep our version of "foo"). But it
> > might be good enough for your purposes.
>
> Isn't this done more simply with just running the merge with
> git-merge-tree? Maybe I'm missing something. E.g. earlier I had a
> conflict between a WIP series of mine in next in
> parse-options-cb.c. Just using git-merge-tree and grepping for conflict
> markers gives me what conflicted:
I forgot about the existence of merge-tree (though TBH I don't have a
huge amount of faith in antique plumbing tools like that that very few
people actually run these days).
It won't look at the working tree at all, but it could be used instead
of diff-tree to find the set of touched paths, and then that can be
correlated with the diff-files output. We'd want to see all paths, not
just conflicted ones, so you'd have to be a little fancy with the
parsing.
It's also not _quite_ the same as what git-pull is doing to merge, since
merge-recursive does fancy stuff like renames. But the distinction would
probably be OK for casual use.
-Peff
^ permalink raw reply
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Ævar Arnfjörð Bjarmason @ 2018-12-17 18:49 UTC (permalink / raw)
To: Jeff King; +Cc: Mark Kharitonov, git
In-Reply-To: <20181217162108.GB914@sigill.intra.peff.net>
On Mon, Dec 17 2018, Jeff King wrote:
> On Mon, Dec 17, 2018 at 08:08:49AM -0500, Mark Kharitonov wrote:
>
>> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]> git pull
>> error: Your local changes to the following files would be
>> overwritten by merge:
>> 2.txt
>> Please commit your changes or stash them before you merge.
>> Aborting
>> Updating 2dc8bd0..ea343f8
>> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]>
>>
>> Does git have a command that can tell me which uncommitted files cause
>> the this error? I can see them displayed by git pull, but I really do
>> not want to parse git pull output.
>
> That message is generated by merge-recursive, I believe after it's
> figured out which files would need to be touched.
>
> I don't offhand know of a way to get that _exact_ answer from another
> plumbing command. But in practice it would probably be reasonable to ask
> for the diff between your current branch and what you plan to merge, and
> cross-reference that with the list of files with local changes.
>
> Something like:
>
> git pull ;# this fails, but FETCH_HEAD is left over
>
> git diff-tree -z --name-only HEAD FETCH_HEAD >one
> git diff-index -z --name-only HEAD >two
> comm -z -12 one two
>
> would work on Linux, but "comm -z" is not portable (and I suspect you
> may not have comm at all on Windows). You can probably find a way to
> show the common elements of the two lists using the scripting language
> of your choice.
>
> The answer that gives will be overly broad (e.g., in a case where our
> local branch had touched file "foo" but other side had not, we'd
> consider "foo" as a difference the two-point diff-tree, whereas a real
> 3-way merge would realize that we'd keep our version of "foo"). But it
> might be good enough for your purposes.
Isn't this done more simply with just running the merge with
git-merge-tree? Maybe I'm missing something. E.g. earlier I had a
conflict between a WIP series of mine in next in
parse-options-cb.c. Just using git-merge-tree and grepping for conflict
markers gives me what conflicted:
$ git merge-tree origin/master unconditional-abbrev-2 origin/next|grep -E -e '^(merged|changed in both| (base|our|their|result))' -e '^\+======='
[...]
merged
result 100644 d70c6d9afb94c77c285fe8ee3237f7a40867157a packfile.h
our 100644 6c4037605d0dfee59a084c440506f1af11708d63 packfile.h
changed in both
base 100644 8c9edce52f63bcb1085b119b3a2264a97b1fb374 parse-options-cb.c
our 100644 1afc11a9901dba25dc0f6151e5d9a7654b6e3192 parse-options-cb.c
their 100644 e2f3eaed072f77d63890ec814d810199f57248d5 parse-options-cb.c
+=======
[...]
Or more simply, you can "grep -q" for '^\+=======' to ask "does this
conflict?".
^ permalink raw reply
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Elijah Newren @ 2018-12-17 17:17 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Mark Kharitonov, Git Mailing List
In-Reply-To: <CACsJy8ANoiWfmLkmO9ACab5H+m2c2y5uhKBJzGNwwxcs9zV0wA@mail.gmail.com>
On Mon, Dec 17, 2018 at 8:26 AM Duy Nguyen <pclouds@gmail.com> wrote:
>
> On Mon, Dec 17, 2018 at 2:11 PM Mark Kharitonov
> <mark.kharitonov@gmail.com> wrote:
> >
> > Hi,
> > I have asked this question on SO
> > (https://stackoverflow.com/questions/53679167/can-git-tell-me-which-uncommitted-files-clash-with-the-incoming-changes)
> > and usually there are tons of responses on Git questions, but not on
> > this one.
> >
> > Allow me to quote it now.
> >
> > Please, observe:
> >
> > C:\Dayforce\test [master ↓2 +0 ~2 -0 !]> git pull
> > error: Your local changes to the following files would be
> > overwritten by merge:
> > 2.txt
> > Please commit your changes or stash them before you merge.
> > Aborting
> > Updating 2dc8bd0..ea343f8
> > C:\Dayforce\test [master ↓2 +0 ~2 -0 !]>
> >
> > Does git have a command that can tell me which uncommitted files cause
> > the this error? I can see them displayed by git pull, but I really do
> > not want to parse git pull output.
>
> Assume that you have done "git fetch origin" (or whatever master's
> upstream is). Do
>
> git diff --name-only HEAD origin/master
>
> You get the list of files that will need to be updated. Do
>
> git diff --name-only
Are you assuming that `git diff --cached --name-only` is empty? If it
isn't, that alone will trigger a failure (unless using an esoteric
merge strategy or an older version of git), so this assumption is
fairly reasonable to make. But it may be worth being explicit about
for external readers.
> to get the list of files that have local changes. If this list shares
> some paths with the first list, these paths will very likely cause
> "git pull" to abort.
>
> For a better check, I think you need to do "git read-tree -m" by
> yourself (to a temporary index file with --index-output) then you can
> examine that file and determine what file has changed compared to HEAD
> (and if the same file has local changes, git-pull will be aborted).
> You may need to read more in read-tree man page.
>
> Ideally though, git-read-tree should be able to tell what paths are
> updated in "--dry-run -u" mode. But I don't think it's supported yet.
merge-recursive currently uses unpack_trees to do this "files would be
overwritten by merge" checking, so the suggestion of read-tree (which
also uses unpack_trees) makes sense. BUT ... the error checking in
unpack_trees has both false positives and false negatives due to not
understanding renames, and it is somewhat of a nightmarish mess. See
[1] for details. Further, I think it warns in cases that shouldn't be
needed (both sides of history modified the same file, with the
modifications on HEAD's side being a superset of the changes on the
other side, in such a way that 3-way content merge happens to match
what is in HEAD already). So, while the suggestions made so far give
some useful approximations, it's an approximation that will get worse
over time. I don't have a better approximation to provide at this
time, though.
Elijah
[1] https://public-inbox.org/git/20171124195901.2581-1-newren@gmail.com/
, starting at "Note that unpack_trees() doesn't understand renames"
and running until "4-way merges simply cause the complexity to
increase with every new capability."
^ permalink raw reply
* [PATCH] stripspace: allow -s/-c outside git repository
From: Jonathan Nieder @ 2018-12-17 16:59 UTC (permalink / raw)
To: git; +Cc: Han-Wen Nienhuys, Johannes Schindelin
v2.11.0-rc3~3^2~1 (stripspace: respect repository config, 2016-11-21)
improved stripspace --strip-comments / --comentlines by teaching them
to read repository config, but it went a little too far: when running
stripspace outside any repository, the result is
$ git stripspace --strip-comments <test-input
fatal: not a git repository (or any parent up to mount point /tmp)
That makes experimenting with the stripspace command unnecessarily
fussy. Fix it by discovering the git directory gently, as intended
all along.
Reported-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
builtin/stripspace.c | 3 ++-
t/t0030-stripspace.sh | 12 +++++++++---
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/builtin/stripspace.c b/builtin/stripspace.c
index bdf0328869..be33eb83c1 100644
--- a/builtin/stripspace.c
+++ b/builtin/stripspace.c
@@ -30,6 +30,7 @@ int cmd_stripspace(int argc, const char **argv, const char *prefix)
{
struct strbuf buf = STRBUF_INIT;
enum stripspace_mode mode = STRIP_DEFAULT;
+ int nongit;
const struct option options[] = {
OPT_CMDMODE('s', "strip-comments", &mode,
@@ -46,7 +47,7 @@ int cmd_stripspace(int argc, const char **argv, const char *prefix)
usage_with_options(stripspace_usage, options);
if (mode == STRIP_COMMENTS || mode == COMMENT_LINES) {
- setup_git_directory_gently(NULL);
+ setup_git_directory_gently(&nongit);
git_config(git_default_config, NULL);
}
diff --git a/t/t0030-stripspace.sh b/t/t0030-stripspace.sh
index 5ce47e8af5..0c24a0f9a3 100755
--- a/t/t0030-stripspace.sh
+++ b/t/t0030-stripspace.sh
@@ -430,9 +430,15 @@ test_expect_success '-c with changed comment char' '
test_expect_success '-c with comment char defined in .git/config' '
test_config core.commentchar = &&
printf "= foo\n" >expect &&
- printf "foo" | (
- mkdir sub && cd sub && git stripspace -c
- ) >actual &&
+ rm -fr sub &&
+ mkdir sub &&
+ printf "foo" | git -C sub stripspace -c >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success '-c outside git repository' '
+ printf "# foo\n" >expect &&
+ printf "foo" | nongit git stripspace -c >actual &&
test_cmp expect actual
'
--
2.20.0.405.gbc1bbc6f85
^ permalink raw reply related
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Duy Nguyen @ 2018-12-17 16:24 UTC (permalink / raw)
To: Mark Kharitonov; +Cc: Git Mailing List
In-Reply-To: <CAG2YSPzmN5u1uAPVbjsqC3LzVVinFR21-_6wfrkBHdPWhOfMfQ@mail.gmail.com>
On Mon, Dec 17, 2018 at 2:11 PM Mark Kharitonov
<mark.kharitonov@gmail.com> wrote:
>
> Hi,
> I have asked this question on SO
> (https://stackoverflow.com/questions/53679167/can-git-tell-me-which-uncommitted-files-clash-with-the-incoming-changes)
> and usually there are tons of responses on Git questions, but not on
> this one.
>
> Allow me to quote it now.
>
> Please, observe:
>
> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]> git pull
> error: Your local changes to the following files would be
> overwritten by merge:
> 2.txt
> Please commit your changes or stash them before you merge.
> Aborting
> Updating 2dc8bd0..ea343f8
> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]>
>
> Does git have a command that can tell me which uncommitted files cause
> the this error? I can see them displayed by git pull, but I really do
> not want to parse git pull output.
Assume that you have done "git fetch origin" (or whatever master's
upstream is). Do
git diff --name-only HEAD origin/master
You get the list of files that will need to be updated. Do
git diff --name-only
to get the list of files that have local changes. If this list shares
some paths with the first list, these paths will very likely cause
"git pull" to abort.
For a better check, I think you need to do "git read-tree -m" by
yourself (to a temporary index file with --index-output) then you can
examine that file and determine what file has changed compared to HEAD
(and if the same file has local changes, git-pull will be aborted).
You may need to read more in read-tree man page.
Ideally though, git-read-tree should be able to tell what paths are
updated in "--dry-run -u" mode. But I don't think it's supported yet.
--
Duy
^ permalink raw reply
* Re: Can git tell me which uncommitted files clash with the incoming changes?
From: Jeff King @ 2018-12-17 16:21 UTC (permalink / raw)
To: Mark Kharitonov; +Cc: git
In-Reply-To: <CAG2YSPzmN5u1uAPVbjsqC3LzVVinFR21-_6wfrkBHdPWhOfMfQ@mail.gmail.com>
On Mon, Dec 17, 2018 at 08:08:49AM -0500, Mark Kharitonov wrote:
> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]> git pull
> error: Your local changes to the following files would be
> overwritten by merge:
> 2.txt
> Please commit your changes or stash them before you merge.
> Aborting
> Updating 2dc8bd0..ea343f8
> C:\Dayforce\test [master ↓2 +0 ~2 -0 !]>
>
> Does git have a command that can tell me which uncommitted files cause
> the this error? I can see them displayed by git pull, but I really do
> not want to parse git pull output.
That message is generated by merge-recursive, I believe after it's
figured out which files would need to be touched.
I don't offhand know of a way to get that _exact_ answer from another
plumbing command. But in practice it would probably be reasonable to ask
for the diff between your current branch and what you plan to merge, and
cross-reference that with the list of files with local changes.
Something like:
git pull ;# this fails, but FETCH_HEAD is left over
git diff-tree -z --name-only HEAD FETCH_HEAD >one
git diff-index -z --name-only HEAD >two
comm -z -12 one two
would work on Linux, but "comm -z" is not portable (and I suspect you
may not have comm at all on Windows). You can probably find a way to
show the common elements of the two lists using the scripting language
of your choice.
The answer that gives will be overly broad (e.g., in a case where our
local branch had touched file "foo" but other side had not, we'd
consider "foo" as a difference the two-point diff-tree, whereas a real
3-way merge would realize that we'd keep our version of "foo"). But it
might be good enough for your purposes.
-Peff
^ permalink raw reply
* Re: [PATCH] log: add %S option (like --source) to log --format
From: Jeff King @ 2018-12-17 15:59 UTC (permalink / raw)
To: Issac Trotts; +Cc: Junio C Hamano, git, Noemi Mercado
In-Reply-To: <CANdyxMwxPqTMfLsoK-2JT3Wf3hXZnQNCPRS04aSHzsMbYJZo-Q@mail.gmail.com>
On Sun, Dec 16, 2018 at 10:25:14PM -0800, Issac Trotts wrote:
> Make it possible to write for example
>
> git log --format="%H,%S"
>
> where the %S at the end is a new placeholder that prints out the ref
> (tag/branch) for each commit.
Seems like a reasonable thing to want.
One curious thing about "--source" is that it requires cooperation from
the actual traversal. So if you did:
git rev-list | git diff-tree --format="%H %S"
we don't have the %S information in the latter command. I think that's
probably acceptable as long as it does something sane when we don't have
that information (e.g., replace it with an empty string). It might also
be worth calling out in the documentation.
-Peff
^ permalink raw reply
* Re: pack file object size question
From: Duy Nguyen @ 2018-12-17 15:31 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Farhan Khan, git
In-Reply-To: <20181217001446.GL75890@google.com>
On Sun, Dec 16, 2018 at 04:14:46PM -0800, Jonathan Nieder wrote:
> Hi,
>
> Farhan Khan wrote:
> >> Farhan Khan wrote:
>
> >>> I am having trouble figuring out the boundary between two objects in
> >>> the pack file.
> [...]
> > I think the issue is, the compressed object has a fixed
> > size and git inflates it, then moves on to the next object. I am
> > trying to figure out how where it identifies the size of the object.
>
> Do you mean the compressed size or uncompressed size?
>
> It sounds to me like pack-format.txt needs to do a better job of
> distinguishing the two.
How about something like this?
I mostly wrote this based on memory (and a very quick look at
index-pack) but I think we never ever really stored compressed
sizes. The "length" field (even in loose format) is always about
uncompressed size.
-- 8< --
diff --git a/Documentation/technical/pack-format.txt b/Documentation/technical/pack-format.txt
index cab5bdd2ff..4fd49f61d6 100644
--- a/Documentation/technical/pack-format.txt
+++ b/Documentation/technical/pack-format.txt
@@ -31,6 +31,11 @@ Git pack format
is an OBJ_OFS_DELTA object
compressed delta data
+ Note: The length (in bytes) is of uncompressed objects or
+ deltified representation. We're supposed to reach the end of zlib
+ stream once we have inflated the given length, otherwise it's a
+ corrupted pack file.
+
Observation: length of each object is encoded in a variable
length format and is not constrained to 32-bit or anything.
@@ -199,7 +204,8 @@ Pack file entry: <+
is the size before compression).
If it is REF_DELTA, then
20-byte base object name SHA-1 (the size above is the
- size of the delta data that follows).
+ size of the delta data that follows, before
+ compression).
delta data, deflated.
If it is OFS_DELTA, then
n-byte offset (see below) interpreted as a negative
-- 8< --
^ permalink raw reply related
* Re: [PATCH v4 4/6] revision: implement sparse algorithm
From: Derrick Stolee @ 2018-12-17 14:50 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Derrick Stolee via GitGitGadget, git, peff, jrnieder,
Junio C Hamano, Derrick Stolee
In-Reply-To: <87a7l41b78.fsf@evledraar.gmail.com>
On 12/17/2018 9:26 AM, Ævar Arnfjörð Bjarmason wrote:
> On Mon, Dec 17 2018, Derrick Stolee wrote:
>
>
>> As for adding progress to this step, I'm open to it. It can be done as
>> a sequel series.
> Okey. To clarify I wasn't complaining about the lack of progress output,
> we didn't have it before, just clarifying (as I've found out now) that
> when you're talking about "enumerating objects" in your commit message
> it's *not* what we're doing when we show the "Enumerating objects"
> progress bar, but an unrelated step prior to that.
Part of the problem is that in builtin/pack-objects.c, we have the
following:
if (progress)
progress_state = start_progress(_("Enumerating
objects"), 0);
if (!use_internal_rev_list)
read_object_list_from_stdin();
else {
get_object_list(rp.argc, rp.argv);
argv_array_clear(&rp);
}
cleanup_preferred_base();
if (include_tag && nr_result)
for_each_ref(add_ref_tag, NULL);
stop_progress(&progress_state);
and the logic for walking uninteresting objects is the
mark_edges_uninteresting() inside get_object_list() (both entirely
contained in this progress state). Perhaps the best thing to do is to
untangle the progress for the two modes based on
'use_internal_rev_list'. Could we then have the progress for
get_object_list() be "Walking objects" instead?
Thanks,
-Stolee
^ permalink raw reply
* Draft of Git Rev News edition 46
From: Christian Couder @ 2018-12-17 14:33 UTC (permalink / raw)
To: git
Cc: Junio C Hamano, Jakub Narebski, Markus Jansen, Gabriel Alcaras,
Jeff King, Johannes Schindelin,
Ævar Arnfjörð Bjarmason, Luca Milanesio,
Elijah Newren, Stefan Xenos, Stefan Beller, Nguyen Thai Ngoc Duy,
Thomas Gummerer, Eric Sunshine, Dan Fabulich, Kaartic Sivaraam,
Drew DeVault
Hi,
A draft of a new Git Rev News edition is available here:
https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-46.md
Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:
https://github.com/git/git.github.io/issues/321
You can also reply to this email.
In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.
I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.
Jakub, Markus, Gabriel and me plan to publish this edition on
Wednesday December 19th.
Thanks,
Christian.
^ permalink raw reply
* Re: [PATCH v4 4/6] revision: implement sparse algorithm
From: Ævar Arnfjörð Bjarmason @ 2018-12-17 14:26 UTC (permalink / raw)
To: Derrick Stolee
Cc: Derrick Stolee via GitGitGadget, git, peff, jrnieder,
Junio C Hamano, Derrick Stolee
In-Reply-To: <867aa5c3-60e0-2467-795a-40aac58f306b@gmail.com>
On Mon, Dec 17 2018, Derrick Stolee wrote:
> On 12/14/2018 6:32 PM, Ævar Arnfjörð Bjarmason wrote:
>> On Fri, Dec 14 2018, Derrick Stolee via GitGitGadget wrote:
>>
>>> Despite these potential drawbacks, the benefits of the algorithm
>>> are clear. By adding a counter to 'add_children_by_path' and
>>> 'mark_tree_contents_uninteresting', I measured the number of
>>> parsed trees for the two algorithms in a variety of repos.
>> We spend a long time printing those out before we ever get to
>> "Enumerating objects".
>>
>> Which was where I was trying to test this, i.e. is this a lot of work we
>> perform before we print out the progress bar, and regardless of this
>> optimization should have other progress output there, so we can see this
>> time we're spending on this?
>
> It is true that part of the problem is that a 'git push' will sit for
> a while without presenting any feedback until this part of the
> algorithm is complete. The current series intends to significantly
> reduce this time.
> As for adding progress to this step, I'm open to it. It can be done as
> a sequel series.
Okey. To clarify I wasn't complaining about the lack of progress output,
we didn't have it before, just clarifying (as I've found out now) that
when you're talking about "enumerating objects" in your commit message
it's *not* what we're doing when we show the "Enumerating objects"
progress bar, but an unrelated step prior to that.
I thought I might have been holding this wrong.
> What would we use to describe this section? "Enumerating remote
> objects"?
Isn't this "Discovering objects to push to remote" i.e. "Selecting
objects", but then what's "Enumerating objects" really doing? "Looping
over objects we selected before and creating a pack"?
^ permalink raw reply
* Re: [PATCH v4 4/6] revision: implement sparse algorithm
From: Derrick Stolee @ 2018-12-17 14:20 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason,
Derrick Stolee via GitGitGadget
Cc: git, peff, jrnieder, Junio C Hamano, Derrick Stolee
In-Reply-To: <87efaj1y77.fsf@evledraar.gmail.com>
On 12/14/2018 6:32 PM, Ævar Arnfjörð Bjarmason wrote:
> On Fri, Dec 14 2018, Derrick Stolee via GitGitGadget wrote:
>
>> Despite these potential drawbacks, the benefits of the algorithm
>> are clear. By adding a counter to 'add_children_by_path' and
>> 'mark_tree_contents_uninteresting', I measured the number of
>> parsed trees for the two algorithms in a variety of repos.
> We spend a long time printing those out before we ever get to
> "Enumerating objects".
>
> Which was where I was trying to test this, i.e. is this a lot of work we
> perform before we print out the progress bar, and regardless of this
> optimization should have other progress output there, so we can see this
> time we're spending on this?
It is true that part of the problem is that a 'git push' will sit for a
while without presenting any feedback until this part of the algorithm
is complete. The current series intends to significantly reduce this time.
As for adding progress to this step, I'm open to it. It can be done as a
sequel series.
What would we use to describe this section? "Enumerating remote objects"?
Thanks,
-Stolee
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox