* [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers
2024-03-11 18:55 [PATCH 0/3] cat-file: add %(objectmode) avoid verifying submodules' OIDs Johannes Schindelin via GitGitGadget
@ 2024-03-11 18:56 ` Victoria Dye via GitGitGadget
2024-03-11 21:54 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: Victoria Dye via GitGitGadget @ 2024-03-11 18:56 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Victoria Dye
From: Victoria Dye <vdye@github.com>
Update the 'run_tests' test wrapper so that the first argument may refer to
any specifier that uniquely identifies an object (e.g. a ref name,
'<OID>:<path>', '<OID>^{<type>}', etc.), rather than only a full object ID.
Also, add a test that uses a non-OID identifier, ensuring appropriate
parsing in 'cat-file'.
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
t/t1006-cat-file.sh | 46 +++++++++++++++++++++++----------------------
1 file changed, 24 insertions(+), 22 deletions(-)
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index e0c6482797e..ac1f754ee32 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -112,65 +112,66 @@ strlen () {
run_tests () {
type=$1
- sha1=$2
+ object_name=$2
+ oid=$(git rev-parse --verify $object_name)
size=$3
content=$4
pretty_content=$5
- batch_output="$sha1 $type $size
+ batch_output="$oid $type $size
$content"
test_expect_success "$type exists" '
- git cat-file -e $sha1
+ git cat-file -e $object_name
'
test_expect_success "Type of $type is correct" '
echo $type >expect &&
- git cat-file -t $sha1 >actual &&
+ git cat-file -t $object_name >actual &&
test_cmp expect actual
'
test_expect_success "Size of $type is correct" '
echo $size >expect &&
- git cat-file -s $sha1 >actual &&
+ git cat-file -s $object_name >actual &&
test_cmp expect actual
'
test_expect_success "Type of $type is correct using --allow-unknown-type" '
echo $type >expect &&
- git cat-file -t --allow-unknown-type $sha1 >actual &&
+ git cat-file -t --allow-unknown-type $object_name >actual &&
test_cmp expect actual
'
test_expect_success "Size of $type is correct using --allow-unknown-type" '
echo $size >expect &&
- git cat-file -s --allow-unknown-type $sha1 >actual &&
+ git cat-file -s --allow-unknown-type $object_name >actual &&
test_cmp expect actual
'
test -z "$content" ||
test_expect_success "Content of $type is correct" '
echo_without_newline "$content" >expect &&
- git cat-file $type $sha1 >actual &&
+ git cat-file $type $object_name >actual &&
test_cmp expect actual
'
test_expect_success "Pretty content of $type is correct" '
echo_without_newline "$pretty_content" >expect &&
- git cat-file -p $sha1 >actual &&
+ git cat-file -p $object_name >actual &&
test_cmp expect actual
'
test -z "$content" ||
test_expect_success "--batch output of $type is correct" '
echo "$batch_output" >expect &&
- echo $sha1 | git cat-file --batch >actual &&
+ echo $object_name | git cat-file --batch >actual &&
test_cmp expect actual
'
test_expect_success "--batch-check output of $type is correct" '
- echo "$sha1 $type $size" >expect &&
- echo_without_newline $sha1 | git cat-file --batch-check >actual &&
+ echo "$oid $type $size" >expect &&
+ echo_without_newline $object_name | git cat-file --batch-check >actual &&
test_cmp expect actual
'
@@ -179,33 +180,33 @@ $content"
test -z "$content" ||
test_expect_success "--batch-command $opt output of $type content is correct" '
echo "$batch_output" >expect &&
- test_write_lines "contents $sha1" | git cat-file --batch-command $opt >actual &&
+ test_write_lines "contents $object_name" | git cat-file --batch-command $opt >actual &&
test_cmp expect actual
'
test_expect_success "--batch-command $opt output of $type info is correct" '
- echo "$sha1 $type $size" >expect &&
- test_write_lines "info $sha1" |
+ echo "$oid $type $size" >expect &&
+ test_write_lines "info $object_name" |
git cat-file --batch-command $opt >actual &&
test_cmp expect actual
'
done
test_expect_success "custom --batch-check format" '
- echo "$type $sha1" >expect &&
- echo $sha1 | git cat-file --batch-check="%(objecttype) %(objectname)" >actual &&
+ echo "$type $oid" >expect &&
+ echo $object_name | git cat-file --batch-check="%(objecttype) %(objectname)" >actual &&
test_cmp expect actual
'
test_expect_success "custom --batch-command format" '
- echo "$type $sha1" >expect &&
- echo "info $sha1" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual &&
+ echo "$type $oid" >expect &&
+ echo "info $object_name" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual &&
test_cmp expect actual
'
test_expect_success '--batch-check with %(rest)' '
echo "$type this is some extra content" >expect &&
- echo "$sha1 this is some extra content" |
+ echo "$object_name this is some extra content" |
git cat-file --batch-check="%(objecttype) %(rest)" >actual &&
test_cmp expect actual
'
@@ -216,7 +217,7 @@ $content"
echo "$size" &&
echo "$content"
} >expect &&
- echo $sha1 | git cat-file --batch="%(objectsize)" >actual &&
+ echo $object_name | git cat-file --batch="%(objectsize)" >actual &&
test_cmp expect actual
'
@@ -226,7 +227,7 @@ $content"
echo "$type" &&
echo "$content"
} >expect &&
- echo $sha1 | git cat-file --batch="%(objecttype)" >actual &&
+ echo $object_name | git cat-file --batch="%(objecttype)" >actual &&
test_cmp expect actual
'
}
@@ -271,6 +272,7 @@ tree_size=$(($(test_oid rawsz) + 13))
tree_pretty_content="100644 blob $hello_sha1 hello${LF}"
run_tests 'tree' $tree_sha1 $tree_size "" "$tree_pretty_content"
+run_tests 'blob' "$tree_sha1:hello" $hello_size "" "$hello_content"
commit_message="Initial commit"
commit_sha1=$(echo_without_newline "$commit_message" | git commit-tree $tree_sha1)
--
gitgitgadget
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers
2024-03-11 18:56 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
@ 2024-03-11 21:54 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2024-03-11 21:54 UTC (permalink / raw)
To: Victoria Dye via GitGitGadget; +Cc: git, Johannes Schindelin, Victoria Dye
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> From: Victoria Dye <vdye@github.com>
>
> Update the 'run_tests' test wrapper so that the first argument may refer to
> any specifier that uniquely identifies an object (e.g. a ref name,
> '<OID>:<path>', '<OID>^{<type>}', etc.), rather than only a full object ID.
> Also, add a test that uses a non-OID identifier, ensuring appropriate
> parsing in 'cat-file'.
>
> Signed-off-by: Victoria Dye <vdye@github.com>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> t/t1006-cat-file.sh | 46 +++++++++++++++++++++++----------------------
> 1 file changed, 24 insertions(+), 22 deletions(-)
>
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index e0c6482797e..ac1f754ee32 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -112,65 +112,66 @@ strlen () {
>
> run_tests () {
> type=$1
> - sha1=$2
> + object_name=$2
> + oid=$(git rev-parse --verify $object_name)
> size=$3
> content=$4
> pretty_content=$5
>
> - batch_output="$sha1 $type $size
> + batch_output="$oid $type $size
> $content"
As "object_name" is now allowed to be any name in the 'extended
SHA-1' syntax (cf. Documentation/revisions.txt), you should be a bit
more careful in quoting.
oid=$(git rev-parse --verify "$object_name")
> test_expect_success "$type exists" '
> - git cat-file -e $sha1
> + git cat-file -e $object_name
> '
Likewise. You may not currently use a path with SP in it to name a
tree object, e.g., "HEAD:Read Me.txt", but protecting against such a
pathname is a cheap investment for futureproofing.
Looking good otherwise. Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands
@ 2025-06-02 18:55 Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Victoria Dye via GitGitGadget @ 2025-06-02 18:55 UTC (permalink / raw)
To: git; +Cc: peff, gitster, Victoria Dye
This series re-attempts the changes proposed last year [1] for extending the
information about tree entries available from the 'cat-file' batch format
commands. It also (hopefully) addresses the initial round of feedback that
series received.
The first patch updates 't1006-cat-file.sh' to test non-OID object
specifications. In response to the feedback in [2], I added more careful
quoting and a couple tests using paths with spaces. This change revealed a
(likely known) limitation of the '%(rest)' atom when processing object names
with spaces. To make that limitation explicit, I marked the relevant test as
expected to fail.
The second patch adds "mode" support. This is essentially unchanged from its
initial submission, save for some conflict resolution in the test script.
The final patch takes a different approach to submodule resolution than the
initial submission; rather than treat the entry as a "regular" commit object
with empty content, we now print an error message similar to the "missing",
"ambiguous", etc. cases, but with the tree entry's OID rather than the input
object name.
As for the motivation behind the change (re: [3]), the goal of this series
is to be able to get more of the information available internally about an
object in 'cat-file --batch[*]' -- in the case of a tree entry, the main
things missing were the file mode and the presence (and OID) of submodule
pointers. As Junio mentioned in [4], using a single long-running process to
resolve objects is far more performant than spawning multiple processes to
resolve tree entries with something like 'ls-tree', especially when
resolving entries across multiple trees or resolving a mix of tree entries
and OIDs, refnames, etc. The object resolution logic in 'cat-file' meant
that the mode & submodule OID information were already (mostly) available,
but we didn't have a way to output it.
The intent of this series is to make the new format options/outputs to get
those fields as unobtrusive as possible, but I'm happy to do something more
like the previous series if that would be preferable.
[1]
https://lore.kernel.org/git/pull.1689.git.1710183362.gitgitgadget@gmail.com/
[2] https://lore.kernel.org/git/xmqqle6oo2ns.fsf@gitster.g/ [3]
https://lore.kernel.org/git/20240312221758.GA109417@coredump.intra.peff.net/
[4] https://lore.kernel.org/git/xmqq1q8fl05r.fsf@gitster.g/
Victoria Dye (3):
t1006: update 'run_tests' to test generic object specifiers
cat-file: add %(objectmode) atom
cat-file.c: add batch handling for submodules
Documentation/git-cat-file.adoc | 13 ++++
builtin/cat-file.c | 14 +++-
t/t1006-cat-file.sh | 111 +++++++++++++++++++++++---------
3 files changed, 103 insertions(+), 35 deletions(-)
base-commit: 7014b55638da979331baf8dc31c4e1d697cf2d67
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1929%2Fvdye%2Fvdye%2Fcat-file-mode-submodule-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1929/vdye/vdye/cat-file-mode-submodule-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/1929
--
gitgitgadget
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers
2025-06-02 18:55 [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Victoria Dye via GitGitGadget
@ 2025-06-02 18:55 ` Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 2/3] cat-file: add %(objectmode) atom Victoria Dye via GitGitGadget
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Victoria Dye via GitGitGadget @ 2025-06-02 18:55 UTC (permalink / raw)
To: git; +Cc: peff, gitster, Victoria Dye, Victoria Dye
From: Victoria Dye <vdye@github.com>
Update the 'run_tests' test wrapper so that the first argument may refer to
any specifier that uniquely identifies an object (e.g. a ref name,
'<OID>:<path>', '<OID>^{<type>}', etc.), rather than only a full object ID.
Also add tests that use non-OID identifiers, ensuring appropriate parsing in
'cat-file'. The identifiers used in some of the added tests include a space,
which is incompatible with the '%(rest)' atom. To accommodate that without
removing the test case, use 'test_expect_failure' when 'object_name'
includes a space.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
---
t/t1006-cat-file.sh | 56 +++++++++++++++++++++++++++++----------------
1 file changed, 36 insertions(+), 20 deletions(-)
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 317da6869c88..7c9512a6b439 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -113,53 +113,54 @@ strlen () {
run_tests () {
type=$1
- oid=$2
+ object_name="$2"
size=$3
content=$4
pretty_content=$5
+ oid=${6:-"$object_name"}
batch_output="$oid $type $size
$content"
test_expect_success "$type exists" '
- git cat-file -e $oid
+ git cat-file -e "$object_name"
'
test_expect_success "Type of $type is correct" '
echo $type >expect &&
- git cat-file -t $oid >actual &&
+ git cat-file -t "$object_name" >actual &&
test_cmp expect actual
'
test_expect_success "Size of $type is correct" '
echo $size >expect &&
- git cat-file -s $oid >actual &&
+ git cat-file -s "$object_name" >actual &&
test_cmp expect actual
'
test -z "$content" ||
test_expect_success "Content of $type is correct" '
echo_without_newline "$content" >expect &&
- git cat-file $type $oid >actual &&
+ git cat-file $type "$object_name" >actual &&
test_cmp expect actual
'
test_expect_success "Pretty content of $type is correct" '
echo_without_newline "$pretty_content" >expect &&
- git cat-file -p $oid >actual &&
+ git cat-file -p "$object_name" >actual &&
test_cmp expect actual
'
test -z "$content" ||
test_expect_success "--batch output of $type is correct" '
echo "$batch_output" >expect &&
- echo $oid | git cat-file --batch >actual &&
+ echo "$object_name" | git cat-file --batch >actual &&
test_cmp expect actual
'
test_expect_success "--batch-check output of $type is correct" '
echo "$oid $type $size" >expect &&
- echo_without_newline $oid | git cat-file --batch-check >actual &&
+ echo_without_newline "$object_name" | git cat-file --batch-check >actual &&
test_cmp expect actual
'
@@ -168,13 +169,13 @@ $content"
test -z "$content" ||
test_expect_success "--batch-command $opt output of $type content is correct" '
echo "$batch_output" >expect &&
- test_write_lines "contents $oid" | git cat-file --batch-command $opt >actual &&
+ test_write_lines "contents $object_name" | git cat-file --batch-command $opt >actual &&
test_cmp expect actual
'
test_expect_success "--batch-command $opt output of $type info is correct" '
echo "$oid $type $size" >expect &&
- test_write_lines "info $oid" |
+ test_write_lines "info $object_name" |
git cat-file --batch-command $opt >actual &&
test_cmp expect actual
'
@@ -182,19 +183,28 @@ $content"
test_expect_success "custom --batch-check format" '
echo "$type $oid" >expect &&
- echo $oid | git cat-file --batch-check="%(objecttype) %(objectname)" >actual &&
+ echo "$object_name" | git cat-file --batch-check="%(objecttype) %(objectname)" >actual &&
test_cmp expect actual
'
test_expect_success "custom --batch-command format" '
echo "$type $oid" >expect &&
- echo "info $oid" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual &&
+ echo "info $object_name" | git cat-file --batch-command="%(objecttype) %(objectname)" >actual &&
test_cmp expect actual
'
- test_expect_success '--batch-check with %(rest)' '
+ # FIXME: %(rest) is incompatible with object names that include whitespace,
+ # e.g. HEAD:path/to/a/file with spaces. Use the resolved OID as input to
+ # test this instead of the raw object name.
+ if echo "$object_name" | grep " "; then
+ test_rest=test_expect_failure
+ else
+ test_rest=test_expect_success
+ fi
+
+ $test_rest '--batch-check with %(rest)' '
echo "$type this is some extra content" >expect &&
- echo "$oid this is some extra content" |
+ echo "$object_name this is some extra content" |
git cat-file --batch-check="%(objecttype) %(rest)" >actual &&
test_cmp expect actual
'
@@ -205,7 +215,7 @@ $content"
echo "$size" &&
echo "$content"
} >expect &&
- echo $oid | git cat-file --batch="%(objectsize)" >actual &&
+ echo "$object_name" | git cat-file --batch="%(objectsize)" >actual &&
test_cmp expect actual
'
@@ -215,7 +225,7 @@ $content"
echo "$type" &&
echo "$content"
} >expect &&
- echo $oid | git cat-file --batch="%(objecttype)" >actual &&
+ echo "$object_name" | git cat-file --batch="%(objecttype)" >actual &&
test_cmp expect actual
'
}
@@ -230,6 +240,8 @@ test_expect_success "setup" '
git config extensions.compatobjectformat $test_compat_hash_algo &&
echo_without_newline "$hello_content" > hello &&
git update-index --add hello &&
+ echo_without_newline "$hello_content" > "path with spaces" &&
+ git update-index --add --chmod=+x "path with spaces" &&
git commit -m "add hello file"
'
@@ -269,13 +281,17 @@ test_expect_success '--batch-check without %(rest) considers whole line' '
tree_oid=$(git write-tree)
tree_compat_oid=$(git rev-parse --output-object-format=$test_compat_hash_algo $tree_oid)
-tree_size=$(($(test_oid rawsz) + 13))
-tree_compat_size=$(($(test_oid --hash=compat rawsz) + 13))
-tree_pretty_content="100644 blob $hello_oid hello${LF}"
-tree_compat_pretty_content="100644 blob $hello_compat_oid hello${LF}"
+tree_size=$((2 * $(test_oid rawsz) + 13 + 24))
+tree_compat_size=$((2 * $(test_oid --hash=compat rawsz) + 13 + 24))
+tree_pretty_content="100644 blob $hello_oid hello${LF}100755 blob $hello_oid path with spaces${LF}"
+tree_compat_pretty_content="100644 blob $hello_compat_oid hello${LF}100755 blob $hello_compat_oid path with spaces${LF}"
run_tests 'tree' $tree_oid $tree_size "" "$tree_pretty_content"
run_tests 'tree' $tree_compat_oid $tree_compat_size "" "$tree_compat_pretty_content"
+run_tests 'blob' "$tree_oid:hello" $hello_size "" "$hello_content" $hello_oid
+run_tests 'blob' "$tree_compat_oid:hello" $hello_size "" "$hello_content" $hello_compat_oid
+run_tests 'blob' "$tree_oid:path with spaces" $hello_size "" "$hello_content" $hello_oid
+run_tests 'blob' "$tree_compat_oid:path with spaces" $hello_size "" "$hello_content" $hello_compat_oid
commit_message="Initial commit"
commit_oid=$(echo_without_newline "$commit_message" | git commit-tree $tree_oid)
--
gitgitgadget
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/3] cat-file: add %(objectmode) atom
2025-06-02 18:55 [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
@ 2025-06-02 18:55 ` Victoria Dye via GitGitGadget
2025-06-04 19:36 ` Jeff King
2025-06-02 18:55 ` [PATCH 3/3] cat-file.c: add batch handling for submodules Victoria Dye via GitGitGadget
2025-06-04 14:43 ` [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Junio C Hamano
3 siblings, 1 reply; 12+ messages in thread
From: Victoria Dye via GitGitGadget @ 2025-06-02 18:55 UTC (permalink / raw)
To: git; +Cc: peff, gitster, Victoria Dye, Victoria Dye
From: Victoria Dye <vdye@github.com>
Add a formatting atom, used with the --batch-check/--batch-command options,
that prints the octal representation of the object mode if a given revision
includes that information, e.g. one that follows the format
<tree-ish>:<path>. If the mode information does not exist, an empty string
is printed instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Victoria Dye <vdye@github.com>
---
Documentation/git-cat-file.adoc | 5 +++++
builtin/cat-file.c | 9 ++++++--
t/t1006-cat-file.sh | 38 +++++++++++++++++++--------------
3 files changed, 34 insertions(+), 18 deletions(-)
diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
index cde79ad242bb..5c002c0499e4 100644
--- a/Documentation/git-cat-file.adoc
+++ b/Documentation/git-cat-file.adoc
@@ -307,6 +307,11 @@ newline. The available atoms are:
`objecttype`::
The type of the object (the same as `cat-file -t` reports).
+`objectmode`::
+ If the specified object has mode information (such as a tree or
+ index entry), the mode expressed as an octal integer. Otherwise,
+ empty string.
+
`objectsize`::
The size, in bytes, of the object (the same as `cat-file -s`
reports).
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 67a5ff2b9ebd..b11576756bcc 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -275,6 +275,7 @@ struct expand_data {
struct object_id oid;
enum object_type type;
unsigned long size;
+ unsigned short mode;
off_t disk_size;
const char *rest;
struct object_id delta_base_oid;
@@ -306,6 +307,7 @@ struct expand_data {
*/
unsigned skip_object_info : 1;
};
+#define EXPAND_DATA_INIT { .mode = S_IFINVALID }
static int is_atom(const char *atom, const char *s, int slen)
{
@@ -345,6 +347,9 @@ static int expand_atom(struct strbuf *sb, const char *atom, int len,
else
strbuf_addstr(sb,
oid_to_hex(&data->delta_base_oid));
+ } else if (is_atom("objectmode", atom, len)) {
+ if (!data->mark_query && !(S_IFINVALID == data->mode))
+ strbuf_addf(sb, "%06o", data->mode);
} else
return 0;
return 1;
@@ -613,6 +618,7 @@ static void batch_one_object(const char *obj_name,
goto out;
}
+ data->mode = ctx.mode;
batch_object_write(obj_name, scratch, opt, data, NULL, 0);
out:
@@ -866,7 +872,7 @@ static int batch_objects(struct batch_options *opt)
{
struct strbuf input = STRBUF_INIT;
struct strbuf output = STRBUF_INIT;
- struct expand_data data;
+ struct expand_data data = EXPAND_DATA_INIT;
int save_warning;
int retval = 0;
@@ -875,7 +881,6 @@ static int batch_objects(struct batch_options *opt)
* object_info to be handed to oid_object_info_extended for each
* object.
*/
- memset(&data, 0, sizeof(data));
data.mark_query = 1;
expand_format(&output,
opt->format ? opt->format : DEFAULT_FORMAT,
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 7c9512a6b439..97052b3f31f1 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -114,10 +114,11 @@ strlen () {
run_tests () {
type=$1
object_name="$2"
- size=$3
- content=$4
- pretty_content=$5
- oid=${6:-"$object_name"}
+ mode=$3
+ size=$4
+ content=$5
+ pretty_content=$6
+ oid=${7:-"$object_name"}
batch_output="$oid $type $size
$content"
@@ -209,6 +210,12 @@ $content"
test_cmp expect actual
'
+ test_expect_success '--batch-check with %(objectmode)' '
+ echo "$mode $oid" >expect &&
+ echo $object_name | git cat-file --batch-check="%(objectmode) %(objectname)" >actual &&
+ test_cmp expect actual
+ '
+
test -z "$content" ||
test_expect_success "--batch without type ($type)" '
{
@@ -247,8 +254,7 @@ test_expect_success "setup" '
run_blob_tests () {
oid=$1
-
- run_tests 'blob' $oid $hello_size "$hello_content" "$hello_content"
+ run_tests 'blob' $oid "" $hello_size "$hello_content" "$hello_content"
test_expect_success '--batch-command --buffer with flush for blob info' '
echo "$oid blob $hello_size" >expect &&
@@ -286,12 +292,12 @@ tree_compat_size=$((2 * $(test_oid --hash=compat rawsz) + 13 + 24))
tree_pretty_content="100644 blob $hello_oid hello${LF}100755 blob $hello_oid path with spaces${LF}"
tree_compat_pretty_content="100644 blob $hello_compat_oid hello${LF}100755 blob $hello_compat_oid path with spaces${LF}"
-run_tests 'tree' $tree_oid $tree_size "" "$tree_pretty_content"
-run_tests 'tree' $tree_compat_oid $tree_compat_size "" "$tree_compat_pretty_content"
-run_tests 'blob' "$tree_oid:hello" $hello_size "" "$hello_content" $hello_oid
-run_tests 'blob' "$tree_compat_oid:hello" $hello_size "" "$hello_content" $hello_compat_oid
-run_tests 'blob' "$tree_oid:path with spaces" $hello_size "" "$hello_content" $hello_oid
-run_tests 'blob' "$tree_compat_oid:path with spaces" $hello_size "" "$hello_content" $hello_compat_oid
+run_tests 'tree' $tree_oid "" $tree_size "" "$tree_pretty_content"
+run_tests 'tree' $tree_compat_oid "" $tree_compat_size "" "$tree_compat_pretty_content"
+run_tests 'blob' "$tree_oid:hello" "100644" $hello_size "" "$hello_content" $hello_oid
+run_tests 'blob' "$tree_compat_oid:hello" "100644" $hello_size "" "$hello_content" $hello_compat_oid
+run_tests 'blob' "$tree_oid:path with spaces" "100755" $hello_size "" "$hello_content" $hello_oid
+run_tests 'blob' "$tree_compat_oid:path with spaces" "100755" $hello_size "" "$hello_content" $hello_compat_oid
commit_message="Initial commit"
commit_oid=$(echo_without_newline "$commit_message" | git commit-tree $tree_oid)
@@ -310,8 +316,8 @@ committer $GIT_COMMITTER_NAME <$GIT_COMMITTER_EMAIL> $GIT_COMMITTER_DATE
$commit_message"
-run_tests 'commit' $commit_oid $commit_size "$commit_content" "$commit_content"
-run_tests 'commit' $commit_compat_oid $commit_compat_size "$commit_compat_content" "$commit_compat_content"
+run_tests 'commit' $commit_oid "" $commit_size "$commit_content" "$commit_content"
+run_tests 'commit' $commit_compat_oid "" $commit_compat_size "$commit_compat_content" "$commit_compat_content"
tag_header_without_oid="type blob
tag hellotag
@@ -334,8 +340,8 @@ tag_size=$(strlen "$tag_content")
tag_compat_oid=$(git rev-parse --output-object-format=$test_compat_hash_algo $tag_oid)
tag_compat_size=$(strlen "$tag_compat_content")
-run_tests 'tag' $tag_oid $tag_size "$tag_content" "$tag_content"
-run_tests 'tag' $tag_compat_oid $tag_compat_size "$tag_compat_content" "$tag_compat_content"
+run_tests 'tag' $tag_oid "" $tag_size "$tag_content" "$tag_content"
+run_tests 'tag' $tag_compat_oid "" $tag_compat_size "$tag_compat_content" "$tag_compat_content"
test_expect_success "Reach a blob from a tag pointing to it" '
echo_without_newline "$hello_content" >expect &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/3] cat-file.c: add batch handling for submodules
2025-06-02 18:55 [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 2/3] cat-file: add %(objectmode) atom Victoria Dye via GitGitGadget
@ 2025-06-02 18:55 ` Victoria Dye via GitGitGadget
2025-06-04 19:54 ` Jeff King
2025-06-04 14:43 ` [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Junio C Hamano
3 siblings, 1 reply; 12+ messages in thread
From: Victoria Dye via GitGitGadget @ 2025-06-02 18:55 UTC (permalink / raw)
To: git; +Cc: peff, gitster, Victoria Dye, Victoria Dye
From: Victoria Dye <vdye@github.com>
When an object specification is passed to 'cat-file --batch[-check]'
referring to a submodule (e.g. 'HEAD:path/to/my/submodule'), the current
behavior of the command is to print the "missing" error message. However, it
is often valuable for callers to distinguish between paths that are actually
missing and "the submodule tree entry exists, but the object does not exist
in the repository".
To disambiguate without needing to invoke a separate Git process (e.g.
'ls-tree'), print the message "<oid> submodule" for such objects instead of
"<object> missing". In addition to the change from "missing" to "submodule",
the new message differs from the old in that it always prints the resolved
tree entry's OID, rather than the input object specification.
Note that this implementation maintains a distinction between submodules
where the commit OID is not present in the repo, and submodules where the
commit OID *is* present; the former will now print "<object> submodule", but
the latter will still print the full object content.
Signed-off-by: Victoria Dye <vdye@github.com>
---
Documentation/git-cat-file.adoc | 8 ++++++++
builtin/cat-file.c | 5 ++++-
t/t1006-cat-file.sh | 25 +++++++++++++++++++++++++
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/Documentation/git-cat-file.adoc b/Documentation/git-cat-file.adoc
index 5c002c0499e4..180d1ad363fd 100644
--- a/Documentation/git-cat-file.adoc
+++ b/Documentation/git-cat-file.adoc
@@ -373,6 +373,14 @@ If a name is specified that might refer to more than one object (an ambiguous sh
<object> SP ambiguous LF
------------
+If a name is specified that refers to a submodule entry in a tree and the
+target object does not exist in the repository, then `cat-file` will ignore
+any custom format and print (with the object ID of the submodule):
+
+------------
+<oid> SP submodule LF
+------------
+
If `--follow-symlinks` is used, and a symlink in the repository points
outside the repository, then `cat-file` will ignore any custom format
and print:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index b11576756bcc..4b23fcecbd8e 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -496,7 +496,10 @@ static void batch_object_write(const char *obj_name,
&data->oid, &data->info,
OBJECT_INFO_LOOKUP_REPLACE);
if (ret < 0) {
- report_object_status(opt, obj_name, &data->oid, "missing");
+ if (data->mode == S_IFGITLINK)
+ report_object_status(opt, oid_to_hex(&data->oid), &data->oid, "submodule");
+ else
+ report_object_status(opt, obj_name, &data->oid, "missing");
return;
}
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index 97052b3f31f1..f123ef1e360a 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -1220,6 +1220,31 @@ test_expect_success 'cat-file --batch-check respects replace objects' '
test_cmp expect actual
'
+test_expect_success 'batch-check with a submodule' '
+ # FIXME: this call to mktree is incompatible with compatObjectFormat
+ # because the submodule OID cannot be mapped to the compat hash algo.
+ test_unconfig extensions.compatobjectformat &&
+ printf "160000 commit $(test_oid deadbeef)\tsub\n" >tree-with-sub &&
+ tree=$(git mktree <tree-with-sub) &&
+ test_config extensions.compatobjectformat $test_compat_hash_algo &&
+
+ git cat-file --batch-check >actual <<-EOF &&
+ $tree:sub
+ EOF
+ printf "$(test_oid deadbeef) submodule\n" >expect &&
+ test_cmp expect actual
+'
+
+test_expect_success 'batch-check with a submodule, object exists' '
+ printf "160000 commit $commit_oid\tsub\n" >tree-with-sub &&
+ tree=$(git mktree <tree-with-sub) &&
+ git cat-file --batch-check >actual <<-EOF &&
+ $tree:sub
+ EOF
+ printf "$commit_oid commit $commit_size\n" >expect &&
+ test_cmp expect actual
+'
+
# Pull the entry for object with oid "$1" out of the output of
# "cat-file --batch", including its object content (which requires
# parsing and reading a set amount of bytes, hence perl).
--
gitgitgadget
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands
2025-06-02 18:55 [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Victoria Dye via GitGitGadget
` (2 preceding siblings ...)
2025-06-02 18:55 ` [PATCH 3/3] cat-file.c: add batch handling for submodules Victoria Dye via GitGitGadget
@ 2025-06-04 14:43 ` Junio C Hamano
2025-06-04 19:57 ` Jeff King
3 siblings, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2025-06-04 14:43 UTC (permalink / raw)
To: Victoria Dye via GitGitGadget; +Cc: git, peff, Victoria Dye
"Victoria Dye via GitGitGadget" <gitgitgadget@gmail.com> writes:
> This series re-attempts the changes proposed last year [1] for extending the
> information about tree entries available from the 'cat-file' batch format
> commands. It also (hopefully) addresses the initial round of feedback that
> series received.
>
> The first patch updates 't1006-cat-file.sh' to test non-OID object
> specifications. In response to the feedback in [2], I added more careful
> quoting and a couple tests using paths with spaces. This change revealed a
> (likely known) limitation of the '%(rest)' atom when processing object names
> with spaces. To make that limitation explicit, I marked the relevant test as
> expected to fail.
>
> The second patch adds "mode" support. This is essentially unchanged from its
> initial submission, save for some conflict resolution in the test script.
>
> The final patch takes a different approach to submodule resolution than the
> initial submission; rather than treat the entry as a "regular" commit object
> with empty content, we now print an error message similar to the "missing",
> "ambiguous", etc. cases, but with the tree entry's OID rather than the input
> object name.
I did not send any line-by-line reviews, but after reading these
patches I didn't see anything questionable. Unless we see others
comments that need to be addressed, let's merge it to 'next' in
preparation for the next cycle.
Thanks.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/3] cat-file: add %(objectmode) atom
2025-06-02 18:55 ` [PATCH 2/3] cat-file: add %(objectmode) atom Victoria Dye via GitGitGadget
@ 2025-06-04 19:36 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2025-06-04 19:36 UTC (permalink / raw)
To: Victoria Dye via GitGitGadget; +Cc: git, gitster, Victoria Dye
On Mon, Jun 02, 2025 at 06:55:54PM +0000, Victoria Dye via GitGitGadget wrote:
> Add a formatting atom, used with the --batch-check/--batch-command options,
> that prints the octal representation of the object mode if a given revision
> includes that information, e.g. one that follows the format
> <tree-ish>:<path>. If the mode information does not exist, an empty string
> is printed instead.
Overall, this looks good to me. I have a few small comments below,
though I'm not sure if they merit a re-roll or not.
> @@ -345,6 +347,9 @@ static int expand_atom(struct strbuf *sb, const char *atom, int len,
> else
> strbuf_addstr(sb,
> oid_to_hex(&data->delta_base_oid));
> + } else if (is_atom("objectmode", atom, len)) {
> + if (!data->mark_query && !(S_IFINVALID == data->mode))
> + strbuf_addf(sb, "%06o", data->mode);
> } else
> return 0;
> return 1;
Looking at this hunk raised a few questions. Fortunately with answers. ;)
First, in other parts of this if/else chain, when mark_query is set we
need to perform some action (usually setting up the object_info
pointers). But we _don't_ need to do that here, since we get the mode
info "for free" from get_oid_with_context(). Good.
Second, how do we reliably get S_IFINVALID? We can see that the
expand_data struct is now initialized with it:
> +#define EXPAND_DATA_INIT { .mode = S_IFINVALID }
But that seems like it would be a bug, since we only initialize it once,
in batch_objects():
> @@ -866,7 +872,7 @@ static int batch_objects(struct batch_options *opt)
> {
> struct strbuf input = STRBUF_INIT;
> struct strbuf output = STRBUF_INIT;
> - struct expand_data data;
> + struct expand_data data = EXPAND_DATA_INIT;
> int save_warning;
> int retval = 0;
>
> @@ -875,7 +881,6 @@ static int batch_objects(struct batch_options *opt)
> * object_info to be handed to oid_object_info_extended for each
> * object.
> */
> - memset(&data, 0, sizeof(data));
> data.mark_query = 1;
> expand_format(&output,
> opt->format ? opt->format : DEFAULT_FORMAT,
>
> static int is_atom(const char *atom, const char *s, int slen)
> {
...and then call batch_one_object() over and over. So at first glance,
doing this:
(echo HEAD:Makefile; echo HEAD) |
git cat-file --batch-check='%(objectmode)'
would let the mode from the first object bleed over into the second. But
that doesn't happen, because we overwrite expand_data.mode for each
object unconditionally, here:
> @@ -613,6 +618,7 @@ static void batch_one_object(const char *obj_name,
> goto out;
> }
>
> + data->mode = ctx.mode;
> batch_object_write(obj_name, scratch, opt, data, NULL, 0);
>
> out:
And there we are relying on ctx.mode, which we get from
get_oid_with_context(), which always falls back to S_IFINVALID if no
mode is available. Good.
But I think that means that the value set in EXPAND_DATA_INIT is never
used, and we could continue to zero-initialize the struct with memset?
That said, it's probably OK to err on the side of over-initializing. The
worst case is probably somebody later reading the code being confused
about the importance of the line. And at best it may prevent a future
code path from unexpectedly reading a funny value.
And on to the third question. In the non-batch code path of
cat_one_file(), we do:
if (obj_context.mode == S_IFINVALID)
obj_context.mode = 0100644;
which made me wonder if we should be harmonizing our behavior. But that
mode is used only for passing to filter_object() and textconv_object().
Neither of which really care about the mode, and this is mostly just
saying "eh, do your regular thing as if it were a blob we found at
--path". I suspect we could get the same effect by just passing a
hard-coded 100644 to those functions, but probably not worth changing
now (and certainly very orthogonal to your patch). But the important
thing is we do not really need to worry about being consistent with this
line. Good.
-Peff
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] cat-file.c: add batch handling for submodules
2025-06-02 18:55 ` [PATCH 3/3] cat-file.c: add batch handling for submodules Victoria Dye via GitGitGadget
@ 2025-06-04 19:54 ` Jeff King
2025-06-05 0:12 ` Victoria Dye
0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2025-06-04 19:54 UTC (permalink / raw)
To: Victoria Dye via GitGitGadget; +Cc: git, gitster, Victoria Dye
On Mon, Jun 02, 2025 at 06:55:55PM +0000, Victoria Dye via GitGitGadget wrote:
> To disambiguate without needing to invoke a separate Git process (e.g.
> 'ls-tree'), print the message "<oid> submodule" for such objects instead of
> "<object> missing". In addition to the change from "missing" to "submodule",
> the new message differs from the old in that it always prints the resolved
> tree entry's OID, rather than the input object specification.
OK. I read over the discussion from last year, which I think mostly
centered around this patch. I do still think in the long run it would be
nice for cat-file to produce what output it _can_ for a missing object
(e.g., the oid and mode).
But I think it is OK to punt on that for now. Because "<oid> missing"
lines already exist, we'd probably need to put such behavior behind a
new command-line option. So while "<oid> submodule" lines would be
unnecessary in that hypothetical future world, we are not digging the
hole any deeper, from a backwards-compatibility standpoint.
Although speaking of backwards compatibility, I guess older readers may
be surprised that the old "missing" message becomes a "submodule" one.
They may need to be updated if they were written carefully to bail on
unknown input (and were happy seeing "missing" messages for submodules).
So there may be some fallout, but it's not like the existing messages
were particularly useful in the first place.
> Note that this implementation maintains a distinction between submodules
> where the commit OID is not present in the repo, and submodules where the
> commit OID *is* present; the former will now print "<object> submodule", but
> the latter will still print the full object content.
Hmm, that is an interesting point. It feels kind of arbitrary, but I'm
having trouble making a strong argument for one direction or the other.
The way you've written it means that readers need to be prepared to
parse _both_ the mode and "<oid> submodule" lines to find submodules.
But maybe there's some value in finding out more information about
submodule commits you do have in-repo.
The implementations are similar. Replacing this hunk:
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index b11576756bcc..4b23fcecbd8e 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -496,7 +496,10 @@ static void batch_object_write(const char *obj_name,
> &data->oid, &data->info,
> OBJECT_INFO_LOOKUP_REPLACE);
> if (ret < 0) {
> - report_object_status(opt, obj_name, &data->oid, "missing");
> + if (data->mode == S_IFGITLINK)
> + report_object_status(opt, oid_to_hex(&data->oid), &data->oid, "submodule");
> + else
> + report_object_status(opt, obj_name, &data->oid, "missing");
> return;
> }
>
with:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 4b23fcecbd..1b200e1607 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -488,6 +488,11 @@ static void batch_object_write(const char *obj_name,
if (opt->objects_filter.choice == LOFC_BLOB_LIMIT)
data->info.sizep = &data->size;
+ if (data->mode == S_IFGITLINK) {
+ report_object_status(opt, oid_to_hex(&data->oid), &data->oid, "submodule");
+ return;
+ }
+
if (pack)
ret = packed_object_info(the_repository, pack, offset,
&data->info);
so I think the decision is really about what people will find most
useful. So I dunno. It is mostly a coin-flip, leading me to say that
what you picked just came up "heads" and is good enough. ;)
-Peff
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands
2025-06-04 14:43 ` [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Junio C Hamano
@ 2025-06-04 19:57 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2025-06-04 19:57 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Victoria Dye via GitGitGadget, git, Victoria Dye
On Wed, Jun 04, 2025 at 07:43:21AM -0700, Junio C Hamano wrote:
> > The final patch takes a different approach to submodule resolution than the
> > initial submission; rather than treat the entry as a "regular" commit object
> > with empty content, we now print an error message similar to the "missing",
> > "ambiguous", etc. cases, but with the tree entry's OID rather than the input
> > object name.
>
> I did not send any line-by-line reviews, but after reading these
> patches I didn't see anything questionable. Unless we see others
> comments that need to be addressed, let's merge it to 'next' in
> preparation for the next cycle.
I left a more detailed review. But it's mostly musing and self-answering
questions. I'd be OK to see this progress as-is.
-Peff
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] cat-file.c: add batch handling for submodules
2025-06-04 19:54 ` Jeff King
@ 2025-06-05 0:12 ` Victoria Dye
2025-06-05 7:51 ` Jeff King
0 siblings, 1 reply; 12+ messages in thread
From: Victoria Dye @ 2025-06-05 0:12 UTC (permalink / raw)
To: Jeff King, Victoria Dye via GitGitGadget; +Cc: git, gitster
Jeff King wrote:
> On Mon, Jun 02, 2025 at 06:55:55PM +0000, Victoria Dye via GitGitGadget wrote:
>
>> To disambiguate without needing to invoke a separate Git process (e.g.
>> 'ls-tree'), print the message "<oid> submodule" for such objects instead of
>> "<object> missing". In addition to the change from "missing" to "submodule",
>> the new message differs from the old in that it always prints the resolved
>> tree entry's OID, rather than the input object specification.
>
> OK. I read over the discussion from last year, which I think mostly
> centered around this patch. I do still think in the long run it would be
> nice for cat-file to produce what output it _can_ for a missing object
> (e.g., the oid and mode).
One way to handle that could be changing the message to something like:
submodule SP <mode> SP <oid>
but...
>
> But I think it is OK to punt on that for now. Because "<oid> missing"
> lines already exist, we'd probably need to put such behavior behind a
> new command-line option. So while "<oid> submodule" lines would be
> unnecessary in that hypothetical future world, we are not digging the
> hole any deeper, from a backwards-compatibility standpoint.
>
> Although speaking of backwards compatibility, I guess older readers may
> be surprised that the old "missing" message becomes a "submodule" one.
> They may need to be updated if they were written carefully to bail on
> unknown input (and were happy seeing "missing" messages for submodules).
> So there may be some fallout, but it's not like the existing messages
> were particularly useful in the first place.
...I suspect that'd be even less compatible with existing automation around
'cat-file' than just swapping out "submodule" for "missing", and users can
theoretically infer that the mode is 160000 (S_IFGITLINK). That said, if at
some point in the future we support submodules with a different mode, then
an explicit value would be fairly useful.
Happy to change it or keep it the same, I have no strong preference either
way.
>
>> Note that this implementation maintains a distinction between submodules
>> where the commit OID is not present in the repo, and submodules where the
>> commit OID *is* present; the former will now print "<object> submodule", but
>> the latter will still print the full object content.
>
> Hmm, that is an interesting point. It feels kind of arbitrary, but I'm
> having trouble making a strong argument for one direction or the other.
> The way you've written it means that readers need to be prepared to
> parse _both_ the mode and "<oid> submodule" lines to find submodules.
> But maybe there's some value in finding out more information about
> submodule commits you do have in-repo.
This was pretty much my thought process on it. It was a somewhat arbitrary
choice, but what tipped me towards distinguishing the cases is that I'd
rather have information like size, content, etc. about a commit and not need
to use it, than need it but not have it available. That, and it does
maintain the existing treatment of self-referential submodules.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 3/3] cat-file.c: add batch handling for submodules
2025-06-05 0:12 ` Victoria Dye
@ 2025-06-05 7:51 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2025-06-05 7:51 UTC (permalink / raw)
To: Victoria Dye; +Cc: Victoria Dye via GitGitGadget, git, gitster
On Wed, Jun 04, 2025 at 05:12:54PM -0700, Victoria Dye wrote:
> > OK. I read over the discussion from last year, which I think mostly
> > centered around this patch. I do still think in the long run it would be
> > nice for cat-file to produce what output it _can_ for a missing object
> > (e.g., the oid and mode).
>
> One way to handle that could be changing the message to something like:
>
> submodule SP <mode> SP <oid>
Hmm, yeah. That seemed weird to me at first because it doesn't
necessarily match what the caller asked for via batch-check. But really,
mode and oid are the only reasonable things we could report anyway[1].
And the mode is implicit in the word "submodule", so really there is
only the oid to report.
[1] For now, at least. If we ever finally unify all of the various
formatting code, then one might in theory be able to feed a refname
to cat-file and print information about the ref, or perhaps other
meta-information. But let's not worry about that hypothetical for
now.
> ...I suspect that'd be even less compatible with existing automation around
> 'cat-file' than just swapping out "submodule" for "missing", and users can
> theoretically infer that the mode is 160000 (S_IFGITLINK). That said, if at
> some point in the future we support submodules with a different mode, then
> an explicit value would be fairly useful.
>
> Happy to change it or keep it the same, I have no strong preference either
> way.
Right, that makes sense. I do wonder if:
<oid> missing submodule
might be friendlier to readers who are matching on /^[0-9a-f]+ missing/,
but now I am just guessing at a hypothetical program. So it may not be
worth going down that rabbit hole, and we can just go with what you
posted.
We can always worry about extending it later with an option to say "turn
placeholders for missing objects into empty strings" or similar.
I did come across one other interesting case while thinking about this,
though. When running:
git cat-file --batch-check='%(objectname) %(objectmode)'
we do not need to access the object at all! So why does a submodule
entry cause us to complain? The answer is that cat-file will (mostly for
historical reasons) confirm the existence of the object name that is fed
to it by calling oid_object_info(). The only exception is when we are
doing --batch-all-objects, since there we know we have the object,
because we found it by iterating the odb. And we optimize out the extra
call for that case (which makes a big difference if you're just printing
the object names).
But since we don't expect submodule entries to exist in the first place,
it might be reasonable to loosen that check. Something like this, though
I think it could benefit from some refactoring:
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 4b23fcecbd..bb52d9b673 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -304,8 +304,20 @@ struct expand_data {
* This flag will be true if the requested batch format and options
* don't require us to call oid_object_info, which can then be
* optimized out.
+ *
+ * The "submodule" variant is true if the format doesn't require it,
+ * but other options mean we'd usually continue to do so to check
+ * object existence. We can still omit the call for submodules in that
+ * case.
+ *
+ * This might be less confusing if we break skip_object_info down into
+ * two parts:
+ * - does the format require oid_object_info?
+ * - do the other options require checking existence?
*/
unsigned skip_object_info : 1;
+ unsigned skip_submodule_info : 1;
+
};
#define EXPAND_DATA_INIT { .mode = S_IFINVALID }
@@ -477,7 +489,8 @@ static void batch_object_write(const char *obj_name,
struct packed_git *pack,
off_t offset)
{
- if (!data->skip_object_info) {
+ if (!(data->skip_object_info ||
+ (data->skip_submodule_info && data->mode == S_IFGITLINK))) {
int ret;
if (use_mailmap ||
@@ -939,6 +952,12 @@ static int batch_objects(struct batch_options *opt)
strbuf_release(&output);
return 0;
+ } else {
+ struct object_info empty = OBJECT_INFO_INIT;
+
+ if (!memcmp(&data.info, &empty, sizeof(empty)) &&
+ opt->objects_filter.choice == LOFC_DISABLED)
+ data.skip_submodule_info = 1;
}
/*
I don't think that needs to be part of your series, though. We'd still
potentially need to handle the missing-submodule case for format
requests that actually look at the object, which would hit the "<oid>
submodule" case you're adding. So it could come later (or not at all),
and it's probably only worth pursuing if it would make life easier for
your intended caller.
-Peff
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-06-05 7:51 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-06-02 18:55 [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
2025-06-02 18:55 ` [PATCH 2/3] cat-file: add %(objectmode) atom Victoria Dye via GitGitGadget
2025-06-04 19:36 ` Jeff King
2025-06-02 18:55 ` [PATCH 3/3] cat-file.c: add batch handling for submodules Victoria Dye via GitGitGadget
2025-06-04 19:54 ` Jeff King
2025-06-05 0:12 ` Victoria Dye
2025-06-05 7:51 ` Jeff King
2025-06-04 14:43 ` [PATCH 0/3] cat-file: add %(objectmode) and submodule message to batch commands Junio C Hamano
2025-06-04 19:57 ` Jeff King
-- strict thread matches above, loose matches on Subject: below --
2024-03-11 18:55 [PATCH 0/3] cat-file: add %(objectmode) avoid verifying submodules' OIDs Johannes Schindelin via GitGitGadget
2024-03-11 18:56 ` [PATCH 1/3] t1006: update 'run_tests' to test generic object specifiers Victoria Dye via GitGitGadget
2024-03-11 21:54 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).