* [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option
@ 2015-04-29 12:50 karthik nayak
2015-04-29 12:52 ` [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type Karthik Nayak
` (4 more replies)
0 siblings, 5 replies; 17+ messages in thread
From: karthik nayak @ 2015-04-29 12:50 UTC (permalink / raw)
To: Git List; +Cc: Junio C Hamano, Eric Sunshine
The last iteration of the patch can be seen :
http://thread.gmane.org/gmane.comp.version-control.git/267213
Changes since last version:
sha1_file:
* eliminate "struct strbuf typename = STRBUF_INIT" in
"parse_sha1_header_extended()"
* make "unpack_sha1_header_to_strbuf()" work automagically. Now it
unpacks first 32 bytes and checks if the header is unpacked before
moving to a strbuf.
* remove unnecessary if condition for "strbuf_release()" .
cat-file:
* change the option name from "--literally" to "--allow-unkown-type".
* make the options mutually exclusive.
* clean up '-s' option.
t1006:
* use echo_without_newline() and added quotes.
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
@ 2015-04-29 12:52 ` Karthik Nayak
2015-04-29 14:49 ` Junio C Hamano
2015-04-29 12:52 ` [PATCH v9 2/5] cat-file: make the options mutually exclusive Karthik Nayak
` (3 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Karthik Nayak @ 2015-04-29 12:52 UTC (permalink / raw)
To: git; +Cc: gitster, sunshine, Karthik Nayak
Update sha1_loose_object_info() to optionally allow it to read
from a loose object file of unknown/bogus type; as the function
usually returns the type of the object it read in the form of enum
for known types, add an optional "typename" field to receive the
name of the type in textual form and a flag to indicate the reading
of a loose object file of unknown/bogus type.
Add parse_sha1_header_extended() which acts as a wrapper around
parse_sha1_header() allowing more information to be obtained.
Add unpack_sha1_header_to_strbuf() to unpack sha1 headers of
unknown/corrupt objects which have a unknown sha1 header size to
a strbuf structure. This was written by Junio C Hamano but tested
by me.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Hepled-by: Jeff King <peff@peff.net>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
cache.h | 2 +
sha1_file.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++-----------
2 files changed, 99 insertions(+), 22 deletions(-)
diff --git a/cache.h b/cache.h
index c47323e..088577d 100644
--- a/cache.h
+++ b/cache.h
@@ -881,6 +881,7 @@ extern int is_ntfs_dotgit(const char *name);
/* object replacement */
#define LOOKUP_REPLACE_OBJECT 1
+#define LOOKUP_UNKNOWN_OBJECT 2
extern void *read_sha1_file_extended(const unsigned char *sha1, enum object_type *type, unsigned long *size, unsigned flag);
static inline void *read_sha1_file(const unsigned char *sha1, enum object_type *type, unsigned long *size)
{
@@ -1349,6 +1350,7 @@ struct object_info {
unsigned long *sizep;
unsigned long *disk_sizep;
unsigned char *delta_base_sha1;
+ struct strbuf *typename;
/* Response */
enum {
diff --git a/sha1_file.c b/sha1_file.c
index 980ce6b..9a15634 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1564,6 +1564,33 @@ int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long ma
return git_inflate(stream, 0);
}
+static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map,
+ unsigned long mapsize, void *buffer,
+ unsigned long bufsiz, struct strbuf *header)
+{
+ unsigned char *cp;
+ int status;
+
+ status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz);
+
+ /*
+ * Check if entire header is unpacked in the first iteration.
+ */
+ if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
+ return 0;
+
+ strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
+ do {
+ status = git_inflate(stream, 0);
+ strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
+ if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
+ return 0;
+ stream->next_out = buffer;
+ stream->avail_out = bufsiz;
+ } while (status != Z_STREAM_END);
+ return -1;
+}
+
static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long size, const unsigned char *sha1)
{
int bytes = strlen(buffer) + 1;
@@ -1614,27 +1641,38 @@ static void *unpack_sha1_rest(git_zstream *stream, void *buffer, unsigned long s
* too permissive for what we want to check. So do an anal
* object header parse by hand.
*/
-int parse_sha1_header(const char *hdr, unsigned long *sizep)
+static int parse_sha1_header_extended(const char *hdr, struct object_info *oi,
+ unsigned int flags)
{
- char type[10];
- int i;
+ const char *type_buf = hdr;
unsigned long size;
+ int type, type_len = 0;
/*
- * The type can be at most ten bytes (including the
- * terminating '\0' that we add), and is followed by
+ * The type can be of any size but is followed by
* a space.
*/
- i = 0;
for (;;) {
char c = *hdr++;
if (c == ' ')
break;
- type[i++] = c;
- if (i >= sizeof(type))
- return -1;
+ type_len++;
}
- type[i] = 0;
+
+ type = type_from_string_gently(type_buf, type_len, 1);
+ if (oi->typename)
+ strbuf_add(oi->typename, type_buf, type_len);
+ /*
+ * Set type to 0 if its an unknown object and
+ * we're obtaining the type using '--allow-unkown-type'
+ * option.
+ */
+ if ((flags & LOOKUP_UNKNOWN_OBJECT) && (type == -1))
+ type = 0;
+ else if (type == -1)
+ die("invalid object type");
+ if (oi->typep)
+ *oi->typep = type;
/*
* The length must follow immediately, and be in canonical
@@ -1652,12 +1690,24 @@ int parse_sha1_header(const char *hdr, unsigned long *sizep)
size = size * 10 + c;
}
}
- *sizep = size;
+
+ if (oi->sizep)
+ *oi->sizep = size;
/*
* The length must be followed by a zero byte
*/
- return *hdr ? -1 : type_from_string(type);
+ return *hdr ? -1 : type;
+}
+
+int parse_sha1_header(const char *hdr, unsigned long *sizep)
+{
+ struct object_info oi;
+
+ oi.sizep = sizep;
+ oi.typename = NULL;
+ oi.typep = NULL;
+ return parse_sha1_header_extended(hdr, &oi, LOOKUP_REPLACE_OBJECT);
}
static void *unpack_sha1_file(void *map, unsigned long mapsize, enum object_type *type, unsigned long *size, const unsigned char *sha1)
@@ -2522,13 +2572,15 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
}
static int sha1_loose_object_info(const unsigned char *sha1,
- struct object_info *oi)
+ struct object_info *oi,
+ int flags)
{
- int status;
- unsigned long mapsize, size;
+ int status = 0;
+ unsigned long mapsize;
void *map;
git_zstream stream;
char hdr[32];
+ struct strbuf hdrbuf = STRBUF_INIT;
if (oi->delta_base_sha1)
hashclr(oi->delta_base_sha1);
@@ -2541,7 +2593,7 @@ static int sha1_loose_object_info(const unsigned char *sha1,
* return value implicitly indicates whether the
* object even exists.
*/
- if (!oi->typep && !oi->sizep) {
+ if (!oi->typep && !oi->typename && !oi->sizep) {
struct stat st;
if (stat_sha1_file(sha1, &st) < 0)
return -1;
@@ -2555,17 +2607,24 @@ static int sha1_loose_object_info(const unsigned char *sha1,
return -1;
if (oi->disk_sizep)
*oi->disk_sizep = mapsize;
- if (unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)) < 0)
+ if ((flags & LOOKUP_UNKNOWN_OBJECT)) {
+ if (unpack_sha1_header_to_strbuf(&stream, map, mapsize, hdr, sizeof(hdr), &hdrbuf) < 0)
+ status = error("unable to unpack %s header with --allow-unknown-type",
+ sha1_to_hex(sha1));
+ } else if (unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)) < 0)
status = error("unable to unpack %s header",
sha1_to_hex(sha1));
- else if ((status = parse_sha1_header(hdr, &size)) < 0)
+ if (hdrbuf.len) {
+ if ((status = parse_sha1_header_extended(hdrbuf.buf, oi, flags)) < 0)
+ status = error("unable to parse %s header with --allow-unknown-type",
+ sha1_to_hex(sha1));
+ } else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0)
status = error("unable to parse %s header", sha1_to_hex(sha1));
- else if (oi->sizep)
- *oi->sizep = size;
git_inflate_end(&stream);
munmap(map, mapsize);
- if (oi->typep)
+ if (status && oi->typep)
*oi->typep = status;
+ strbuf_release(&hdrbuf);
return 0;
}
@@ -2574,6 +2633,7 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
struct cached_object *co;
struct pack_entry e;
int rtype;
+ enum object_type real_type;
const unsigned char *real = lookup_replace_object_extended(sha1, flags);
co = find_cached_object(real);
@@ -2586,13 +2646,15 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
*(oi->disk_sizep) = 0;
if (oi->delta_base_sha1)
hashclr(oi->delta_base_sha1);
+ if (oi->typename)
+ strbuf_addstr(oi->typename, typename(co->type));
oi->whence = OI_CACHED;
return 0;
}
if (!find_pack_entry(real, &e)) {
/* Most likely it's a loose object. */
- if (!sha1_loose_object_info(real, oi)) {
+ if (!sha1_loose_object_info(real, oi, flags)) {
oi->whence = OI_LOOSE;
return 0;
}
@@ -2603,9 +2665,18 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
return -1;
}
+ /*
+ * packed_object_info() does not follow the delta chain to
+ * find out the real type, unless it is given oi->typep.
+ */
+ if (oi->typename && !oi->typep)
+ oi->typep = &real_type;
+
rtype = packed_object_info(e.p, e.offset, oi);
if (rtype < 0) {
mark_bad_packed_object(e.p, real);
+ if (oi->typep == &real_type)
+ oi->typep = NULL;
return sha1_object_info_extended(real, oi, 0);
} else if (in_delta_base_cache(e.p, e.offset)) {
oi->whence = OI_DBCACHED;
@@ -2616,6 +2687,10 @@ int sha1_object_info_extended(const unsigned char *sha1, struct object_info *oi,
oi->u.packed.is_delta = (rtype == OBJ_REF_DELTA ||
rtype == OBJ_OFS_DELTA);
}
+ if (oi->typename)
+ strbuf_addstr(oi->typename, typename(*oi->typep));
+ if (oi->typep == &real_type)
+ oi->typep = NULL;
return 0;
}
--
2.4.0.rc1.250.g565e85b
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v9 2/5] cat-file: make the options mutually exclusive
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
2015-04-29 12:52 ` [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type Karthik Nayak
@ 2015-04-29 12:52 ` Karthik Nayak
2015-04-29 12:53 ` [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option Karthik Nayak
` (2 subsequent siblings)
4 siblings, 0 replies; 17+ messages in thread
From: Karthik Nayak @ 2015-04-29 12:52 UTC (permalink / raw)
To: git; +Cc: gitster, sunshine, Karthik Nayak
Currently we only parse the options if 2 or 3 arguments are specified.
Update 'struct option options[]' to use OPT_CMDMODE rather than
OPT_SET_INT to allow only one mutually exclusive option and avoid the
need for checking number of arguments. This was written by Junio C Hamano,
tested by me.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
builtin/cat-file.c | 13 +++++--------
1 file changed, 5 insertions(+), 8 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index df99df4..53b5376 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -362,12 +362,12 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
const struct option options[] = {
OPT_GROUP(N_("<type> can be one of: blob, tree, commit, tag")),
- OPT_SET_INT('t', NULL, &opt, N_("show object type"), 't'),
- OPT_SET_INT('s', NULL, &opt, N_("show object size"), 's'),
- OPT_SET_INT('e', NULL, &opt,
+ OPT_CMDMODE('t', NULL, &opt, N_("show object type"), 't'),
+ OPT_CMDMODE('s', NULL, &opt, N_("show object size"), 's'),
+ OPT_CMDMODE('e', NULL, &opt,
N_("exit with zero when there's no error"), 'e'),
- OPT_SET_INT('p', NULL, &opt, N_("pretty-print object's content"), 'p'),
- OPT_SET_INT(0, "textconv", &opt,
+ OPT_CMDMODE('p', NULL, &opt, N_("pretty-print object's content"), 'p'),
+ OPT_CMDMODE(0, "textconv", &opt,
N_("for blob objects, run textconv on object's content"), 'c'),
{ OPTION_CALLBACK, 0, "batch", &batch, "format",
N_("show info and content of objects fed from the standard input"),
@@ -380,9 +380,6 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
git_config(git_cat_file_config, NULL);
- if (argc != 3 && argc != 2)
- usage_with_options(cat_file_usage, options);
-
argc = parse_options(argc, argv, prefix, options, cat_file_usage, 0);
if (opt) {
--
2.4.0.rc1.250.g565e85b
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
2015-04-29 12:52 ` [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type Karthik Nayak
2015-04-29 12:52 ` [PATCH v9 2/5] cat-file: make the options mutually exclusive Karthik Nayak
@ 2015-04-29 12:53 ` Karthik Nayak
2015-04-29 14:52 ` Junio C Hamano
2015-04-29 14:53 ` Phil Hord
2015-04-29 12:54 ` [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type Karthik Nayak
2015-04-29 12:56 ` [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option Karthik Nayak
4 siblings, 2 replies; 17+ messages in thread
From: Karthik Nayak @ 2015-04-29 12:53 UTC (permalink / raw)
To: git; +Cc: gitster, sunshine, Karthik Nayak
Currently 'git cat-file' throws an error while trying to
print the type or size of a broken/corrupt object. This is
because these objects are usually of unknown types.
Teach git cat-file a '--allow-unkown-type' option where it prints
the type or size of a broken/corrupt object without throwing
an error.
Modify '-t' and '-s' options to call sha1_object_info_extended()
directly to support the '--allow-unkown-type' option.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
builtin/cat-file.c | 38 ++++++++++++++++++++++++++------------
1 file changed, 26 insertions(+), 12 deletions(-)
diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 53b5376..299e2e5 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -9,13 +9,20 @@
#include "userdiff.h"
#include "streaming.h"
-static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
+static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
+ int unkown_type)
{
unsigned char sha1[20];
enum object_type type;
char *buf;
unsigned long size;
struct object_context obj_context;
+ struct object_info oi = {NULL};
+ struct strbuf sb = STRBUF_INIT;
+ unsigned flags = LOOKUP_REPLACE_OBJECT;
+
+ if (unkown_type)
+ flags |= LOOKUP_UNKNOWN_OBJECT;
if (get_sha1_with_context(obj_name, 0, sha1, &obj_context))
die("Not a valid object name %s", obj_name);
@@ -23,20 +30,22 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
buf = NULL;
switch (opt) {
case 't':
- type = sha1_object_info(sha1, NULL);
- if (type > 0) {
- printf("%s\n", typename(type));
+ oi.typename = &sb;
+ if (sha1_object_info_extended(sha1, &oi, flags) < 0)
+ die("git cat-file: could not get object info");
+ if (sb.len) {
+ printf("%s\n", sb.buf);
+ strbuf_release(&sb);
return 0;
}
break;
case 's':
- type = sha1_object_info(sha1, &size);
- if (type > 0) {
- printf("%lu\n", size);
- return 0;
- }
- break;
+ oi.sizep = &size;
+ if (sha1_object_info_extended(sha1, &oi, flags) < 0)
+ die("git cat-file: could not get object info");
+ printf("%lu\n", size);
+ return 0;
case 'e':
return !has_sha1_file(sha1);
@@ -323,7 +332,7 @@ static int batch_objects(struct batch_options *opt)
}
static const char * const cat_file_usage[] = {
- N_("git cat-file (-t | -s | -e | -p | <type> | --textconv) <object>"),
+ N_("git cat-file (-t [--allow-unkown-type]|-s [--allow-unkown-type]|-e|-p|<type>|--textconv) <object>"),
N_("git cat-file (--batch | --batch-check) < <list-of-objects>"),
NULL
};
@@ -359,6 +368,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
int opt = 0;
const char *exp_type = NULL, *obj_name = NULL;
struct batch_options batch = {0};
+ int unkown_type = 0;
const struct option options[] = {
OPT_GROUP(N_("<type> can be one of: blob, tree, commit, tag")),
@@ -369,6 +379,8 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
OPT_CMDMODE('p', NULL, &opt, N_("pretty-print object's content"), 'p'),
OPT_CMDMODE(0, "textconv", &opt,
N_("for blob objects, run textconv on object's content"), 'c'),
+ OPT_BOOL( 0, "allow-unkown-type", &unkown_type,
+ N_("allow -s and -t to work with broken/corrupt objects")),
{ OPTION_CALLBACK, 0, "batch", &batch, "format",
N_("show info and content of objects fed from the standard input"),
PARSE_OPT_OPTARG, batch_option_callback },
@@ -402,5 +414,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
if (batch.enabled)
return batch_objects(&batch);
- return cat_one_file(opt, exp_type, obj_name);
+ if (unkown_type && opt != 't' && opt != 's')
+ die("git cat-file --allow-unkown-type: use with -s or -t");
+ return cat_one_file(opt, exp_type, obj_name, unkown_type);
}
--
2.4.0.rc1.250.g565e85b
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
` (2 preceding siblings ...)
2015-04-29 12:53 ` [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option Karthik Nayak
@ 2015-04-29 12:54 ` Karthik Nayak
2015-04-29 21:16 ` Eric Sunshine
2015-04-29 12:56 ` [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option Karthik Nayak
4 siblings, 1 reply; 17+ messages in thread
From: Karthik Nayak @ 2015-04-29 12:54 UTC (permalink / raw)
To: git; +Cc: gitster, sunshine, Karthik Nayak
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
t/t1006-cat-file.sh | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index ab36b1e..8362019 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -47,6 +47,18 @@ $content"
test_cmp expect actual
'
+ test_expect_success "Type of $type is correct using --allow-unkown-type" '
+ echo $type >expect &&
+ git cat-file -t --allow-unkown-type $sha1 >actual &&
+ test_cmp expect actual
+ '
+
+ test_expect_success "Size of $type is correct using --allow-unkown-type" '
+ echo $size >expect &&
+ git cat-file -s --allow-unkown-type $sha1 >actual &&
+ test_cmp expect actual
+ '
+
test -z "$content" ||
test_expect_success "Content of $type is correct" '
maybe_remove_timestamp "$content" $no_ts >expect &&
@@ -296,4 +308,21 @@ test_expect_success '%(deltabase) reports packed delta bases' '
}
'
+bogus_type="bogus"
+bogus_content="bogus"
+bogus_size=$(strlen "$bogus_content")
+bogus_sha1=$(echo_without_newline "$bogus_content" | git hash-object -t $bogus_type --literally -w --stdin)
+
+test_expect_success "Type of broken object is correct" '
+ echo $bogus_type >expect &&
+ git cat-file -t --allow-unkown-type $bogus_sha1 >actual &&
+ test_cmp expect actual
+'
+
+test_expect_success "Size of broken object is correct" '
+ echo $bogus_size >expect &&
+ git cat-file -s --allow-unkown-type $bogus_sha1 >actual &&
+ test_cmp expect actual
+'
+
test_done
--
2.4.0.rc1.250.g565e85b
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option.
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
` (3 preceding siblings ...)
2015-04-29 12:54 ` [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type Karthik Nayak
@ 2015-04-29 12:56 ` Karthik Nayak
2015-04-29 21:13 ` Eric Sunshine
4 siblings, 1 reply; 17+ messages in thread
From: Karthik Nayak @ 2015-04-29 12:56 UTC (permalink / raw)
To: git; +Cc: gitster, sunshine, Karthik Nayak
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
---
Documentation/git-cat-file.txt | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
index f6a16f4..f6f6064 100644
--- a/Documentation/git-cat-file.txt
+++ b/Documentation/git-cat-file.txt
@@ -9,7 +9,7 @@ git-cat-file - Provide content or type and size information for repository objec
SYNOPSIS
--------
[verse]
-'git cat-file' (-t | -s | -e | -p | <type> | --textconv ) <object>
+'git cat-file' (-t [--allow-unkown-type]| -s [--allow-unkown-type]| -e | -p | <type> | --textconv ) <object>
'git cat-file' (--batch | --batch-check) < <list-of-objects>
DESCRIPTION
@@ -69,6 +69,9 @@ OPTIONS
not be combined with any other options or arguments. See the
section `BATCH OUTPUT` below for details.
+--allow-unkown-type::
+ Allow -s or -t to query broken/corrupt objects of unknown type.
+
OUTPUT
------
If '-t' is specified, one of the <type>.
--
2.4.0.rc1.250.g565e85b
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type
2015-04-29 12:52 ` [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type Karthik Nayak
@ 2015-04-29 14:49 ` Junio C Hamano
2015-04-29 17:50 ` karthik nayak
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-04-29 14:49 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, sunshine
Karthik Nayak <karthik.188@gmail.com> writes:
> Update sha1_loose_object_info() to optionally allow it to read
> from a loose object file of unknown/bogus type; as the function
> usually returns the type of the object it read in the form of enum
> for known types, add an optional "typename" field to receive the
> name of the type in textual form and a flag to indicate the reading
> of a loose object file of unknown/bogus type.
>
> Add parse_sha1_header_extended() which acts as a wrapper around
> parse_sha1_header() allowing more information to be obtained.
Thanks. This mostly looks good modulo a nit.
> diff --git a/sha1_file.c b/sha1_file.c
> index 980ce6b..9a15634 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1564,6 +1564,33 @@ int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long ma
> return git_inflate(stream, 0);
> }
>
> +static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map,
> + unsigned long mapsize, void *buffer,
> + unsigned long bufsiz, struct strbuf *header)
This function in this round looks somewhat tricky.
> +{
> + unsigned char *cp;
> + int status;
> +
> + status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz);
Here, we unpack into the caller-supplied buffer, without touching
the caller-supplied header strbuf.
> + /*
> + * Check if entire header is unpacked in the first iteration.
> + */
> + if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
> + return 0;
And we return the object header in the buffer without ever touching
the header strbuf. We expect that the caller would know that the
buffer it gave us would contain the full object header line.
> + strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
> + do {
> + status = git_inflate(stream, 0);
> + strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
> + if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
> + return 0;
However, here, we return the object header in the caller-supplied
header strbuf, while using the caller-supplied buffer as a scratch
area. We expect that the caller would know that the header strbuf
is what it has to use to find the object header.
Which is good in the sense that there is no unnecessary copies, but
the caller needs to be careful to do something like:
if (!unpack_to_strbuf(... buffer, sizeof(buffer), &header)) {
if (header.len)
object_header = header.buf;
else
object_header = buffer;
} else {
error("cannot unpack the object header");
}
It is a good trade-off between complexity and efficiency. The
complexity is isolated as the function is file-scope-static and it
is perfectly fine to force the callers to be extra careful.
But this suggests that the patch to add tests should try at least
two, preferably three, kinds of test input. A bogus type that needs
a header longer than the caller's fixed buffer, a bogus type whose
header would fit within the fixed buffer, and optionally a correct
type whose header should always fit within the fixed buffer.
> @@ -1614,27 +1641,38 @@ static void *unpack_sha1_rest(git_zstream *stream, void
> ...
> + /*
> + * Set type to 0 if its an unknown object and
> + * we're obtaining the type using '--allow-unkown-type'
> + * option.
> + */
> + if ((flags & LOOKUP_UNKNOWN_OBJECT) && (type == -1))
> + type = 0;
> + else if (type == -1)
> + die("invalid object type");
Would "type == -2" or any other negative value, if existed, mean
something different? I do not think so, and hence I would prefer
these two checks be done with "type < 0" instead.
> @@ -2522,13 +2572,15 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
> }
>
> static int sha1_loose_object_info(const unsigned char *sha1,
> - struct object_info *oi)
> + struct object_info *oi,
> + int flags)
> {
> - int status;
> - unsigned long mapsize, size;
> + int status = 0;
> + unsigned long mapsize;
> void *map;
> git_zstream stream;
> char hdr[32];
> + struct strbuf hdrbuf = STRBUF_INIT;
> ...
> + if ((flags & LOOKUP_UNKNOWN_OBJECT)) {
> + if (unpack_sha1_header_to_strbuf(&stream, map, mapsize, hdr, sizeof(hdr), &hdrbuf) < 0)
> + status = error("unable to unpack %s header with --allow-unknown-type",
> + sha1_to_hex(sha1));
> + } else if (unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)) < 0)
> status = error("unable to unpack %s header",
> sha1_to_hex(sha1));
> - else if ((status = parse_sha1_header(hdr, &size)) < 0)
> + if (hdrbuf.len) {
And here the caller is being careful, but the way it does is a bit
subtle. Even if it has a codepath that entirely bypasses a call to
unpack_to_strbuf(), it knows that the other codepath does not touch
the hdrbuf strbuf, which means that this block can be reached only
when it called unpack_to_strbuf(), the callee found a long object
header, and the unpacking succeeded. Otherwise, it will make a
call to parse_sha1_header_extended() using the fixed hdr[] in the
"else if" clause that corresponds to this "if".
> + if ((status = parse_sha1_header_extended(hdrbuf.buf, oi, flags)) < 0)
> + status = error("unable to parse %s header with --allow-unknown-type",
> + sha1_to_hex(sha1));
> + } else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0)
> status = error("unable to parse %s header", sha1_to_hex(sha1));
However, I wonder what happens when "status" is already set after
the call to unpack functions above. hdrbuf.len may be zero so this
"else if" part would be taken, and it then still try to feed the
hdr[] to parse_sha1_header_extended(), no?
In other words, shouldn't the flow be more like this?
if (lookup-unknown) {
if (unpack-to-strbuf)
status = error();
} else if (unpack-to-hdr)
status = error();
+ if (status < 0)
+ ; /* do not do anything */
+ else if (hdrbuf.len)
- if (hdrbuf.len)
parse(hdrbuf.buf);
else
parse(hdr);
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 12:53 ` [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option Karthik Nayak
@ 2015-04-29 14:52 ` Junio C Hamano
2015-04-29 17:43 ` karthik nayak
2015-04-29 14:53 ` Phil Hord
1 sibling, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-04-29 14:52 UTC (permalink / raw)
To: Karthik Nayak; +Cc: git, sunshine
Karthik Nayak <karthik.188@gmail.com> writes:
> Currently 'git cat-file' throws an error while trying to
> print the type or size of a broken/corrupt object. This is
> because these objects are usually of unknown types.
>
> Teach git cat-file a '--allow-unkown-type' option where it prints
> the type or size of a broken/corrupt object without throwing
> an error.
Drop "Currently" from the description of the problem you are
solving. We know that the problem you have to solve in the code is
current without being told. This comment applies to all patches.
> Modify '-t' and '-s' options to call sha1_object_info_extended()
> directly to support the '--allow-unkown-type' option.
>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Helped-by: Eric Sunshine <sunshine@sunshineco.com>
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
> builtin/cat-file.c | 38 ++++++++++++++++++++++++++------------
> 1 file changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 53b5376..299e2e5 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -9,13 +9,20 @@
> #include "userdiff.h"
> #include "streaming.h"
>
> -static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> +static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
> + int unkown_type)
> {
> unsigned char sha1[20];
> enum object_type type;
> char *buf;
> unsigned long size;
> struct object_context obj_context;
> + struct object_info oi = {NULL};
> + struct strbuf sb = STRBUF_INIT;
> + unsigned flags = LOOKUP_REPLACE_OBJECT;
> +
> + if (unkown_type)
> + flags |= LOOKUP_UNKNOWN_OBJECT;
>
> if (get_sha1_with_context(obj_name, 0, sha1, &obj_context))
> die("Not a valid object name %s", obj_name);
> @@ -23,20 +30,22 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
> buf = NULL;
> switch (opt) {
> case 't':
> - type = sha1_object_info(sha1, NULL);
> - if (type > 0) {
> - printf("%s\n", typename(type));
> + oi.typename = &sb;
> + if (sha1_object_info_extended(sha1, &oi, flags) < 0)
> + die("git cat-file: could not get object info");
> + if (sb.len) {
> + printf("%s\n", sb.buf);
> + strbuf_release(&sb);
> return 0;
> }
> break;
>
> case 's':
> - type = sha1_object_info(sha1, &size);
> - if (type > 0) {
> - printf("%lu\n", size);
> - return 0;
> - }
> - break;
> + oi.sizep = &size;
> + if (sha1_object_info_extended(sha1, &oi, flags) < 0)
> + die("git cat-file: could not get object info");
> + printf("%lu\n", size);
> + return 0;
>
> case 'e':
> return !has_sha1_file(sha1);
> @@ -323,7 +332,7 @@ static int batch_objects(struct batch_options *opt)
> }
>
> static const char * const cat_file_usage[] = {
> - N_("git cat-file (-t | -s | -e | -p | <type> | --textconv) <object>"),
> + N_("git cat-file (-t [--allow-unkown-type]|-s [--allow-unkown-type]|-e|-p|<type>|--textconv) <object>"),
> N_("git cat-file (--batch | --batch-check) < <list-of-objects>"),
> NULL
> };
> @@ -359,6 +368,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> int opt = 0;
> const char *exp_type = NULL, *obj_name = NULL;
> struct batch_options batch = {0};
> + int unkown_type = 0;
>
> const struct option options[] = {
> OPT_GROUP(N_("<type> can be one of: blob, tree, commit, tag")),
> @@ -369,6 +379,8 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> OPT_CMDMODE('p', NULL, &opt, N_("pretty-print object's content"), 'p'),
> OPT_CMDMODE(0, "textconv", &opt,
> N_("for blob objects, run textconv on object's content"), 'c'),
> + OPT_BOOL( 0, "allow-unkown-type", &unkown_type,
> + N_("allow -s and -t to work with broken/corrupt objects")),
> { OPTION_CALLBACK, 0, "batch", &batch, "format",
> N_("show info and content of objects fed from the standard input"),
> PARSE_OPT_OPTARG, batch_option_callback },
> @@ -402,5 +414,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
> if (batch.enabled)
> return batch_objects(&batch);
>
> - return cat_one_file(opt, exp_type, obj_name);
> + if (unkown_type && opt != 't' && opt != 's')
> + die("git cat-file --allow-unkown-type: use with -s or -t");
> + return cat_one_file(opt, exp_type, obj_name, unkown_type);
> }
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 12:53 ` [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option Karthik Nayak
2015-04-29 14:52 ` Junio C Hamano
@ 2015-04-29 14:53 ` Phil Hord
2015-04-29 17:44 ` karthik nayak
1 sibling, 1 reply; 17+ messages in thread
From: Phil Hord @ 2015-04-29 14:53 UTC (permalink / raw)
To: Karthik Nayak, Git; +Cc: Junio C Hamano, Eric Sunshine
On Wed, Apr 29, 2015 at 9:01 AM Karthik Nayak <karthik.188@gmail.com> wrote:
>
> Currently 'git cat-file' throws an error while trying to
> print the type or size of a broken/corrupt object. This is
> because these objects are usually of unknown types.
>
> Teach git cat-file a '--allow-unkown-type' option where it prints
> the type or size of a broken/corrupt object without throwing
> an error.
In this entire series, replace all 'unkown' with 'unknown' in both the
commit messages and the code ("unknown" is misspelled most of the
time). I notice the switch name itself is misspelled, but also
variable names such as 'unkown_type' in this patch.
Respectfully, because I know English is a challenging beast sometimes,
and spelling is difficult even for many native speakers...
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 14:52 ` Junio C Hamano
@ 2015-04-29 17:43 ` karthik nayak
0 siblings, 0 replies; 17+ messages in thread
From: karthik nayak @ 2015-04-29 17:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, sunshine
On 04/29/2015 08:22 PM, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> Currently 'git cat-file' throws an error while trying to
>> print the type or size of a broken/corrupt object. This is
>> because these objects are usually of unknown types.
>>
>> Teach git cat-file a '--allow-unkown-type' option where it prints
>> the type or size of a broken/corrupt object without throwing
>> an error.
>
> Drop "Currently" from the description of the problem you are
> solving. We know that the problem you have to solve in the code is
> current without being told. This comment applies to all patches.
Noted. Thanks!
>
>> Modify '-t' and '-s' options to call sha1_object_info_extended()
>> directly to support the '--allow-unkown-type' option.
>>
>> Helped-by: Junio C Hamano <gitster@pobox.com>
>> Helped-by: Eric Sunshine <sunshine@sunshineco.com>
>> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
>> ---
>> builtin/cat-file.c | 38 ++++++++++++++++++++++++++------------
>> 1 file changed, 26 insertions(+), 12 deletions(-)
>>
>> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
>> index 53b5376..299e2e5 100644
>> --- a/builtin/cat-file.c
>> +++ b/builtin/cat-file.c
>> @@ -9,13 +9,20 @@
>> #include "userdiff.h"
>> #include "streaming.h"
>>
>> -static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>> +static int cat_one_file(int opt, const char *exp_type, const char *obj_name,
>> + int unkown_type)
>> {
>> unsigned char sha1[20];
>> enum object_type type;
>> char *buf;
>> unsigned long size;
>> struct object_context obj_context;
>> + struct object_info oi = {NULL};
>> + struct strbuf sb = STRBUF_INIT;
>> + unsigned flags = LOOKUP_REPLACE_OBJECT;
>> +
>> + if (unkown_type)
>> + flags |= LOOKUP_UNKNOWN_OBJECT;
>>
>> if (get_sha1_with_context(obj_name, 0, sha1, &obj_context))
>> die("Not a valid object name %s", obj_name);
>> @@ -23,20 +30,22 @@ static int cat_one_file(int opt, const char *exp_type, const char *obj_name)
>> buf = NULL;
>> switch (opt) {
>> case 't':
>> - type = sha1_object_info(sha1, NULL);
>> - if (type > 0) {
>> - printf("%s\n", typename(type));
>> + oi.typename = &sb;
>> + if (sha1_object_info_extended(sha1, &oi, flags) < 0)
>> + die("git cat-file: could not get object info");
>> + if (sb.len) {
>> + printf("%s\n", sb.buf);
>> + strbuf_release(&sb);
>> return 0;
>> }
>> break;
>>
>> case 's':
>> - type = sha1_object_info(sha1, &size);
>> - if (type > 0) {
>> - printf("%lu\n", size);
>> - return 0;
>> - }
>> - break;
>> + oi.sizep = &size;
>> + if (sha1_object_info_extended(sha1, &oi, flags) < 0)
>> + die("git cat-file: could not get object info");
>> + printf("%lu\n", size);
>> + return 0;
>>
>> case 'e':
>> return !has_sha1_file(sha1);
>> @@ -323,7 +332,7 @@ static int batch_objects(struct batch_options *opt)
>> }
>>
>> static const char * const cat_file_usage[] = {
>> - N_("git cat-file (-t | -s | -e | -p | <type> | --textconv) <object>"),
>> + N_("git cat-file (-t [--allow-unkown-type]|-s [--allow-unkown-type]|-e|-p|<type>|--textconv) <object>"),
>> N_("git cat-file (--batch | --batch-check) < <list-of-objects>"),
>> NULL
>> };
>> @@ -359,6 +368,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
>> int opt = 0;
>> const char *exp_type = NULL, *obj_name = NULL;
>> struct batch_options batch = {0};
>> + int unkown_type = 0;
>>
>> const struct option options[] = {
>> OPT_GROUP(N_("<type> can be one of: blob, tree, commit, tag")),
>> @@ -369,6 +379,8 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
>> OPT_CMDMODE('p', NULL, &opt, N_("pretty-print object's content"), 'p'),
>> OPT_CMDMODE(0, "textconv", &opt,
>> N_("for blob objects, run textconv on object's content"), 'c'),
>> + OPT_BOOL( 0, "allow-unkown-type", &unkown_type,
>> + N_("allow -s and -t to work with broken/corrupt objects")),
>> { OPTION_CALLBACK, 0, "batch", &batch, "format",
>> N_("show info and content of objects fed from the standard input"),
>> PARSE_OPT_OPTARG, batch_option_callback },
>> @@ -402,5 +414,7 @@ int cmd_cat_file(int argc, const char **argv, const char *prefix)
>> if (batch.enabled)
>> return batch_objects(&batch);
>>
>> - return cat_one_file(opt, exp_type, obj_name);
>> + if (unkown_type && opt != 't' && opt != 's')
>> + die("git cat-file --allow-unkown-type: use with -s or -t");
>> + return cat_one_file(opt, exp_type, obj_name, unkown_type);
>> }
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 14:53 ` Phil Hord
@ 2015-04-29 17:44 ` karthik nayak
2015-04-29 19:38 ` Junio C Hamano
0 siblings, 1 reply; 17+ messages in thread
From: karthik nayak @ 2015-04-29 17:44 UTC (permalink / raw)
To: Phil Hord; +Cc: Junio C Hamano, Eric Sunshine, Git
On 04/29/2015 08:23 PM, Phil Hord wrote:
> On Wed, Apr 29, 2015 at 9:01 AM Karthik Nayak <karthik.188@gmail.com>
> wrote:
> >
> > Currently 'git cat-file' throws an error while trying to
> > print the type or size of a broken/corrupt object. This is
> > because these objects are usually of unknown types.
> >
> > Teach git cat-file a '--allow-unkown-type' option where it prints
> > the type or size of a broken/corrupt object without throwing
> > an error.
>
> In this entire series, replace all 'unkown' with 'unknown' in both the
> commit messages and the code ("unknown" is misspelled most of the
> time). I notice the switch name itself is misspelled, but also
> variable names such as 'unkown_type' in this patch.
>
> Respectfully, because I know English is a challenging beast sometimes,
> and spelling is difficult even for many native speakers...
>
Thanks for that, Yes it does get a bit tricky with spellings, will find
an editor with spellcheck and avoid nano :D
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type
2015-04-29 14:49 ` Junio C Hamano
@ 2015-04-29 17:50 ` karthik nayak
2015-04-29 19:35 ` Junio C Hamano
0 siblings, 1 reply; 17+ messages in thread
From: karthik nayak @ 2015-04-29 17:50 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, sunshine
On 04/29/2015 08:19 PM, Junio C Hamano wrote:
> Karthik Nayak <karthik.188@gmail.com> writes:
>
>> Update sha1_loose_object_info() to optionally allow it to read
>> from a loose object file of unknown/bogus type; as the function
>> usually returns the type of the object it read in the form of enum
>> for known types, add an optional "typename" field to receive the
>> name of the type in textual form and a flag to indicate the reading
>> of a loose object file of unknown/bogus type.
>>
>> Add parse_sha1_header_extended() which acts as a wrapper around
>> parse_sha1_header() allowing more information to be obtained.
>
> Thanks. This mostly looks good modulo a nit.
Sorry didn't get what you meant by "modulo a nit.".
>
>> diff --git a/sha1_file.c b/sha1_file.c
>> index 980ce6b..9a15634 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -1564,6 +1564,33 @@ int unpack_sha1_header(git_zstream *stream, unsigned char *map, unsigned long ma
>> return git_inflate(stream, 0);
>> }
>>
>> +static int unpack_sha1_header_to_strbuf(git_zstream *stream, unsigned char *map,
>> + unsigned long mapsize, void *buffer,
>> + unsigned long bufsiz, struct strbuf *header)
>
> This function in this round looks somewhat tricky.
>
>> +{
>> + unsigned char *cp;
>> + int status;
>> +
>> + status = unpack_sha1_header(stream, map, mapsize, buffer, bufsiz);
>
> Here, we unpack into the caller-supplied buffer, without touching
> the caller-supplied header strbuf.
>
>> + /*
>> + * Check if entire header is unpacked in the first iteration.
>> + */
>> + if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
>> + return 0;
>
> And we return the object header in the buffer without ever touching
> the header strbuf. We expect that the caller would know that the
> buffer it gave us would contain the full object header line.
>
>> + strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
>> + do {
>> + status = git_inflate(stream, 0);
>> + strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer);
>> + if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer))
>> + return 0;
>
> However, here, we return the object header in the caller-supplied
> header strbuf, while using the caller-supplied buffer as a scratch
> area. We expect that the caller would know that the header strbuf
> is what it has to use to find the object header.
>
> Which is good in the sense that there is no unnecessary copies, but
> the caller needs to be careful to do something like:
>
> if (!unpack_to_strbuf(... buffer, sizeof(buffer), &header)) {
> if (header.len)
> object_header = header.buf;
> else
> object_header = buffer;
> } else {
> error("cannot unpack the object header");
> }
>
> It is a good trade-off between complexity and efficiency. The
> complexity is isolated as the function is file-scope-static and it
> is perfectly fine to force the callers to be extra careful.
>
> But this suggests that the patch to add tests should try at least
> two, preferably three, kinds of test input. A bogus type that needs
> a header longer than the caller's fixed buffer, a bogus type whose
> header would fit within the fixed buffer, and optionally a correct
> type whose header should always fit within the fixed buffer.
Yes it is a tradeoff, and it is complex as in the user has to check the
strbuf provided to see if its been used. But this like you said I guess
its a good tradeoff.
About the three tests, My patch checks "a bogus type whose header would
fit within the fixed buffer" and "correct type whose header should
always fit within the fixed buffer" but will write a test to check "A
bogus type that needs a header longer than the caller's fixed buffer"
>
>> @@ -1614,27 +1641,38 @@ static void *unpack_sha1_rest(git_zstream *stream, void
>> ...
>> + /*
>> + * Set type to 0 if its an unknown object and
>> + * we're obtaining the type using '--allow-unkown-type'
>> + * option.
>> + */
>> + if ((flags & LOOKUP_UNKNOWN_OBJECT) && (type == -1))
>> + type = 0;
>> + else if (type == -1)
>> + die("invalid object type");
>
> Would "type == -2" or any other negative value, if existed, mean
> something different? I do not think so, and hence I would prefer
> these two checks be done with "type < 0" instead.
Thanks will change this.
>
>> @@ -2522,13 +2572,15 @@ struct packed_git *find_sha1_pack(const unsigned char *sha1,
>> }
>>
>> static int sha1_loose_object_info(const unsigned char *sha1,
>> - struct object_info *oi)
>> + struct object_info *oi,
>> + int flags)
>> {
>> - int status;
>> - unsigned long mapsize, size;
>> + int status = 0;
>> + unsigned long mapsize;
>> void *map;
>> git_zstream stream;
>> char hdr[32];
>> + struct strbuf hdrbuf = STRBUF_INIT;
>> ...
>> + if ((flags & LOOKUP_UNKNOWN_OBJECT)) {
>> + if (unpack_sha1_header_to_strbuf(&stream, map, mapsize, hdr, sizeof(hdr), &hdrbuf) < 0)
>> + status = error("unable to unpack %s header with --allow-unknown-type",
>> + sha1_to_hex(sha1));
>> + } else if (unpack_sha1_header(&stream, map, mapsize, hdr, sizeof(hdr)) < 0)
>> status = error("unable to unpack %s header",
>> sha1_to_hex(sha1));
>> - else if ((status = parse_sha1_header(hdr, &size)) < 0)
>> + if (hdrbuf.len) {
>
> And here the caller is being careful, but the way it does is a bit
> subtle. Even if it has a codepath that entirely bypasses a call to
> unpack_to_strbuf(), it knows that the other codepath does not touch
> the hdrbuf strbuf, which means that this block can be reached only
> when it called unpack_to_strbuf(), the callee found a long object
> header, and the unpacking succeeded. Otherwise, it will make a
> call to parse_sha1_header_extended() using the fixed hdr[] in the
> "else if" clause that corresponds to this "if".
>
>> + if ((status = parse_sha1_header_extended(hdrbuf.buf, oi, flags)) < 0)
>> + status = error("unable to parse %s header with --allow-unknown-type",
>> + sha1_to_hex(sha1));
>> + } else if ((status = parse_sha1_header_extended(hdr, oi, flags)) < 0)
>> status = error("unable to parse %s header", sha1_to_hex(sha1));
>
> However, I wonder what happens when "status" is already set after
> the call to unpack functions above. hdrbuf.len may be zero so this
> "else if" part would be taken, and it then still try to feed the
> hdr[] to parse_sha1_header_extended(), no?
>
> In other words, shouldn't the flow be more like this?
>
> if (lookup-unknown) {
> if (unpack-to-strbuf)
> status = error();
> } else if (unpack-to-hdr)
> status = error();
>
> + if (status < 0)
> + ; /* do not do anything */
> + else if (hdrbuf.len)
> - if (hdrbuf.len)
> parse(hdrbuf.buf);
> else
> parse(hdr);
>
Yes you're right here, you had mentioned it before too, I missed it
somehow. Thanks
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type
2015-04-29 17:50 ` karthik nayak
@ 2015-04-29 19:35 ` Junio C Hamano
2015-04-30 4:40 ` karthik nayak
0 siblings, 1 reply; 17+ messages in thread
From: Junio C Hamano @ 2015-04-29 19:35 UTC (permalink / raw)
To: karthik nayak; +Cc: git, sunshine
karthik nayak <karthik.188@gmail.com> writes:
> On 04/29/2015 08:19 PM, Junio C Hamano wrote:
>> Karthik Nayak <karthik.188@gmail.com> writes:
>>
>>> Update sha1_loose_object_info() to optionally allow it to read
>>> from a loose object file of unknown/bogus type; as the function
>>> usually returns the type of the object it read in the form of enum
>>> for known types, add an optional "typename" field to receive the
>>> name of the type in textual form and a flag to indicate the reading
>>> of a loose object file of unknown/bogus type.
>>>
>>> Add parse_sha1_header_extended() which acts as a wrapper around
>>> parse_sha1_header() allowing more information to be obtained.
>>
>> Thanks. This mostly looks good modulo a nit.
>
> Sorry didn't get what you meant by "modulo a nit.".
"nit" as in "Nit-pick"; a small imperfection that may need to be
corrected (such as the "what if we saw failure earlier and 'status'
already had a value?" issue).
>> It is a good trade-off between complexity and efficiency. The
>> complexity is isolated as the function is file-scope-static and it
>> is perfectly fine to force the callers to be extra careful.
>>
>> But this suggests that the patch to add tests should try at least
>> two, preferably three, kinds of test input. A bogus type that needs
>> a header longer than the caller's fixed buffer, a bogus type whose
>> header would fit within the fixed buffer, and optionally a correct
>> type whose header should always fit within the fixed buffer.
>
> Yes it is a tradeoff, and it is complex as in the user has to check
> the strbuf provided to see if its been used. But this like you said I
> guess its a good tradeoff.
> About the three tests, My patch checks "a bogus type whose header
> would fit within the fixed buffer" and "correct type whose header
> should always fit within the fixed buffer" but will write a test to
> check "A bogus type that needs a header longer than the caller's fixed
> buffer"
Yup. Please do so; that would make the test coverage more complete.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option
2015-04-29 17:44 ` karthik nayak
@ 2015-04-29 19:38 ` Junio C Hamano
0 siblings, 0 replies; 17+ messages in thread
From: Junio C Hamano @ 2015-04-29 19:38 UTC (permalink / raw)
To: karthik nayak; +Cc: Phil Hord, Eric Sunshine, Git
karthik nayak <karthik.188@gmail.com> writes:
>> Respectfully, because I know English is a challenging beast sometimes,
>> and spelling is difficult even for many native speakers...
>>
> Thanks for that, Yes it does get a bit tricky with spellings, will
> find an editor with spellcheck and avoid nano :D
;-).
Reminds me of a pecurilar habit my old mentor had that I never
picked up, which is to run "spell" on the code, not just comments
and log messages, to catch misnamed variables.
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option.
2015-04-29 12:56 ` [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option Karthik Nayak
@ 2015-04-29 21:13 ` Eric Sunshine
0 siblings, 0 replies; 17+ messages in thread
From: Eric Sunshine @ 2015-04-29 21:13 UTC (permalink / raw)
To: Karthik Nayak; +Cc: Git List, Junio C Hamano
On Wed, Apr 29, 2015 at 8:56 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> cat-file: add documentation for '--allow-unkown-type' option.
Drop the end-of-line period.
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
It's not clear why this change is done separately from patch 3/5
(cat-file: teach cat-file a '--allow-unknown-type' option). Given how
small this patch is and considering how closely related it is to the
actual introduction of the --allow-unknown-type option, it might make
sense to fold it into 3/5.
> ---
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index f6a16f4..f6f6064 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -9,7 +9,7 @@ git-cat-file - Provide content or type and size information for repository objec
> SYNOPSIS
> --------
> [verse]
> -'git cat-file' (-t | -s | -e | -p | <type> | --textconv ) <object>
> +'git cat-file' (-t [--allow-unkown-type]| -s [--allow-unkown-type]| -e | -p | <type> | --textconv ) <object>
> 'git cat-file' (--batch | --batch-check) < <list-of-objects>
>
> DESCRIPTION
> @@ -69,6 +69,9 @@ OPTIONS
> not be combined with any other options or arguments. See the
> section `BATCH OUTPUT` below for details.
>
> +--allow-unkown-type::
> + Allow -s or -t to query broken/corrupt objects of unknown type.
> +
> OUTPUT
> ------
> If '-t' is specified, one of the <type>.
> --
> 2.4.0.rc1.250.g565e85b
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type
2015-04-29 12:54 ` [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type Karthik Nayak
@ 2015-04-29 21:16 ` Eric Sunshine
0 siblings, 0 replies; 17+ messages in thread
From: Eric Sunshine @ 2015-04-29 21:16 UTC (permalink / raw)
To: Karthik Nayak; +Cc: Git List, Junio C Hamano
On Wed, Apr 29, 2015 at 8:54 AM, Karthik Nayak <karthik.188@gmail.com> wrote:
> Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
> ---
> diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
> index ab36b1e..8362019 100755
> --- a/t/t1006-cat-file.sh
> +++ b/t/t1006-cat-file.sh
> @@ -47,6 +47,18 @@ $content"
> test_cmp expect actual
> '
>
> + test_expect_success "Type of $type is correct using --allow-unkown-type" '
> + echo $type >expect &&
> + git cat-file -t --allow-unkown-type $sha1 >actual &&
Somehow the indentation got botched here and in the remaining tests
you added. This issue is new since v8.
> + test_cmp expect actual
> + '
> +
> + test_expect_success "Size of $type is correct using --allow-unkown-type" '
> + echo $size >expect &&
> + git cat-file -s --allow-unkown-type $sha1 >actual &&
> + test_cmp expect actual
> + '
> +
> test -z "$content" ||
> test_expect_success "Content of $type is correct" '
> maybe_remove_timestamp "$content" $no_ts >expect &&
> @@ -296,4 +308,21 @@ test_expect_success '%(deltabase) reports packed delta bases' '
> }
> '
>
> +bogus_type="bogus"
> +bogus_content="bogus"
> +bogus_size=$(strlen "$bogus_content")
> +bogus_sha1=$(echo_without_newline "$bogus_content" | git hash-object -t $bogus_type --literally -w --stdin)
> +
> +test_expect_success "Type of broken object is correct" '
> + echo $bogus_type >expect &&
> + git cat-file -t --allow-unkown-type $bogus_sha1 >actual &&
> + test_cmp expect actual
> +'
> +
> +test_expect_success "Size of broken object is correct" '
> + echo $bogus_size >expect &&
> + git cat-file -s --allow-unkown-type $bogus_sha1 >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
> --
> 2.4.0.rc1.250.g565e85b
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type
2015-04-29 19:35 ` Junio C Hamano
@ 2015-04-30 4:40 ` karthik nayak
0 siblings, 0 replies; 17+ messages in thread
From: karthik nayak @ 2015-04-30 4:40 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, sunshine
On 04/30/2015 01:05 AM, Junio C Hamano wrote:
> karthik nayak <karthik.188@gmail.com> writes:
>
> > On 04/29/2015 08:19 PM, Junio C Hamano wrote:
> >> Karthik Nayak <karthik.188@gmail.com> writes:
> >>
> >>> Update sha1_loose_object_info() to optionally allow it to read
> >>> from a loose object file of unknown/bogus type; as the function
> >>> usually returns the type of the object it read in the form of enum
> >>> for known types, add an optional "typename" field to receive the
> >>> name of the type in textual form and a flag to indicate the reading
> >>> of a loose object file of unknown/bogus type.
> >>>
> >>> Add parse_sha1_header_extended() which acts as a wrapper around
> >>> parse_sha1_header() allowing more information to be obtained.
> >>
> >> Thanks. This mostly looks good modulo a nit.
> >
> > Sorry didn't get what you meant by "modulo a nit.".
>
> "nit" as in "Nit-pick"; a small imperfection that may need to be
> corrected (such as the "what if we saw failure earlier and 'status'
> already had a value?" issue).
Thanks for clearing that out.
>
> >> It is a good trade-off between complexity and efficiency. The
> >> complexity is isolated as the function is file-scope-static and it
> >> is perfectly fine to force the callers to be extra careful.
> >>
> >> But this suggests that the patch to add tests should try at least
> >> two, preferably three, kinds of test input. A bogus type that needs
> >> a header longer than the caller's fixed buffer, a bogus type whose
> >> header would fit within the fixed buffer, and optionally a correct
> >> type whose header should always fit within the fixed buffer.
> >
> > Yes it is a tradeoff, and it is complex as in the user has to check
> > the strbuf provided to see if its been used. But this like you said I
> > guess its a good tradeoff.
> > About the three tests, My patch checks "a bogus type whose header
> > would fit within the fixed buffer" and "correct type whose header
> > should always fit within the fixed buffer" but will write a test to
> > check "A bogus type that needs a header longer than the caller's fixed
> > buffer"
>
> Yup. Please do so; that would make the test coverage more complete.
>
Yup will do :)
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2015-04-30 4:40 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-29 12:50 [PATCH v9 0/5] cat-file: teach cat-file a '--allow-unkown-type' option karthik nayak
2015-04-29 12:52 ` [PATCH v9 1/5] sha1_file: support reading from a loose object of unknown type Karthik Nayak
2015-04-29 14:49 ` Junio C Hamano
2015-04-29 17:50 ` karthik nayak
2015-04-29 19:35 ` Junio C Hamano
2015-04-30 4:40 ` karthik nayak
2015-04-29 12:52 ` [PATCH v9 2/5] cat-file: make the options mutually exclusive Karthik Nayak
2015-04-29 12:53 ` [PATCH v9 3/5] cat-file: teach cat-file a '--allow-unknown-type' option Karthik Nayak
2015-04-29 14:52 ` Junio C Hamano
2015-04-29 17:43 ` karthik nayak
2015-04-29 14:53 ` Phil Hord
2015-04-29 17:44 ` karthik nayak
2015-04-29 19:38 ` Junio C Hamano
2015-04-29 12:54 ` [PATCH v9 5/5] t1006: add tests for git cat-file --allow-unkown-type Karthik Nayak
2015-04-29 21:16 ` Eric Sunshine
2015-04-29 12:56 ` [PATCH v9 4/5] cat-file: add documentation for '--allow-unkown-type' option Karthik Nayak
2015-04-29 21:13 ` Eric Sunshine
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).