* [PATCH] type_from_string_gently: make sure length matches
@ 2015-04-17 14:52 Jeff King
2015-04-17 20:54 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2015-04-17 14:52 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin, Karthik Nayak, Junio C Hamano
When commit fe8e3b7 refactored type_from_string to allow
input that was not NUL-terminated, it switched to using
strncmp instead of strcmp. But this means we check only the
first "len" bytes of the strings, and ignore any remaining
bytes in the object_type_string. We should make sure that it
is also "len" bytes, or else we would accept "comm" as
"commit", and so forth.
Signed-off-by: Jeff King <peff@peff.net>
---
Since the strings we are matching are literals, we could also record
their sizes in the object_type_strings array and check the length first
before even calling strncmp. I doubt this is a performance hot-spot,
though.
You could also potentially just use strlen(object_type_strings[i]), but
I'm not sure if compilers will optimize out the strlen in this case,
since it is in a loop.
object.c | 3 ++-
t/t1007-hash-object.sh | 8 ++++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/object.c b/object.c
index 23d6c96..980ac5f 100644
--- a/object.c
+++ b/object.c
@@ -41,7 +41,8 @@ int type_from_string_gently(const char *str, ssize_t len, int gentle)
len = strlen(str);
for (i = 1; i < ARRAY_SIZE(object_type_strings); i++)
- if (!strncmp(str, object_type_strings[i], len))
+ if (!strncmp(str, object_type_strings[i], len) &&
+ object_type_strings[i][len] == '\0')
return i;
if (gentle)
diff --git a/t/t1007-hash-object.sh b/t/t1007-hash-object.sh
index f83df8e..ebb3a69 100755
--- a/t/t1007-hash-object.sh
+++ b/t/t1007-hash-object.sh
@@ -201,4 +201,12 @@ test_expect_success 'corrupt tag' '
test_must_fail git hash-object -t tag --stdin </dev/null
'
+test_expect_success 'hash-object complains about bogus type name' '
+ test_must_fail git hash-object -t bogus --stdin </dev/null
+'
+
+test_expect_success 'hash-object complains about truncated type name' '
+ test_must_fail git hash-object -t bl --stdin </dev/null
+'
+
test_done
--
2.4.0.rc2.384.g7297a4a
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] type_from_string_gently: make sure length matches
2015-04-17 14:52 [PATCH] type_from_string_gently: make sure length matches Jeff King
@ 2015-04-17 20:54 ` Junio C Hamano
2015-04-17 21:07 ` Jeff King
0 siblings, 1 reply; 4+ messages in thread
From: Junio C Hamano @ 2015-04-17 20:54 UTC (permalink / raw)
To: Jeff King; +Cc: git, Johannes Schindelin, Karthik Nayak
Jeff King <peff@peff.net> writes:
> When commit fe8e3b7 refactored type_from_string to allow
> input that was not NUL-terminated, it switched to using
> strncmp instead of strcmp. But this means we check only the
> first "len" bytes of the strings, and ignore any remaining
> bytes in the object_type_string. We should make sure that it
> is also "len" bytes, or else we would accept "comm" as
> "commit", and so forth.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> Since the strings we are matching are literals, we could also record
> their sizes in the object_type_strings array and check the length first
> before even calling strncmp. I doubt this is a performance hot-spot,
> though.
>
> You could also potentially just use strlen(object_type_strings[i]), but
> I'm not sure if compilers will optimize out the strlen in this case,
> since it is in a loop.
That thought crossed my mind while reading your patch. It could
even make it go faster if we made object_type_strings into an array
of counted strings (i.e. "struct { const char *str; int len; }")
and then took advantage of the fact that we have lengths of both.
object.c | 24 ++++++++++++++----------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/object.c b/object.c
index aedac24..51584ea 100644
--- a/object.c
+++ b/object.c
@@ -18,19 +18,22 @@ struct object *get_indexed_object(unsigned int idx)
return obj_hash[idx];
}
-static const char *object_type_strings[] = {
- NULL, /* OBJ_NONE = 0 */
- "commit", /* OBJ_COMMIT = 1 */
- "tree", /* OBJ_TREE = 2 */
- "blob", /* OBJ_BLOB = 3 */
- "tag", /* OBJ_TAG = 4 */
+static struct {
+ const char *str;
+ int len;
+} object_type_name[] = {
+ { NULL, 0 }, /* OBJ_NONE = 0 */
+ { "commit", 6 }, /* OBJ_COMMIT = 1 */
+ { "tree", 4 }, /* OBJ_TREE = 2 */
+ { "blob", 4 }, /* OBJ_BLOB = 3 */
+ { "tag", 3 }, /* OBJ_TAG = 4 */
};
const char *typename(unsigned int type)
{
- if (type >= ARRAY_SIZE(object_type_strings))
+ if (type >= ARRAY_SIZE(object_type_name))
return NULL;
- return object_type_strings[type];
+ return object_type_name[type].str;
}
int type_from_string_gently(const char *str, ssize_t len, int gentle)
@@ -40,8 +43,9 @@ int type_from_string_gently(const char *str, ssize_t len, int gentle)
if (len < 0)
len = strlen(str);
- for (i = 1; i < ARRAY_SIZE(object_type_strings); i++)
- if (!strncmp(str, object_type_strings[i], len))
+ for (i = 1; i < ARRAY_SIZE(object_type_name); i++)
+ if (object_type_name[i].len == len &&
+ !strncmp(str, object_type_name[i].str, len))
return i;
if (gentle)
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] type_from_string_gently: make sure length matches
2015-04-17 20:54 ` Junio C Hamano
@ 2015-04-17 21:07 ` Jeff King
2015-04-17 21:11 ` Junio C Hamano
0 siblings, 1 reply; 4+ messages in thread
From: Jeff King @ 2015-04-17 21:07 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Schindelin, Karthik Nayak
On Fri, Apr 17, 2015 at 01:54:27PM -0700, Junio C Hamano wrote:
> > Since the strings we are matching are literals, we could also record
> > their sizes in the object_type_strings array and check the length first
> > before even calling strncmp. I doubt this is a performance hot-spot,
> > though.
> >
> > You could also potentially just use strlen(object_type_strings[i]), but
> > I'm not sure if compilers will optimize out the strlen in this case,
> > since it is in a loop.
>
> That thought crossed my mind while reading your patch. It could
> even make it go faster if we made object_type_strings into an array
> of counted strings (i.e. "struct { const char *str; int len; }")
> and then took advantage of the fact that we have lengths of both.
Right, that was what I meant.
I'd be surprised if it appreciably speeds things up, but I guess it is
not too complicated to do.
> +static struct {
> + const char *str;
> + int len;
> +} object_type_name[] = {
> + { NULL, 0 }, /* OBJ_NONE = 0 */
> + { "commit", 6 }, /* OBJ_COMMIT = 1 */
> + { "tree", 4 }, /* OBJ_TREE = 2 */
> + { "blob", 4 }, /* OBJ_BLOB = 3 */
> + { "tag", 3 }, /* OBJ_TAG = 4 */
> };
I had envisioned a macro like:
#define SIZED_STRING(x) { (x), (sizeof(x) - 1) }
though perhaps that is overkill for such a short list (that we don't
even expect to change).
-Peff
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] type_from_string_gently: make sure length matches
2015-04-17 21:07 ` Jeff King
@ 2015-04-17 21:11 ` Junio C Hamano
0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2015-04-17 21:11 UTC (permalink / raw)
To: Jeff King; +Cc: git, Johannes Schindelin, Karthik Nayak
Jeff King <peff@peff.net> writes:
> I'd be surprised if it appreciably speeds things up, but I guess it is
> not too complicated to do.
>
>> +static struct {
>> + const char *str;
>> + int len;
>> +} object_type_name[] = {
>> + { NULL, 0 }, /* OBJ_NONE = 0 */
>> + { "commit", 6 }, /* OBJ_COMMIT = 1 */
>> + { "tree", 4 }, /* OBJ_TREE = 2 */
>> + { "blob", 4 }, /* OBJ_BLOB = 3 */
>> + { "tag", 3 }, /* OBJ_TAG = 4 */
>> };
>
> I had envisioned a macro like:
>
> #define SIZED_STRING(x) { (x), (sizeof(x) - 1) }
>
> though perhaps that is overkill for such a short list (that we don't
> even expect to change).
Sounds good (either way ;-)
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2015-04-17 21:11 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-04-17 14:52 [PATCH] type_from_string_gently: make sure length matches Jeff King
2015-04-17 20:54 ` Junio C Hamano
2015-04-17 21:07 ` Jeff King
2015-04-17 21:11 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).