All of lore.kernel.org
 help / color / mirror / Atom feed
From: Victoria Dye <vdye@github.com>
To: Jeff King <peff@peff.net>, Derrick Stolee <derrickstolee@github.com>
Cc: git@vger.kernel.org
Subject: Re: t9210-scalar.sh fails with SANITIZE=undefined
Date: Thu, 22 Sep 2022 15:09:52 -0700	[thread overview]
Message-ID: <50c57a60-8346-6952-93d9-432a70ef74c5@github.com> (raw)
In-Reply-To: <Yyyzk3FVjmms7dkO@coredump.intra.peff.net>

Jeff King wrote:
> On Thu, Sep 22, 2022 at 08:35:22AM -0400, Derrick Stolee wrote:
> 
>>> I didn't dig further. It's obviously new in v2.38.0-rc1, but I'm not
>>> sure it's a show-stopper. It _might_ have been there all along, and is
>>> just now surfacing. Or it might be in an existing experimental feature
>>> that just wasn't exercised enough in the tests. Or if it really is new
>>> in scalar, then it will only hurt people using scalar, which didn't
>>> exist before. So I don't think it's a regression in the strictest sense,
>>> but it might be nice to get a more accurate understanding of the problem
>>> before the release.
>>
>> Interesting find!
>>
>> Here are the index-related settings that Scalar sets as of -rc1:
>>
>> * core.preloadIndex=true
>> * index.threads=true
>> * index.version=4
>>
>> My gut feeling is that index.version=4 might be the culprit. I thought
>> we tested GIT_TEST_INDEX_VERSION=4 in some CI modes, but apparently we
>> do not. Do you get the same error in other tests with that environment
>> variable?
> 
> Yeah, that seems by far the most likely of those three. And indeed,
> running with GIT_TEST_INDEX_VERSION=4 causes even t0000 to fail with the
> same problem. A minimal reproduction in git.git is just:
> 
>   make SANITIZE=undefined
>   git clone . tmp
>   cd tmp
>   rm .git/index
>   export GIT_INDEX_VERSION=4
>   ../git reset --hard ;# ok, writes v4 index
>   ../git reset --hard ;# fails reading unaligned v4 index
> 
> So it seems like a problem with the v4 format in general. Which...yikes.

Other than allowing us to use a (non-packed) 'struct ondisk_cache_entry' to
parse the index entries, is there any reason why the on-disk index entries
should (or need to be) 4-byte aligned? If not, we could update how we read
the 'ondisk' index entry in 'create_from_disk()' to avoid the misalignment.
E.g.:

------------------8<------------------8<------------------8<------------------
diff --git a/read-cache.c b/read-cache.c
index b09128b188..f132a3f256 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -1875,7 +1875,7 @@ static int read_index_extension(struct index_state *istate,
 
 static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 					    unsigned int version,
-					    struct ondisk_cache_entry *ondisk,
+					    const char *ondisk,
 					    unsigned long *ent_size,
 					    const struct cache_entry *previous_ce)
 {
@@ -1883,7 +1883,7 @@ static struct cache_entry *create_from_disk(struct mem_pool *ce_mem_pool,
 	size_t len;
 	const char *name;
 	const unsigned hashsz = the_hash_algo->rawsz;
-	const uint16_t *flagsp = (const uint16_t *)(ondisk->data + hashsz);
+	const char *flagsp = ondisk + offsetof(struct ondisk_cache_entry, data) + hashsz;
 	unsigned int flags;
 	size_t copy_len = 0;
 	/*
------------------>8------------------>8------------------>8------------------

the do the same sort of conversion with the rest of the function. It's
certainly uglier than just using the 'struct ondisk_index_entry *' pointer,
but it should avoid the misaligned addressing error.

> 
> -Peff


  reply	other threads:[~2022-09-22 22:10 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-22 10:04 t9210-scalar.sh fails with SANITIZE=undefined Jeff King
2022-09-22 12:35 ` Derrick Stolee
2022-09-22 19:12   ` Jeff King
2022-09-22 22:09     ` Victoria Dye [this message]
2022-09-22 22:27       ` Jeff King
2022-09-22 22:56         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50c57a60-8346-6952-93d9-432a70ef74c5@github.com \
    --to=vdye@github.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.