From: Thomas Rast <trast@inf.ethz.ch>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: <git@vger.kernel.org>, Junio C Hamano <gitster@pobox.com>,
Thomas Rast <trast@student.ethz.ch>
Subject: Re: [PATCH] unpack_entry: invalidate newly added cache entry in case of error
Date: Tue, 30 Apr 2013 10:27:53 +0200 [thread overview]
Message-ID: <87ppxcxw1i.fsf@linux-k42r.v.cablecom.net> (raw)
In-Reply-To: <1367288992-14979-1-git-send-email-pclouds@gmail.com> ("Nguyễn Thái Ngọc Duy"'s message of "Tue, 30 Apr 2013 09:29:52 +0700")
Nguyễn Thái Ngọc Duy <pclouds@gmail.com> writes:
> In this particular code path, we add "base" to the delta base
> cache. Then decide to free it, but we forgot about a dangling pointer
> in the cache. Invalidate that entry when we free "base".
>
> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
> ---
> Some of my changes triggered a double free fault at "free(base);" in
> t5303. This looks like a correct thing to do, but I may be missing
> something (I'm not even sure how it happened). Please check.
Can you describe how you triggered it?
I ran all of origin/pu through valgrind tests just yesterday, and it
found nothing (yay!), so it doesn't seem to reproduce here?
> sha1_file.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index 64228a2..99ead7c 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1912,7 +1912,8 @@ void clear_delta_base_cache(void)
> release_delta_base_cache(&delta_base_cache[p]);
> }
>
> -static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
> +static struct delta_base_cache_entry *
> +add_delta_base_cache(struct packed_git *p, off_t base_offset,
> void *base, unsigned long base_size, enum object_type type)
> {
> unsigned long hash = pack_entry_hash(p, base_offset);
> @@ -1947,6 +1948,7 @@ static void add_delta_base_cache(struct packed_git *p, off_t base_offset,
> ent->lru.prev = delta_base_cache_lru.prev;
> delta_base_cache_lru.prev->next = &ent->lru;
> delta_base_cache_lru.prev = &ent->lru;
> + return ent;
> }
>
> static void *read_object(const unsigned char *sha1, enum object_type *type,
> @@ -2086,12 +2088,13 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
> void *delta_data;
> void *base = data;
> unsigned long delta_size, base_size = size;
> + struct delta_base_cache_entry *ent = NULL;
> int i;
>
> data = NULL;
>
> if (base)
> - add_delta_base_cache(p, obj_offset, base, base_size, type);
> + ent = add_delta_base_cache(p, obj_offset, base, base_size, type);
>
> if (!base) {
> /*
> @@ -2129,6 +2132,8 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
> "at offset %"PRIuMAX" from %s",
> (uintmax_t)curpos, p->pack_name);
> free(base);
> + if (ent)
> + ent->data = NULL;
> data = NULL;
> continue;
> }
Why not clear_delta_base_cache_entry(), which also handles updating the
lru pointers?
Also I wonder if removing free(base) is the right fix: since the failure
is in decompressing the delta, the base might again be useful and we
should keep it cached.
Either way, one of those needs to be done. I think the mistake happened
in my abe601b (sha1_file: remove recursion in unpack_entry, 2013-03-27).
The change concerning this section was roughly:
- base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0);
- if (!base) {
[...snip error path...]
- }
-
- delta_data = unpack_compressed_entry(p, w_curs, curpos, delta_size);
- if (!delta_data) {
- error("failed to unpack compressed delta "
- "at offset %"PRIuMAX" from %s",
- (uintmax_t)curpos, p->pack_name);
- free(base);
- return NULL;
- }
[...]
- add_delta_base_cache(p, base_offset, base, base_size, *type);
transforms into
+ if (base)
+ add_delta_base_cache(p, obj_offset, base, base_size, type);
+
+ if (!base) {
[...snip error path...]
+ }
+
+ i = --delta_stack_nr;
+ obj_offset = delta_stack[i].obj_offset;
+ curpos = delta_stack[i].curpos;
+ delta_size = delta_stack[i].size;
+
+ if (!base)
+ continue;
+
+ delta_data = unpack_compressed_entry(p, &w_curs, curpos, delta_size);
+
+ if (!delta_data) {
+ error("failed to unpack compressed delta "
+ "at offset %"PRIuMAX" from %s",
+ (uintmax_t)curpos, p->pack_name);
+ free(base);
+ data = NULL;
+ continue;
+ }
So it's not your mistake at all. I propose this explanation for the
commit message of whatever fix you make:
In the !delta_data error path of unpack_entry(), we run free(base).
This became a window for use-after-free() in abe601b (sha1_file: remove
recursion in unpack_entry, 2013-03-27), as follows:
Before abe601b, we got the 'base' from cache_or_unpack_entry(..., 0);
keep_cache=0 tells it to also remove that entry. So the 'base' is at
this point not cached, and freeing it in the error path is the right
thing.
After abe601b, the structure changed: we use a three-phase approach
where phase 1 finds the innermost base or a base that is already in the
cache. In phase 3 we therefore know that all bases we unpack are not
part of the delta cache yet. (Observe that we pop from the cache in
phase 1, so this is also true for the very first base.) So we make no
further attempts to look up the bases in the cache, and just call
add_delta_base_cache() on every base object we have assembled.
But the !delta_data error path remained unchanged, and now calls free()
on a base that has already been entered in the cache. This means that
there is a use-after-free if we later use the same base again.
--
Thomas Rast
trast@{inf,student}.ethz.ch
next prev parent reply other threads:[~2013-04-30 8:28 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-04-30 2:29 [PATCH] unpack_entry: invalidate newly added cache entry in case of error Nguyễn Thái Ngọc Duy
2013-04-30 8:27 ` Thomas Rast [this message]
2013-04-30 10:25 ` Duy Nguyen
2013-04-30 12:53 ` Thomas Rast
2013-04-30 13:01 ` Duy Nguyen
2013-04-30 22:39 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ppxcxw1i.fsf@linux-k42r.v.cablecom.net \
--to=trast@inf.ethz.ch \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=pclouds@gmail.com \
--cc=trast@student.ethz.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.