From: "Shawn O. Pearce" <spearce@spearce.org>
To: Junio C Hamano <gitster@pobox.com>, Tay Ray Chuan <rctay89@gmail.com>
Cc: git@vger.kernel.org, Johannes Sixt <j6t@kdbg.org>,
Ilari Liusvaara <ilari.liusvaara@elisanet.fi>,
Michael J Gruber <git@drmicha.warpmail.net>,
Christian Halstrick <christian.halstrick@gmail.com>,
jan.sievers@sap.com, Matthias Sohn <matthias.sohn@sap.com>
Subject: [PATCH v4 10/11] http-fetch: Use index-pack rather than verify-pack to check packs
Date: Mon, 19 Apr 2010 07:23:09 -0700
Message-ID: <1271686990-16363-6-git-send-email-spearce@spearce.org>
In-Reply-To: <20100418115744.0000238b@unknown>
To ensure we don't leave a corrupt pack file positioned as though
it were a valid pack file, run index-pack on the temporary pack
before we rename it to its final name. If index-pack crashes out
when it discovers file corruption (e.g. GitHub's error HTML at the
end of the file), simply delete the temporary files to clean up.
By waiting until the pack has been validated before we move it
to its final name, we eliminate a race condition where another
concurrent reader might try to access the pack at the same time
that we are still trying to verify it's not corrupt.
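For illustration only (this is not code from the patch): the
validate-then-rename ordering described above might be sketched as
below. The names install_if_valid, validate_fn and has_pack_signature
are hypothetical, and the signature check is only a cheap stand-in
for the full index-pack run the patch actually performs.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical validator type: returns 0 when the file at `path` is sound. */
typedef int (*validate_fn)(const char *path);

/*
 * Stand-in validator (assumption: NOT what the patch does; the patch
 * runs index-pack). It only checks that the file begins with the
 * 4-byte "PACK" signature that every pack file starts with.
 */
static int has_pack_signature(const char *path)
{
	char buf[4];
	size_t n;
	FILE *f = fopen(path, "rb");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf), f);
	fclose(f);
	return (n == sizeof(buf) && !memcmp(buf, "PACK", 4)) ? 0 : -1;
}

/*
 * Validate `tmp` first and rename it to `final` only on success.
 * On any failure the temporary file is removed, so a corrupt file
 * is never visible to other readers under the final name.
 * Returns 0 on success, -1 on failure.
 */
static int install_if_valid(const char *tmp, const char *final,
			    validate_fn validate)
{
	if (validate(tmp)) {
		unlink(tmp);
		return -1;
	}
	if (rename(tmp, final)) {
		unlink(tmp);
		return -1;
	}
	return 0;
}
```

A caller would use something like
install_if_valid("pack-X.pack.temp", "pack-X.pack", has_pack_signature),
substituting a real integrity check for the stand-in validator.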
Switching from verify-pack to index-pack is a change in behavior,
but it should turn out better for users. The index-pack algorithm
tries to minimize disk seeks, as well as the number of times any
given object is inflated, by organizing its work along delta chains.
The verify-pack logic does not attempt to do this, thrashing the
delta base cache and the filesystem cache.
By recreating the index file locally, we also can automatically
upgrade from a v1 pack table of contents to v2. This makes the
CRC32 data available for use during later repacks, even if the
server didn't have them on hand.
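As a rough aid for telling the two index layouts apart (again not
part of this patch): a v2 index begins with the 4-byte magic
"\377tOc" followed by a 4-byte big-endian version number, while a
v1 index has no header and starts directly with its fan-out table.
The helper name pack_idx_version below is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Classify the first bytes of a pack index file.
 * Returns 2 for a v2 header, 1 when the v2 magic is absent (the
 * headerless v1 layout), and 0 when fewer than 8 bytes are given
 * or the magic is followed by a version other than 2.
 */
static int pack_idx_version(const unsigned char *hdr, size_t len)
{
	static const unsigned char magic[4] = { 0xff, 't', 'O', 'c' };
	uint32_t version;

	if (len < 8)
		return 0;
	if (memcmp(hdr, magic, sizeof(magic)))
		return 1;	/* no magic: treat as v1 fan-out data */

	/* The version field is stored in network (big-endian) byte order. */
	version = ((uint32_t)hdr[4] << 24) | ((uint32_t)hdr[5] << 16) |
		  ((uint32_t)hdr[6] << 8) | (uint32_t)hdr[7];
	return version == 2 ? 2 : 0;
}
```

Only the v2 layout carries the per-object CRC32 table, which is why
regenerating the index locally makes that data available even when
the server only had a v1 index.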
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
Moved unlink of index to after the index-pack is successful,
per Tay Ray Chuan's request.
Removed Junio SOB line since the logic changed.
http.c | 44 +++++++++++++++++++++++++++++++++++++-------
t/t5550-http-fetch.sh | 15 +++++++++++++++
2 files changed, 52 insertions(+), 7 deletions(-)
diff --git a/http.c b/http.c
index 9c62632..2ebd679 100644
--- a/http.c
+++ b/http.c
@@ -1,6 +1,7 @@
#include "http.h"
#include "pack.h"
#include "sideband.h"
+#include "run-command.h"
int data_received;
int active_requests;
@@ -998,11 +999,14 @@ void release_http_pack_request(struct http_pack_request *preq)
int finish_http_pack_request(struct http_pack_request *preq)
{
- int ret;
struct packed_git **lst;
struct packed_git *p = preq->target;
+ char *tmp_idx;
+ struct child_process ip;
+ const char *ip_argv[8];
+
+ close_pack_index(p);
- p->pack_size = ftell(preq->packfile);
fclose(preq->packfile);
preq->packfile = NULL;
preq->slot->local = NULL;
@@ -1012,13 +1016,39 @@ int finish_http_pack_request(struct http_pack_request *preq)
lst = &((*lst)->next);
*lst = (*lst)->next;
- ret = move_temp_to_file(preq->tmpfile, sha1_pack_name(p->sha1));
- if (ret)
- return ret;
- if (verify_pack(p))
+ tmp_idx = xstrdup(preq->tmpfile);
+ strcpy(tmp_idx + strlen(tmp_idx) - strlen(".pack.temp"),
+ ".idx.temp");
+
+ ip_argv[0] = "index-pack";
+ ip_argv[1] = "-o";
+ ip_argv[2] = tmp_idx;
+ ip_argv[3] = preq->tmpfile;
+ ip_argv[4] = NULL;
+
+ memset(&ip, 0, sizeof(ip));
+ ip.argv = ip_argv;
+ ip.git_cmd = 1;
+ ip.no_stdin = 1;
+ ip.no_stdout = 1;
+
+ if (run_command(&ip)) {
+ unlink(preq->tmpfile);
+ unlink(tmp_idx);
+ free(tmp_idx);
return -1;
- install_packed_git(p);
+ }
+
+ unlink(sha1_pack_index_name(p->sha1));
+ if (move_temp_to_file(preq->tmpfile, sha1_pack_name(p->sha1))
+ || move_temp_to_file(tmp_idx, sha1_pack_index_name(p->sha1))) {
+ free(tmp_idx);
+ return -1;
+ }
+
+ install_packed_git(p);
+ free(tmp_idx);
return 0;
}
diff --git a/t/t5550-http-fetch.sh b/t/t5550-http-fetch.sh
index 78c31c9..1a4dfc9 100755
--- a/t/t5550-http-fetch.sh
+++ b/t/t5550-http-fetch.sh
@@ -62,6 +62,21 @@ test_expect_success 'fetch packed objects' '
git clone $HTTPD_URL/dumb/repo_pack.git
'
+test_expect_success 'fetch notices corrupt pack' '
+ cp -R "$HTTPD_DOCUMENT_ROOT_PATH"/repo_pack.git "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
+ (cd "$HTTPD_DOCUMENT_ROOT_PATH"/repo_bad1.git &&
+ p=`ls objects/pack/pack-*.pack` &&
+ chmod u+w $p &&
+ printf %0256d 0 | dd of=$p bs=256 count=1 seek=1 conv=notrunc
+ ) &&
+ mkdir repo_bad1.git &&
+ (cd repo_bad1.git &&
+ git --bare init &&
+ test_must_fail git --bare fetch $HTTPD_URL/dumb/repo_bad1.git &&
+ test 0 = `ls objects/pack/pack-*.pack | wc -l`
+ )
+'
+
test_expect_success 'did not use upload-pack service' '
grep '/git-upload-pack' <"$HTTPD_ROOT_PATH"/access.log >act
: >exp
--
1.7.1.rc1.279.g22727