From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: fox <fox.gbr@townlong-yak.com>,
Eric Sunshine <sunshine@sunshineco.com>,
git@vger.kernel.org
Subject: Re: Bug report: v2.47.0 cannot fetch version 1 pack indexes
Date: Mon, 21 Oct 2024 16:33:15 -0400 [thread overview]
Message-ID: <Zxa6ixwP2aOJdfmL@nand.local> (raw)
In-Reply-To: <20241020024022.GA615104@coredump.intra.peff.net>
On Sat, Oct 19, 2024 at 10:40:22PM -0400, Jeff King wrote:
> diff --git a/http.c b/http.c
> index d59e59f66b..6df032e40f 100644
> --- a/http.c
> +++ b/http.c
> @@ -19,6 +19,7 @@
> #include "string-list.h"
> #include "object-file.h"
> #include "object-store-ll.h"
> +#include "tempfile.h"
>
> static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
> static int trace_curl_data = 1;
> @@ -2388,8 +2389,24 @@ static char *fetch_pack_index(unsigned char *hash, const char *base_url)
> strbuf_addf(&buf, "objects/pack/pack-%s.idx", hash_to_hex(hash));
> url = strbuf_detach(&buf, NULL);
>
> - strbuf_addf(&buf, "%s.temp", sha1_pack_index_name(hash));
> - tmp = strbuf_detach(&buf, NULL);
> + /*
> + * Don't put this into packs/, since it's just temporary and we don't
> + * want to confuse it with our local .idx files. We'll generate our
> + * own index if we choose to download the matching packfile.
> + *
> + * It's tempting to use xmks_tempfile() here, but it's important that
> + * the file not exist, otherwise http_get_file() complains. So we
> + * create a filename that should be unique, and then just register it
> + * as a tempfile so that it will get cleaned up on exit.
> + *
> + * Arguably it would be better to hold on to the tempfile handle so
> + * that we can remove it as soon as we download the pack and generate
> + * the real index, but that might need more surgery.
> + */
> + tmp = xstrfmt("%s/tmp_pack_%s.idx",
> + repo_get_object_directory(the_repository),
> + hash_to_hex(hash));
> + register_tempfile(tmp);
Makes perfect sense, and the comment above here is much appreciated.
I thought about trying to use some intermediate state of the strbuf here
to avoid an extra xstrfmt() call, but couldn't come up with anything I
didn't think was awkward.
> if (http_get_file(url, tmp, NULL) != HTTP_OK) {
> error("Unable to get pack index %s", url);
> @@ -2427,10 +2444,8 @@ static int fetch_and_setup_pack_index(struct packed_git **packs_head,
> }
>
> ret = verify_pack_index(new_pack);
> - if (!ret) {
> + if (!ret)
> close_pack_index(new_pack);
> - ret = finalize_object_file(tmp_idx, sha1_pack_index_name(sha1));
And here's where we stop calling finalize_object_file(). Good.
> - }
> free(tmp_idx);
> if (ret)
> return -1;
> diff --git a/packfile.c b/packfile.c
> index df4ba67719..16d3bcf7f7 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -237,13 +237,22 @@ static struct packed_git *alloc_packed_git(int extra)
> return p;
> }
>
> +static char *pack_path_from_idx(const char *idx_path)
> +{
> + size_t len;
> + if (!strip_suffix(idx_path, ".idx", &len))
> + BUG("idx path does not end in .idx: %s", idx_path);
> + return xstrfmt("%.*s.pack", (int)len, idx_path);
> +}
> +
> struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_path)
> {
> - const char *path = sha1_pack_name(sha1);
> + char *path = pack_path_from_idx(idx_path);
Huh. I would have thought we have such a helper function already. I
guess we probably do, but that it's also defined statically because it's
so easy to write.
In any case, this looks like the right thing to do here. It would be
nice to have a corresponding test here, since unlike the other
finalize_object_file() changes, this one can be provoked
deterministically.
Would you mind submitting this as a bona-fide patch, which I can then
pick up and start merging down?
In any event, I appreciate you figuring out what was going on here and
outlining a couple of paths forward.
Thanks,
Taylor
next prev parent reply other threads:[~2024-10-21 20:33 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-19 23:22 Bug report: v2.47.0 cannot fetch version 1 pack indexes fox
2024-10-20 0:37 ` Eric Sunshine
2024-10-21 20:06 ` Taylor Blau
2024-10-20 1:24 ` Jeff King
2024-10-20 2:40 ` Jeff King
2024-10-21 20:33 ` Taylor Blau [this message]
2024-10-22 5:14 ` Jeff King
2024-10-22 15:18 ` Taylor Blau
2024-10-25 6:41 ` [PATCH 0/11] dumb-http pack index v1 regression + cleanups Jeff King
2024-10-25 6:43 ` [PATCH 01/11] midx: avoid duplicate packed_git entries Jeff King
2024-10-25 21:09 ` Taylor Blau
2024-10-25 6:44 ` [PATCH 02/11] t5550: count fetches in "previously-fetched .idx" test Jeff King
2024-10-25 6:58 ` [PATCH 03/11] dumb-http: store downloaded pack idx as tempfile Jeff King
2024-10-25 21:18 ` Taylor Blau
2024-10-26 6:02 ` Jeff King
2024-10-28 0:14 ` Taylor Blau
2024-10-25 7:00 ` [PATCH 04/11] packfile: drop has_pack_index() Jeff King
2024-10-25 21:27 ` Taylor Blau
2024-10-25 7:00 ` [PATCH 05/11] packfile: drop sha1_pack_name() Jeff King
2024-10-25 7:01 ` [PATCH 06/11] packfile: drop sha1_pack_index_name() Jeff King
2024-10-25 7:02 ` [PATCH 07/11] packfile: warn people away from parse_packed_git() Jeff King
2024-10-25 21:28 ` Taylor Blau
2024-10-25 7:03 ` [PATCH 08/11] http-walker: use object_id instead of bare hash Jeff King
2024-10-25 7:05 ` [PATCH 09/11] packfile: convert find_sha1_pack() to use object_id Jeff King
2024-10-25 7:06 ` [PATCH 10/11] packfile: use object_id in find_pack_entry_one() Jeff King
2024-10-25 21:33 ` Taylor Blau
2024-10-25 7:08 ` [PATCH 11/11] packfile: use oidread() instead of hashcpy() to fill object_id Jeff King
2024-10-25 21:35 ` [PATCH 0/11] dumb-http pack index v1 regression + cleanups Taylor Blau
2024-10-21 20:23 ` Bug report: v2.47.0 cannot fetch version 1 pack indexes Taylor Blau
2024-10-22 5:00 ` Jeff King
2024-10-22 15:50 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zxa6ixwP2aOJdfmL@nand.local \
--to=me@ttaylorr.com \
--cc=fox.gbr@townlong-yak.com \
--cc=git@vger.kernel.org \
--cc=peff@peff.net \
--cc=sunshine@sunshineco.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).