git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Jeff King <peff@peff.net>,
	 Junio C Hamano <gitster@pobox.com>
Subject: [PATCH v3 0/2] packfile: avoid access(3p) calls for missing packs
Date: Wed, 28 May 2025 14:24:09 +0200	[thread overview]
Message-ID: <20250528-pks-pack-avoid-stats-on-missing-v3-0-4cbd0e14bed8@pks.im> (raw)
In-Reply-To: <20250516-pks-pack-avoid-stats-on-missing-v1-1-e2ef4d8798a3@pks.im>

Hi,

this patch addresses an issue we have recently seen in our production
systems due to a stale MIDX. The MIDX contained entries for packfiles
that didn't exist anymore, which caused Git to repeatedly look up those
packfiles. Each missing packfile resulted in four repeated syscalls:
three access(3p) calls to check for supporting data structures, and one
call to stat(3p) to check for the packfile itself. The first three calls
are essentially wasted though when the stat(3p) call itself fails, which
is being fixed by this patch.

I doubt that the patch matters in almost any repository, but given that
the refactoring is trivial I thought to submit the patch regardless of
that. Another step would be to introduce a negative lookup cache -- but
that would be a bit more involved, so I decided against it for now as I
don't want to introduce complexity for dubious gains.

Changes in v2:
  - Drop the patch that reorders syscalls and add a comment explaining
    why the order is important.
  - Add a negative lookup cache for indexed packfiles.
  - Link to v1: https://lore.kernel.org/r/20250516-pks-pack-avoid-stats-on-missing-v1-1-e2ef4d8798a3@pks.im

Changes in v3:
  - Use a macro to hide away the `(void *)(intptr_t)-1` magic.
  - Link to v2: https://lore.kernel.org/r/20250520-pks-pack-avoid-stats-on-missing-v2-0-333c5217fb05@pks.im

Thanks!

Patrick

---
Patrick Steinhardt (2):
      packfile: explain ordering of how we look up auxiliary pack files
      midx: stop repeatedly looking up nonexistent packfiles

 midx.c     | 12 ++++++++++--
 packfile.c | 11 +++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

Range-diff versus v2:

1:  cf601a63e98 = 1:  c4087674967 packfile: explain ordering of how we look up auxiliary pack files
2:  e3108f7ce48 ! 2:  2fa98231464 midx: stop repeatedly looking up nonexistent packfiles
    @@ Commit message
         Signed-off-by: Patrick Steinhardt <ps@pks.im>
     
      ## midx.c ##
    +@@
    + #include "pack-bitmap.h"
    + #include "pack-revindex.h"
    + 
    ++#define MIDX_PACK_ERROR ((void *)(intptr_t)-1)
    ++
    + int midx_checksum_valid(struct multi_pack_index *m);
    + void clear_midx_files_ext(const char *object_dir, const char *ext,
    + 			  const char *keep_hash);
     @@ midx.c: void close_midx(struct multi_pack_index *m)
      	munmap((unsigned char *)m->data, m->data_len);
      
      	for (i = 0; i < m->num_packs; i++) {
     -		if (m->packs[i])
    -+		if (m->packs[i] && m->packs[i] != (void *)(intptr_t)-1)
    ++		if (m->packs[i] && m->packs[i] != MIDX_PACK_ERROR)
      			m->packs[i]->multi_pack_index = 0;
      	}
      	FREE_AND_NULL(m->packs);
    @@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
      
      	pack_int_id = midx_for_pack(&m, pack_int_id);
      
    -+	if (m->packs[pack_int_id] == (void *)(intptr_t)-1)
    ++	if (m->packs[pack_int_id] == MIDX_PACK_ERROR)
     +		return 1;
      	if (m->packs[pack_int_id])
      		return 0;
    @@ midx.c: int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
      
     -	if (!p)
     +	if (!p) {
    -+		m->packs[pack_int_id] = (void *)(intptr_t)-1;
    ++		m->packs[pack_int_id] = MIDX_PACK_ERROR;
      		return 1;
     +	}
      
    @@ midx.c: struct packed_git *nth_midxed_pack(struct multi_pack_index *m,
      				   uint32_t pack_int_id)
      {
      	uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id);
    -+	if (m->packs[local_pack_int_id] == (void *)(intptr_t)-1)
    ++	if (m->packs[local_pack_int_id] == MIDX_PACK_ERROR)
     +		return NULL;
      	return m->packs[local_pack_int_id];
      }

---
base-commit: 1a8a4971cc6c179c4dd711f4a7f5d7178f4b3ab7
change-id: 20250516-pks-pack-avoid-stats-on-missing-8e3b75755cf0


  parent reply	other threads:[~2025-05-28 12:24 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-05-16  8:55 [PATCH] packfile: avoid access(3p) calls for missing packs Patrick Steinhardt
2025-05-16 18:34 ` Junio C Hamano
2025-05-19  6:52   ` Jeff King
2025-05-19 15:46     ` Junio C Hamano
2025-05-20  6:45     ` Patrick Steinhardt
2025-05-22  5:28       ` Jeff King
2025-05-23  1:02     ` Taylor Blau
2025-05-23  2:03       ` Jeff King
2025-05-20  9:53 ` [PATCH v2 0/2] " Patrick Steinhardt
2025-05-20  9:53   ` [PATCH v2 1/2] packfile: explain ordering of how we look up auxiliary pack files Patrick Steinhardt
2025-05-23  1:03     ` Taylor Blau
2025-05-20  9:53   ` [PATCH v2 2/2] midx: stop repeatedly looking up nonexistent packfiles Patrick Steinhardt
2025-05-22  5:32     ` Jeff King
2025-05-22 15:47       ` Junio C Hamano
2025-05-22 16:59         ` Jeff King
2025-05-22 18:44           ` Junio C Hamano
2025-05-23  1:22           ` Taylor Blau
2025-05-23  2:08             ` Jeff King
2025-05-23 17:46               ` Taylor Blau
2025-05-25 18:41                 ` [PATCH 0/5] midx: improve prepare_midx_pack() ergonomics Taylor Blau
2025-05-25 18:41                   ` [PATCH 1/5] pack-bitmap.c: fix broken warning() when missing MIDX'd pack Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:00                       ` Taylor Blau
2025-05-25 18:41                   ` [PATCH 2/5] midx-write.c: guard against incremental MIDXs in want_included_pack() Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:08                       ` Taylor Blau
2025-05-25 18:41                   ` [PATCH 3/5] midx-write.c: simplify fill_packs_from_midx() Taylor Blau
2025-05-26  7:23                     ` Patrick Steinhardt
2025-05-28  2:15                       ` Taylor Blau
2025-05-25 18:42                   ` [PATCH 4/5] midx-write.c: extract inner loop from fill_packs_from_midx() Taylor Blau
2025-05-25 18:42                   ` [PATCH 5/5] midx: return a `packed_git` pointer from `prepare_midx_pack()` Taylor Blau
2025-05-26  7:24                     ` Patrick Steinhardt
2025-05-28  2:18                       ` Taylor Blau
2025-05-28 11:53                         ` Patrick Steinhardt
2025-05-28 22:58                   ` [PATCH v2 0/4] midx: improve prepare_midx_pack() ergonomics Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 1/4] midx: access pack names through `nth_midxed_pack_name()` Taylor Blau
2025-05-29 20:47                       ` Junio C Hamano
2025-06-03 22:22                         ` Taylor Blau
2025-05-29 20:51                       ` Junio C Hamano
2025-06-03 22:23                         ` Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 2/4] midx-write.c: guard against incremental MIDXs in want_included_pack() Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 3/4] midx-write.c: extract inner loop from fill_packs_from_midx() Taylor Blau
2025-05-28 22:59                     ` [PATCH v2 4/4] midx: return a `packed_git` pointer from `prepare_midx_pack()` Taylor Blau
2025-05-30  6:50                       ` Jeff King
2025-06-03 22:27                         ` Taylor Blau
2025-08-28 23:25                           ` Junio C Hamano
2025-05-23  1:31       ` [PATCH v2 2/2] midx: stop repeatedly looking up nonexistent packfiles Taylor Blau
2025-05-23  2:18         ` Jeff King
2025-05-21 13:24   ` [PATCH v2 0/2] packfile: avoid access(3p) calls for missing packs Junio C Hamano
2025-05-28 12:24 ` Patrick Steinhardt [this message]
2025-05-28 12:24   ` [PATCH v3 1/2] packfile: explain ordering of how we look up auxiliary pack files Patrick Steinhardt
2025-05-28 12:24   ` [PATCH v3 2/2] midx: stop repeatedly looking up nonexistent packfiles Patrick Steinhardt
2025-05-30  6:27   ` [PATCH v3 0/2] packfile: avoid access(3p) calls for missing packs Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250528-pks-pack-avoid-stats-on-missing-v3-0-4cbd0e14bed8@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).