git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Elijah Newren <newren@gmail.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Taylor Blau <me@ttaylorr.com>,
	git@vger.kernel.org, Jeff King <peff@peff.net>,
	 Patrick Steinhardt <ps@pks.im>
Subject: Re: [PATCH v4 4/6] pack-objects: generate cruft packs at most one object over threshold
Date: Thu, 13 Mar 2025 09:23:35 -0700	[thread overview]
Message-ID: <CABPp-BFSXVkZb76va0CO_w3G+MtNuvC3jv6-9vo-oR3LHW4YNQ@mail.gmail.com> (raw)
In-Reply-To: <xmqqmsdpruqg.fsf@gitster.g>

On Thu, Mar 13, 2025 at 5:16 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Elijah Newren <newren@gmail.com> writes:
>
> >> With below-size and max-size set to say 180 and 200 respectively, an
> >> attempt to combine the crufts may end up filling a cruft pack to 170
> >> but the smallest of the remaining cruft may weigh 40, which means
> >> including it would cause the max-size to be exceeded.  In such a
> >> scenario, there may not be a solution to satisfy given constraints,
> >> i.e. go above the below-size without stay below the max-size.
> >>
> >> So I am not sure if the approach would really solve much.
> >>
> >> Other than that a separate names, especially losing "max" from the
> >> threshold that really does not mean "max", would solve the confusion
> >> that comes from naming, that is.
> >
> > --max-pack-size is a constraint.  --combine-cruft-below-size is not.
> > Think particularly of the case where the user doesn't even have any
> > cruft packs yet and has only accumulated a little bit of cruft.  That
> > option is merely a guide post to say that if it's smaller than that
> > size, then feel free to keep trying to add to it (so long as it
> > doesn't violate constraints such as --max-pack-size).
>
> That is correct and it is why I said the suggestion solves the name
> confusion.  But think about the sample situation, before and after
> such a repack with two thresholds.  You had below- and max-size set
> to 180 and 200 respectively, and a cruft pack of size 170, and you
> failed to grow that cruft pack beyond 180 because the next available
> cruft weighed 40.  Then you'll repeat the exercise again, find 170
> that is smaller than the below- threshold, try to cram more and
> would fail.  Isn't that what Taylor's series wanted to prevent from
> happening, and isn't the two-threshod approach supposed to be a way
> to improve on it?

I don't think "always combine" is necessary for improvement.  Perhaps,
in your example, this round of repacking can't combine things.  But
the next time we want to repack cruft objects and there is anything
new that (individually or collectively) weighs between 10 and 30, we
can add it and get something over the lower threshold and then ignore
the resulting cruft pack it in the future.

In contrast, the single threshold either has to violate the maximum
constraint, or always reconsider everything.  The two threshold system
allows progress to be made (so long as it doesn't just look at the
first biggest object and fail every time), but particularly if you set
the thresholds too close to each other or you just have really large
cruft, then you _sometimes_ might not make progress.

Personally, I think I'd set the --combine-cruft-below-size to half of
--max-pack-size, because that guarantees that any two existing cruft
packs being considered for combining can be, and the resulting
combined cruft pack if big enough can then be ignored in the future.
In other words, this scheme would allow you to always make progress.

Am I missing something?

  reply	other threads:[~2025-03-13 16:23 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-27 18:29 [PATCH 0/2] pack-objects: freshen objects with multi-cruft packs Taylor Blau
2025-02-27 18:29 ` [PATCH 1/2] builtin/repack.c: simplify cruft pack aggregation Taylor Blau
2025-02-27 19:23   ` Elijah Newren
2025-02-27 22:53     ` Taylor Blau
2025-02-28  7:52   ` Patrick Steinhardt
2025-03-04 21:52     ` Elijah Newren
2025-03-05  2:04       ` Junio C Hamano
2025-03-05  0:09     ` Taylor Blau
2025-02-27 18:29 ` [PATCH 2/2] builtin/pack-objects.c: freshen objects from existing cruft packs Taylor Blau
2025-02-27 19:26   ` Elijah Newren
2025-02-27 23:03     ` Taylor Blau
2025-02-27 19:28 ` [PATCH 0/2] pack-objects: freshen objects with multi-cruft packs Elijah Newren
2025-02-27 23:05   ` Taylor Blau
2025-03-04 21:35 ` [PATCH v2 " Taylor Blau
2025-03-04 21:35   ` [PATCH v2 1/2] builtin/repack.c: simplify cruft pack aggregation Taylor Blau
2025-03-04 21:35   ` [PATCH v2 2/2] builtin/pack-objects.c: freshen objects from existing cruft packs Taylor Blau
2025-03-04 22:55   ` [PATCH v2 0/2] pack-objects: freshen objects with multi-cruft packs Elijah Newren
2025-03-05  0:06     ` Taylor Blau
2025-03-05  0:13       ` Taylor Blau
2025-03-05  0:15 ` [PATCH v3 0/1] " Taylor Blau
2025-03-05  0:15   ` [PATCH v3 1/1] builtin/pack-objects.c: freshen objects from existing cruft packs Taylor Blau
2025-03-06 10:31     ` Patrick Steinhardt
2025-03-13 17:32       ` Taylor Blau
2025-03-06 10:31   ` [PATCH v3 0/1] pack-objects: freshen objects with multi-cruft packs Patrick Steinhardt
2025-03-11  0:21 ` [PATCH v4 0/6] " Taylor Blau
2025-03-11  0:21   ` [PATCH v4 1/6] t/t5329-pack-objects-cruft.sh: evict 'repack'-related tests Taylor Blau
2025-03-11  0:21   ` [PATCH v4 2/6] t7704-repack-cruft.sh: consolidate `write_blob()` Taylor Blau
2025-03-11  0:21   ` [PATCH v4 3/6] t/lib-cruft.sh: extract some cruft-related helpers Taylor Blau
2025-03-11  0:21   ` [PATCH v4 4/6] pack-objects: generate cruft packs at most one object over threshold Taylor Blau
2025-03-11 21:59     ` Junio C Hamano
2025-03-12 15:22       ` Taylor Blau
2025-03-12 18:26         ` Junio C Hamano
2025-03-12 19:02           ` Taylor Blau
2025-03-12 19:13             ` Elijah Newren
2025-03-12 19:33               ` Taylor Blau
2025-03-12 20:43               ` Junio C Hamano
2025-03-12 20:49                 ` Elijah Newren
2025-03-13 12:16                   ` Junio C Hamano
2025-03-13 16:23                     ` Elijah Newren [this message]
2025-03-13 17:06                       ` Junio C Hamano
2025-03-11  0:21   ` [PATCH v4 5/6] builtin/repack.c: simplify cruft pack aggregation Taylor Blau
2025-03-11  0:21   ` [PATCH v4 6/6] builtin/pack-objects.c: freshen objects from existing cruft packs Taylor Blau
2025-03-11 20:13   ` [PATCH v4 0/6] pack-objects: freshen objects with multi-cruft packs Junio C Hamano
2025-03-12 15:33     ` Taylor Blau
2025-03-12 18:28       ` Junio C Hamano
2025-03-12 19:04         ` Taylor Blau
2025-03-12 19:46           ` Junio C Hamano
2025-03-12 19:52             ` Taylor Blau
2025-03-13 17:17               ` Junio C Hamano
2025-03-13 17:35                 ` Taylor Blau
2025-03-13  6:29           ` Jeff King
2025-03-13 15:12             ` Junio C Hamano
2025-03-13 18:09 ` [PATCH v5] builtin/pack-objects.c: freshen objects from existing cruft packs Taylor Blau
2025-03-13 18:41   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABPp-BFSXVkZb76va0CO_w3G+MtNuvC3jv6-9vo-oR3LHW4YNQ@mail.gmail.com \
    --to=newren@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=peff@peff.net \
    --cc=ps@pks.im \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).