All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kousik Sanagavarapu <five231003@gmail.com>
To: jonathantanmy@google.com
Cc: git@vger.kernel.org, gitster@pobox.com
Subject: Re: [PATCH] index-pack: remove fetch_if_missing=0
Date: Sat, 11 Mar 2023 11:52:19 +0530	[thread overview]
Message-ID: <20230311062219.22325-1-five231003@gmail.com> (raw)
In-Reply-To: <20230310211321.4135748-1-jonathantanmy@google.com>

On Sat, 11 Mar 2023 at 02:43, Jonathan Tan <jonathantanmy@google.com> wrote:
>
> Junio C Hamano <gitster@pobox.com> writes:
> > > Hence, use has_object() to check for the existence of an object, which
> > > has the default behavior of not lazy-fetching in a partial clone. It is
> > > worth mentioning that this is the only place where there is potential for
> > > lazy-fetching and all other cases are properly handled, making it safe to
> > > remove this global here.
> >
> > This paragraph is very well explained.
>
> It might be good if the "all other cases" were enumerated here in the
> commit message (since the consequence of missing a case might be an
> infinite loop of fetching).
>

I will make the change.

> > OK.  The comment describes the design choice we made to flip the
> > fetch_if_missing flag off.  The old world-view was that we would
> > notice a breakage by non-functioning index-pack when a lazy clone is
> > missing objects that we need by disabling auto-fetching, and we
> > instead explicitly handle any missing and necessary objects by lazy
> > fetching (like "when we lack REF_DELTA bases").  It does sound like
> > a conservative thing to do, compared to the opposite approach we are
> > taking with this patch, i.e. we would not fail if we tried to access
> > objects we do not need to, because we have lazy fetching enabled,
> > and we just ended up with bloated object store nobody may notice.
> >
> > To protect us from future breakage that can come from the new
> > approach, it is a very good thing that you added new tests to ensure
> > no unnecessary lazy fetching is done (I am not offhand sure if that
> > test is sufficient, though).
>
> I don't think the test is sufficient - I'll explain that below.
>
> > > +test_expect_success 'index-pack does not lazy-fetch when checking for sha1 collsions' '
> > > +   rm -rf server promisor-remote client repo trace &&
> > > +
> > > +   # setup
> > > +   git init server &&
> > > +   for i in 1 2 3 4
> > > +   do
> > > +           echo $i >server/file$i &&
> > > +           git -C server add file$i &&
> > > +           git -C server commit -am "Commit $i" || return 1
> > > +   done &&
> > > +   git -C server config --local uploadpack.allowFilter 1 &&
> > > +   git -C server config --local uploadpack.allowAnySha1InWant 1 &&
> > > +   HASH=$(git -C server hash-object file3) &&
> > > +
> > > +   git init promisor-remote &&
> > > +   git -C promisor-remote fetch --keep "file://$(pwd)/server" &&
> > > +
> > > +   git clone --no-checkout --filter=blob:none "file://$(pwd)/server" client &&
> > > +   git -C client remote set-url origin "file://$(pwd)/promisor-remote" &&
> > > +   git -C client config extensions.partialClone 1 &&
> > > +   git -C client config remote.origin.promisor 1 &&
> > > +
> > > +   git init repo &&
> > > +   echo "5" >repo/file5 &&
> > > +   git -C repo config --local uploadpack.allowFilter 1 &&
> > > +   git -C repo config --local uploadpack.allowAnySha1InWant 1 &&
>
> The file5 isn't committed?

That is a blunder.

>
> [...]
>
> So I think the way to do this is to have 3 repositories like the author
> is doing now (server, client, and repo), and do it as follows:
>  - create "server", one commit will do
>  - clone "server" into "client" (partial clone)
>  - clone "server" into "another-remote" (not partial clone)
>  - add a file ("new-file") to "server", commit it, and pull from "another-remote"
>  - fetch from "another-remote" into "client"
>
> This way, "client" will need to verify that the hash of "new-file" has
> no collisions with any object it currently has. If there is no bug,
> "new-file" will never be fetched from "server", and if there is a bug,
> "new-file" will be fetched.
>

So, we can lose the "promisor-remote" in the original test and make the
"server" itself a promisor-remote?

Thanks for the review

> One problem is that if there is a bug, such a test will cause an
> infinite loop (we fetch "new-file", so we want to check it for
> collisions, and because of the bug, we fetch "new-file" again, which we
> check for collisions, and so on) which might be problematic for things
> like CI. But we might be able to treat timeouts as the same as test
> failures, so this should be OK.

  parent reply	other threads:[~2023-03-11  6:22 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-25  5:24 [PATCH] index-pack: remove fetch_if_missing=0 Kousik Sanagavarapu
2023-02-27 16:56 ` Kousik Sanagavarapu
2023-02-27 22:14 ` Jonathan Tan
2023-02-28  3:54   ` Kousik Sanagavarapu
2023-03-10 18:30 ` [PATCH v2] " Kousik Sanagavarapu
2023-03-10 20:30   ` Junio C Hamano
2023-03-10 21:13     ` Jonathan Tan
2023-03-10 21:41       ` Junio C Hamano
2023-03-11  2:59         ` Jonathan Tan
2023-03-12 17:16         ` Kousik Sanagavarapu
2023-03-11  6:22       ` Kousik Sanagavarapu [this message]
2023-03-11  6:00     ` [PATCH] " Kousik Sanagavarapu
2023-03-13 18:15   ` [PATCH v3] " Kousik Sanagavarapu
2023-03-13 19:17     ` Junio C Hamano
2023-03-13 19:18     ` Junio C Hamano
2023-03-17 17:56     ` [PATCH v4] " Kousik Sanagavarapu
2023-03-17 22:58       ` Junio C Hamano
2023-03-19  6:17         ` Kousik Sanagavarapu
2023-03-11 20:01 ` [PATCH] " Sean Allred
2023-03-11 20:37   ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230311062219.22325-1-five231003@gmail.com \
    --to=five231003@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jonathantanmy@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.