All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josh Steadmon <steadmon@google.com>
To: Taylor Blau <me@ttaylorr.com>
Cc: Derrick Stolee <derrickstolee@github.com>,
	git@vger.kernel.org, lessleydennington@gmail.com,
	gitster@pobox.com, vdye@github.com
Subject: Re: [RFC PATCH] repo-settings: set defaults even when not in a repo
Date: Mon, 28 Mar 2022 12:15:53 -0700	[thread overview]
Message-ID: <YkIJacCaaqFk1MDa@google.com> (raw)
In-Reply-To: <Yjt6mLIfw0V3aVTO@nand.local>

On 2022.03.23 15:52, Taylor Blau wrote:
> On Wed, Mar 23, 2022 at 03:22:13PM -0400, Derrick Stolee wrote:
> > On 3/23/2022 2:03 PM, Josh Steadmon wrote:
> > > prepare_repo_settings() initializes a `struct repository` with various
> > > default config options and settings read from a repository-local config
> > > file. In 44c7e62 (2021-12-06, repo-settings:prepare_repo_settings only
> > > in git repos), prepare_repo_settings was changed to issue a BUG() if it
> > > is called by a process whose CWD is not a Git repository. This approach
> > > was suggested in [1].
> > >
> > > This breaks fuzz-commit-graph, which attempts to parse arbitrary
> > > fuzzing-engine-provided bytes as a commit graph file.
> > > commit-graph.c:parse_commit_graph() calls prepare_repo_settings(), but
> > > since we run the fuzz tests without a valid repository, we are hitting
> > > the BUG() from 44c7e62 for every test case.
> > >
> > > Fix this by refactoring prepare_repo_settings() such that it sets
> > > default options unconditionally; if its process is in a Git repository,
> > > it will also load settings from the local config. This eliminates the
> > > need for a BUG() when not in a repository.
> >
> > I think you have the right idea and this can work.
> 
> Hmmm. To me this feels like bending over backwards in
> `prepare_repo_settings()` to accommodate one particular caller. I'm not
> necessarily opposed to it, but it does feel strange to make
> `prepare_repo_settings()` a noop here, since I would expect that any
> callers who do want to call `prepare_repo_settings()` are likely
> convinced that they are inside of a repository, and it probably should
> be a BUG() if they aren't.
> 
> I was initially thinking that Josh's alternative of introducing and
> calling a lower-level version of `prepare_commit_graph()` that doesn't
> require the use of any repository settings would make sense.
> 
> But when I started looking at implementing it, I became confused at how
> this is supposed to work at all without using a repository. We depend on
> the values parsed from:
> 
>   - commitGraph.generationVersion
>   - commitGraph.readChangedPaths
> 
> to determine which part(s) of the commit graph we do and don't read.
> 
> Looking around, I think I probably inadvertently broke this in
> ab14d0676c (commit-graph: pass a 'struct repository *' in more places,
> 2020-09-09). But prior to ab14d0676c, neither of those settings existed,
> so parsing the commit graph was a pure function of the commit graph's
> contents alone, and didn't rely on the existence of a repository.

Yeah, I have not done a great job keeping the fuzzers up to date with
commit-graph changes :(.

> We could pretend as if `commitGraph.generationVersion` is always "2" and
> `commitGraph.readChangedPaths` is always "true", and I think that would
> still give us good-enough coverage.

It might also be worthwhile for the fuzzer to test each interesting
combination of settings, using the same arbitrary input.

> I assume that libFuzzer doesn't give us a convenient way to stub out a
> repository. Maybe we should have a variant of `parse_commit_graph` that
> instead takes a `struct repo_settings` filled out with the defaults?
> 
> We could use that to teach libFuzzer about additional states that the
> commit graph parser can be in, too (though probably outside the scope of
> this patch).
> 
> I tried to sketch all of this out, which seems to work. Let me know what
> you think:

Yes, your patch looks like a much smaller change than I feared would be
the case for a parse_commit_graph refactor. I'll test it out, and
probably add some additional fuzzer improvements on top. Thanks for the
patch!

  reply	other threads:[~2022-03-28 19:16 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-23 18:03 [RFC PATCH] repo-settings: set defaults even when not in a repo Josh Steadmon
2022-03-23 19:22 ` Derrick Stolee
2022-03-23 19:52   ` Taylor Blau
2022-03-28 19:15     ` Josh Steadmon [this message]
2022-03-29  1:21       ` Taylor Blau
2022-03-28 19:53     ` Josh Steadmon
2022-03-29  1:22       ` Taylor Blau
2022-03-29  9:03     ` Ævar Arnfjörð Bjarmason
2022-03-30  2:26       ` Taylor Blau
2022-04-09  6:33         ` Josh Steadmon
2022-03-29  9:04     ` Ævar Arnfjörð Bjarmason
2022-03-30  2:34       ` Taylor Blau
2022-03-30 17:38         ` Ævar Arnfjörð Bjarmason
2022-03-30 20:14           ` Junio C Hamano
2022-04-09  6:52     ` [RFC PATCH v2] commit-graph: refactor to avoid prepare_repo_settings Josh Steadmon
2022-06-07 20:02       ` Jonathan Tan
2022-06-14 22:38         ` Josh Steadmon
2022-06-14 22:37     ` [PATCH v3] " Josh Steadmon
2022-06-14 23:32       ` Taylor Blau
2022-06-23 21:59       ` Junio C Hamano
2022-07-14 21:44         ` Josh Steadmon
2022-07-14 21:43     ` [PATCH v4] commit-graph: pass repo_settings instead of repository Josh Steadmon
2022-07-14 22:48       ` Junio C Hamano
2022-03-23 20:11 ` [RFC PATCH] repo-settings: set defaults even when not in a repo Victoria Dye
2022-03-23 20:54   ` Junio C Hamano
2022-03-23 21:19     ` Victoria Dye
2022-03-23 20:51 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YkIJacCaaqFk1MDa@google.com \
    --to=steadmon@google.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=lessleydennington@gmail.com \
    --cc=me@ttaylorr.com \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.