Git development

Git development
 help / color / mirror / Atom feed

* What's cooking in git.git (Jan 2023, #07; Mon, 30)
From: Junio C Hamano @ 2023-01-30 23:10 UTC (permalink / raw)
  To: git

Here are the topics that have been cooking in my tree.  Commits
prefixed with '+' are in 'next' (being in 'next' is a sign that a
topic is stable enough to be used and are candidate to be in a future
release).  Commits prefixed with '-' are only in 'seen', and aren't
considered "accepted" at all.  A topic without enough support may be
discarded after a long period of no activity.

Copies of the source code to Git live in many repositories, and the
following is a list of the ones I push into or their mirrors.  Some
repositories have only a subset of branches.

With maint, master, next, seen, todo:

	git://git.kernel.org/pub/scm/git/git.git/
	git://repo.or.cz/alt-git.git/
	https://kernel.googlesource.com/pub/scm/git/git/
	https://github.com/git/git/
	https://gitlab.com/git-vcs/git/

With all the integration branches and topics broken out:

	https://github.com/gitster/git/

Even though the preformatted documentation in HTML and man format
are not sources, they are published in these repositories for
convenience (replace "htmldocs" with "manpages" for the manual
pages):

	git://git.kernel.org/pub/scm/git/git-htmldocs.git/
	https://github.com/gitster/git-htmldocs.git/

Release tarballs are available at:

	https://www.kernel.org/pub/software/scm/git/

--------------------------------------------------
[Graduated to 'master']

* ab/cache-api-cleanup-users (2023-01-17) 3 commits
  (merged to 'next' on 2023-01-18 at c5a4374652)
 + treewide: always have a valid "index_state.repo" member
 + Merge branch 'ds/omit-trailing-hash-in-index' into ab/cache-api-cleanup-users
 + Merge branch 'ab/cache-api-cleanup' into ab/cache-api-cleanup-users

 Updates the users of the cache API.
 cf. <db312853-81a1-542b-db96-d816c463516c@github.com>
 source: <patch-1.1-b4998652822-20230117T135234Z-avarab@gmail.com>

* ar/markup-em-dash (2023-01-23) 1 commit
  (merged to 'next' on 2023-01-24 at 0367e3035f)
 + Documentation: render dash correctly

 Doc mark-up updates.
 source: <20230123090114.429844-1-rybak.a.v@gmail.com>

* cb/grep-pcre-ucp (2023-01-18) 1 commit
  (merged to 'next' on 2023-01-19 at 2c7e531839)
 + grep: correctly identify utf-8 characters with \{b,w} in -P

 "grep -P" learned to use Unicode Character Property to grok
 character classes when processing \b and \w etc.
 cf. <xmqqzgaf2zpt.fsf@gitster.g>
 source: <20230108155217.2817-1-carenas@gmail.com>

* cw/fetch-remote-group-with-duplication (2023-01-19) 1 commit
  (merged to 'next' on 2023-01-20 at 7f00e43209)
 + fetch: fix duplicate remote parallel fetch bug

 "git fetch <group>", when "<group>" of remotes lists the same
 remote twice, unnecessarily failed when parallel fetching was
 enabled, which has been corrected.
 source: <20230119220538.1522464-1-calvinwan@google.com>

* jc/doc-branch-update-checked-out-branch (2023-01-18) 1 commit
  (merged to 'next' on 2023-01-19 at 970900a232)
 + branch: document `-f` and linked worktree behaviour

 Document that "branch -f <branch>" disables only the safety to
 avoid recreating an existing branch.
 source: <xmqqa62f2dj1.fsf_-_@gitster.g>

* jc/doc-checkout-b (2023-01-19) 1 commit
  (merged to 'next' on 2023-01-23 at 95340e1941)
 + checkout: document -b/-B to highlight the differences from "git branch"

 Clarify how "checkout -b/-B" and "git branch [-f]" are similar but
 different in the documentation.
 source: <xmqqtu0m1m9i.fsf@gitster.g>

* jk/hash-object-fsck (2023-01-19) 7 commits
  (merged to 'next' on 2023-01-23 at 985e87fc34)
 + fsck: do not assume NUL-termination of buffers
 + hash-object: use fsck for object checks
 + fsck: provide a function to fsck buffer without object struct
 + t: use hash-object --literally when created malformed objects
 + t7030: stop using invalid tag name
 + t1006: stop using 0-padded timestamps
 + t1007: modernize malformed object tests

 "git hash-object" now checks that the resulting object is well
 formed with the same code as "git fsck".
 source: <Y8hX+pIZUKXsyYj5@coredump.intra.peff.net>
 source: <Y8ifa7hyqxSbL92U@coredump.intra.peff.net>

* jk/hash-object-literally-fd-leak (2023-01-19) 1 commit
  (merged to 'next' on 2023-01-19 at fff9b60a36)
 + hash-object: fix descriptor leak with --literally

 Leakfix.
 source: <Y8ijpJqtkDTi792i@coredump.intra.peff.net>

* km/send-email-with-v-reroll-count (2022-11-27) 1 commit
  (merged to 'next' on 2023-01-19 at 9b3543471c)
 + send-email: relay '-v N' to format-patch

 "git send-email -v 3" used to be expanded to "git send-email
 --validate 3" when the user meant to pass them down to
 "format-patch", which has been corrected.
 source: <87edtp5uws.fsf@kyleam.com>

* pb/branch-advice-recurse-submodules (2023-01-18) 1 commit
  (merged to 'next' on 2023-01-19 at 13747fc72d)
 + branch: improve advice when --recurse-submodules fails

 Improve advice message given when "git branch --recurse-submodules"
 fails.
 source: <pull.1464.git.1673890908453.gitgitgadget@gmail.com>

* po/pretty-format-columns-doc (2023-01-19) 5 commits
  (merged to 'next' on 2023-01-23 at d41cb5f527)
 + doc: pretty-formats note wide char limitations, and add tests
 + doc: pretty-formats describe use of ellipsis in truncation
 + doc: pretty-formats document negative column alignments
 + doc: pretty-formats: delineate `%<|(` parameter values
 + doc: pretty-formats: separate parameters from placeholders

 Clarify column-padding operators in the pretty format string.
 source: <20230119181827.1319-1-philipoakley@iee.email>

* sa/cat-file-mailmap--batch-check (2023-01-18) 1 commit
  (merged to 'next' on 2023-01-18 at 25ecb1dd3a)
 + git-cat-file.txt: fix list continuations rendering literally

 Docfix.
 source: <20230118082749.1252459-1-martin.agren@gmail.com>

* tb/t0003-invoke-dd-more-portably (2023-01-22) 1 commit
  (merged to 'next' on 2023-01-23 at 917aa24a27)
 + t0003: call dd with portable blocksize

 Test portability fix.
 source: <20230122062839.14542-1-tboegi@web.de>

--------------------------------------------------
[New Topics]

* ds/scalar-ignore-cron-error (2023-01-27) 3 commits
 - scalar: only warn when background maintenance fails
 - t921*: test scalar behavior starting maintenance
 - t: allow 'scalar' in test_must_fail

 Allow "scalar" to warn but continue when its periodic maintenance
 feature cannot be enabled.

 Will merge to 'next'.
 source: <pull.1473.git.1674849963.gitgitgadget@gmail.com>

* mh/doc-credential-cache-only-in-core (2023-01-29) 1 commit
  (merged to 'next' on 2023-01-30 at 021b5227af)
 + Documentation: clarify that cache forgets credentials if the system restarts

 Documentation clarification.

 Will merge to 'master'.
 source: <pull.1447.v3.git.1674936815117.gitgitgadget@gmail.com>

--------------------------------------------------
[Stalled]

* ja/worktree-orphan (2023-01-13) 4 commits
 - worktree add: add hint to direct users towards --orphan
 - worktree add: add --orphan flag
 - worktree add: refactor opt exclusion tests
 - worktree add: include -B in usage docs

 'git worktree add' learned how to create a worktree based on an
 orphaned branch with `--orphan`.

 Expecting a reroll.
 cf. <11be1b0e-ee38-119f-1d80-cb818946116b@dunelm.org.uk>
 source: <20230109173227.29264-1-jacobabel@nullpo.dev>

* ab/avoid-losing-exit-codes-in-tests (2022-12-20) 6 commits
 - tests: don't lose misc "git" exit codes
 - tests: don't lose "git" exit codes in "! ( git ... | grep )"
 - tests: don't lose exit status with "test <op> $(git ...)"
 - tests: don't lose exit status with "(cd ...; test <op> $(git ...))"
 - t/lib-patch-mode.sh: fix ignored exit codes
 - auto-crlf tests: don't lose exit code in loops and outside tests

 Test clean-up.

 Expecting a hopefully minor and final reroll.
 cf. <1182283a-4a78-3c99-e716-a8c3e58a5823@web.de>
 cf. <xmqqsfhb0vum.fsf@gitster.g>
 source: <cover-v4-0.6-00000000000-20221219T101240Z-avarab@gmail.com>

* tl/notes--blankline (2022-11-09) 5 commits
 - notes.c: introduce "--no-blank-line" option
 - notes.c: provide tips when target and append note are both empty
 - notes.c: drop unreachable code in 'append_edit()'
 - notes.c: cleanup for "designated init" and "char ptr init"
 - notes.c: cleanup 'strbuf_grow' call in 'append_edit'

 'git notes append' was taught '--[no-]blank-line' to conditionally
 add a LF between a new and existing note.

 Expecting a reroll.
 cf. <CAPig+cRcezSp4Rqt1Y9bD-FT6+7b0g9qHfbGRx65AOnw2FQXKg@mail.gmail.com>
 source: <cover.1667980450.git.dyroneteng@gmail.com>

* mc/switch-advice (2022-11-09) 1 commit
 - po: use `switch` over `checkout` in error message

 Use 'switch' instead of 'checkout' in an error message.

 Waiting for review response.
 source: <pull.1308.git.git.1668018620148.gitgitgadget@gmail.com>

* js/range-diff-mbox (2022-11-23) 1 commit
 - range-diff: support reading mbox files

 'git range-diff' gained support for reading either side from an .mbox
 file instead of a revision range.

 Waiting for review response.
 cf. <xmqqr0xupmnf.fsf@gitster.g>
 source: <pull.1420.v3.git.1669108102092.gitgitgadget@gmail.com>

* ab/tag-object-type-errors (2022-11-22) 5 commits
 - tag: don't emit potentially incorrect "object is a X, not a Y"
 - tag: don't misreport type of tagged objects in errors
 - object tests: add test for unexpected objects in tags
 - object-file.c: free the "t.tag" in check_tag()
 - Merge branch 'jk/parse-object-type-mismatch' into ab/tag-object-type-errors

 Hardening checks around mismatched object types when one of those
 objects is a tag.

 Expecting a reroll.
 cf. <xmqqzgb5jz5c.fsf@gitster.g>
 cf. <xmqqsfgxjugi.fsf@gitster.g>
 source: <cover-0.4-00000000000-20221118T113442Z-avarab@gmail.com>

* ab/config-multi-and-nonbool (2022-11-27) 9 commits
 - for-each-repo: with bad config, don't conflate <path> and <cmd>
 - config API: add "string" version of *_value_multi(), fix segfaults
 - config API users: test for *_get_value_multi() segfaults
 - for-each-repo: error on bad --config
 - config API: have *_multi() return an "int" and take a "dest"
 - versioncmp.c: refactor config reading next commit
 - config tests: add "NULL" tests for *_get_value_multi()
 - config tests: cover blind spots in git_die_config() tests
 - for-each-repo tests: test bad --config keys

 Assorted config API updates.

 Expecting a reroll.
 source: <cover-v3-0.9-00000000000-20221125T093158Z-avarab@gmail.com>

* ed/fsmonitor-inotify (2022-12-13) 6 commits
 - fsmonitor: update doc for Linux
 - fsmonitor: test updates
 - fsmonitor: enable fsmonitor for Linux
 - fsmonitor: implement filesystem change listener for Linux
 - fsmonitor: determine if filesystem is local or remote
 - fsmonitor: prepare to share code between Mac OS and Linux

 Bundled fsmonitor for Linux using inotify API.

 Needs review on the updated round.
 source: <pull.1352.v5.git.git.1670882286.gitgitgadget@gmail.com>

* jc/spell-id-in-both-caps-in-message-id (2022-12-17) 1 commit
 - e-mail workflow: Message-ID is spelled with ID in both capital letters

 Consistently spell "Message-ID" as such, not "Message-Id".

 Needs review.
 source: <xmqqsfhgnmqg.fsf@gitster.g>

* ad/test-record-count-when-harness-is-in-use (2022-12-25) 1 commit
 - test-lib: allow storing counts with test harnesses

 Allow summary results from tests to be written to t/test-results
 directory even when a test harness like 'prove' is in use.

 Needs review.
 source: <20221224225200.1027806-1-adam@dinwoodie.org>

* so/diff-merges-more (2022-12-18) 5 commits
 - diff-merges: improve --diff-merges documentation
 - diff-merges: issue warning on lone '-m' option
 - diff-merges: support list of values for --diff-merges
 - diff-merges: implement log.diffMerges-m-imply-p config
 - diff-merges: implement [no-]hide option and log.diffMergesHide config

 Assorted updates to "--diff-merges=X" option.

 May want to discard.  Breaking compatibility does not seem worth it.
 source: <20221217132955.108542-1-sorganov@gmail.com>

--------------------------------------------------
[Cooking]

* ab/hook-api-with-stdin (2023-01-23) 5 commits
 - hook: support a --to-stdin=<path> option for testing
 - sequencer: use the new hook API for the simpler "post-rewrite" call
 - hook API: support passing stdin to hooks, convert am's 'post-rewrite'
 - run-command: allow stdin for run_processes_parallel
 - run-command.c: remove dead assignment in while-loop

 Extend the run-hooks API to allow feeding data from the standard
 input when running the hook script(s).

 Expecting review responses.
 source: <cover-0.5-00000000000-20230123T170550Z-avarab@gmail.com>

* as/ssh-signing-improve-key-missing-error (2023-01-25) 1 commit
  (merged to 'next' on 2023-01-25 at 140f2c2c60)
 + ssh signing: better error message when key not in agent

 Improve the error message given when private key is not loaded in
 the ssh agent in the codepath to sign with an ssh key.

 Will merge to 'master'.
 source: <pull.1270.v3.git.git.1674650450662.gitgitgadget@gmail.com>

* en/rebase-incompatible-opts (2023-01-25) 10 commits
  (merged to 'next' on 2023-01-27 at 35a67cf2c6)
 + rebase: provide better error message for apply options vs. merge config
 + rebase: put rebase_options initialization in single place
 + rebase: fix formatting of rebase --reapply-cherry-picks option in docs
 + rebase: clarify the OPT_CMDMODE incompatibilities
 + rebase: add coverage of other incompatible options
 + rebase: fix incompatiblity checks for --[no-]reapply-cherry-picks
 + rebase: fix docs about incompatibilities with --root
 + rebase: remove --allow-empty-message from incompatible opts
 + rebase: flag --apply and --merge as incompatible
 + rebase: mark --update-refs as requiring the merge backend

 "git rebase" often ignored incompatible options instead of
 complaining, which has been corrected.

 Will merge to 'master'.
 Replaces en/rebase-update-refs-needs-merge-backend.
 source: <pull.1466.v5.git.1674619434.gitgitgadget@gmail.com>

* gm/request-pull-with-non-pgp-signed-tags (2023-01-25) 1 commit
  (merged to 'next' on 2023-01-30 at abc684d8df)
 + request-pull: filter out SSH/X.509 tag signatures

 Adjust "git request-pull" to strip embedded signature from signed
 tags to notice non-PGP signatures.

 Will merge to 'master'.
 source: <20230125234725.3918563-1-gwymor@tilde.club>

* cb/grep-fallback-failing-jit (2023-01-30) 2 commits
 - SQUASH???
 - grep: fall back to interpreter if JIT memory allocation fails

 In an environment where dynamically generated code is prohibited to
 run (e.g. SELinux), failure to JIT pcre patterns is expected.  Fall
 back to interpreted execution in such a case.

 Expecting a (hopefully final minor) reroll.
 cf. <xmqqlelj3hvk.fsf@gitster.g>
 source: <20230127154952.485913-1-minipli@grsecurity.net>

* cb/checkout-same-branch-twice (2023-01-20) 1 commit
 - checkout/switch: disallow checking out same branch in multiple worktrees

 "git checkout -B $branch" failed to protect against checking out
 a branch that is checked out elsewhere, unlike "git branch -f" did.

 Expecting a (hopefully final) reroll.
 cf. <8f24fc3c-c30f-dc70-5a94-5ee4ed3de102@dunelm.org.uk>
 source: <20230120113553.24655-1-carenas@gmail.com>

* ab/sequencer-unleak (2023-01-18) 8 commits
 - commit.c: free() revs.commit in get_fork_point()
 - builtin/rebase.c: free() "options.strategy_opts"
 - sequencer.c: always free() the "msgbuf" in do_pick_commit()
 - builtin/rebase.c: fix "options.onto_name" leak
 - builtin/revert.c: move free-ing of "revs" to replay_opts_release()
 - rebase & sequencer API: fix get_replay_opts() leak in "rebase"
 - sequencer.c: split up sequencer_remove_state()
 - rebase: use "cleanup" pattern in do_interactive_rebase()

 Plug leaks in sequencer subsystem and its users.

 Expecting a hopefully minor and final reroll.
 cf. <xmqqedry17r4.fsf@gitster.g>
 source: <cover-v3-0.8-00000000000-20230118T160600Z-avarab@gmail.com>

* jc/attr-doc-fix (2023-01-26) 1 commit
  (merged to 'next' on 2023-01-26 at cb327c4b5f)
 + attr: fix instructions on how to check attrs

 Comment fix.

 Will merge to 'master'.
 source: <pull.1441.v3.git.git.1674768107941.gitgitgadget@gmail.com>

* rj/avoid-switching-to-already-used-branch (2023-01-22) 3 commits
 - switch: reject if the branch is already checked out elsewhere (test)
 - rebase: refuse to switch to a branch already checked out elsewhere (test)
 - branch: fix die_if_checked_out() when ignore_current_worktree

 A few subcommands have been taught to stop users from working on a
 branch that is being used in another worktree linked to the same
 repository.

 Expecting a (hopefully final) reroll.
 cf. <d61a2393-64c8-da49-fe13-00bc4a52d5e3@gmail.com>
 source: <f7f45f54-9261-45ea-3399-8ba8dee6832b@gmail.com>

* rj/bisect-already-used-branch (2023-01-22) 1 commit
 - bisect: fix "reset" when branch is checked out elsewhere

 Allow "git bisect reset [name]" to check out the named branch (or
 the original one) even when the branch is already checked out in a
 different worktree linked to the same repository.

 Leaning negative. Why is it a good thing?
 cf. <xmqqo7qqovp1.fsf@gitster.g>
 source: <1c36c334-9f10-3859-c92f-3d889e226769@gmail.com>

* en/ls-files-doc-update (2023-01-13) 4 commits
  (merged to 'next' on 2023-01-27 at 20b9803add)
 + ls-files: guide folks to --exclude-standard over other --exclude* options
 + ls-files: clarify descriptions of status tags for -t
 + ls-files: clarify descriptions of file selection options
 + ls-files: add missing documentation for --resolve-undo option

 Doc update to ls-files.

 Will merge to 'master'.
 source: <pull.1463.git.1673584914.gitgitgadget@gmail.com>

* ms/send-email-feed-header-to-validate-hook (2023-01-19) 2 commits
 - send-email: expose header information to git-send-email's sendemail-validate hook
 - send-email: refactor header generation functions

 "git send-email" learned to give the e-mail headers to the validate
 hook by passing an extra argument from the command line.

 Expecting a (hopefully final) reroll.
 cf. <c1ba0a28-3c39-b313-2757-dceb02930334@amd.com>
 source: <20230120012459.920932-1-michael.strawbridge@amd.com>

* ds/bundle-uri-5 (2023-01-23) 10 commits
 - bundle-uri: test missing bundles with heuristic
 - bundle-uri: store fetch.bundleCreationToken
 - fetch: fetch from an external bundle URI
 - bundle-uri: drop bundle.flag from design doc
 - clone: set fetch.bundleURI if appropriate
 - bundle-uri: download in creationToken order
 - bundle-uri: parse bundle.<id>.creationToken values
 - bundle-uri: parse bundle.heuristic=creationToken
 - t5558: add tests for creationToken heuristic
 - bundle: optionally skip reachability walk

 The bundle-URI subsystem adds support for creation-token heuristics
 to help incremental fetches.

 Expecting a reroll.
 cf. <771a2993-85bd-0831-0977-24204f84e206@github.com>
 cf. <01f97aff-58a1-ef2c-e668-d37ea513c64e@github.com>
 cf. <ecc6b167-f5c4-48ce-3973-461d1659ed40@github.com>
 source: <pull.1454.v2.git.1674487310.gitgitgadget@gmail.com>

* tc/cat-file-z-use-cquote (2023-01-16) 1 commit
 - cat-file: quote-format name in error when using -z

 "cat-file" in the batch mode that is fed NUL-terminated pathnames
 learned to cquote them in its error output (otherwise, a funny
 pathname with LF in it would break the lines in the output stream).

 Expecting a reroll.
 cf. <2a2a46f0-a9bc-06a6-72e1-28800518777c@dunelm.org.uk>
 source: <20230116190749.4141516-1-toon@iotcl.com>

* cw/submodule-status-in-parallel (2023-01-17) 6 commits
 - submodule: call parallel code from serial status
 - diff-lib: parallelize run_diff_files for submodules
 - diff-lib: refactor match_stat_with_submodule
 - submodule: move status parsing into function
 - submodule: strbuf variable rename
 - run-command: add duplicate_output_fn to run_processes_parallel_opts

 "git submodule status" learned to run the comparison in submodule
 repositories in parallel.

 Expecting a reroll.
 cf. <CAFySSZBiW7=ZTmXRaLzCoKUi0Jd=fzvW5PJ6=Ka0jKHoP2ddSw@mail.gmail.com>
 cf. <kl6lo7qlvg4h.fsf@chooglen-macbookpro.roam.corp.google.com>
 source: <20230104215415.1083526-1-calvinwan@google.com>

* ab/various-leak-fixes (2023-01-18) 19 commits
 - push: free_refs() the "local_refs" in set_refspecs()
 - receive-pack: free() the "ref_name" in "struct command"
 - grep API: plug memory leaks by freeing "header_list"
 - grep.c: refactor free_grep_patterns()
 - object-file.c: release the "tag" in check_tag()
 - builtin/merge.c: free "&buf" on "Your local changes..." error
 - builtin/merge.c: use fixed strings, not "strbuf", fix leak
 - show-branch: free() allocated "head" before return
 - commit-graph: fix a parse_options_concat() leak
 - http-backend.c: fix cmd_main() memory leak, refactor reg{exec,free}()
 - http-backend.c: fix "dir" and "cmd_arg" leaks in cmd_main()
 - worktree: fix a trivial leak in prune_worktrees()
 - repack: fix leaks on error with "goto cleanup"
 - name-rev: don't xstrdup() an already dup'd string
 - various: add missing clear_pathspec(), fix leaks
 - clone: use free() instead of UNLEAK()
 - commit-graph: use free_commit_graph() instead of UNLEAK()
 - bundle.c: don't leak the "args" in the "struct child_process"
 - tests: mark tests as passing with SANITIZE=leak

 Leak fixes.

 Needs review.
 source: <cover-v5-00.19-00000000000-20230118T120334Z-avarab@gmail.com>

* rj/branch-unborn-in-other-worktrees (2023-01-19) 3 commits
 - branch: rename orphan branches in any worktree
 - branch: description for orphan branch errors
 - avoid unnecessary worktrees traversing

 Error messages given when working on an unborn branch that is
 checked out in another worktree have been improvved.

 Expecting a reroll.
 cf. <527f7315-be7b-7ec0-04fc-d07da7d4fefa@gmail.com>
 source: <34a58449-4f2e-66ef-ea01-119186aebd23@gmail.com>

* mc/credential-helper-auth-headers (2023-01-20) 12 commits
  (merged to 'next' on 2023-01-25 at cb95006bb2)
 + credential: add WWW-Authenticate header to cred requests
 + http: read HTTP WWW-Authenticate response headers
 + http: replace unsafe size_t multiplication with st_mult
 + test-http-server: add sending of arbitrary headers
 + test-http-server: add simple authentication
 + test-http-server: pass Git requests to http-backend
 + test-http-server: add HTTP request parsing
 + test-http-server: add HTTP error response function
 + test-http-server: add stub HTTP server test helper
 + daemon: rename some esoteric/laboured terminology
 + daemon: libify child process handling functions
 + daemon: libify socket setup and option functions

 Extending credential helper protocol.

 Will kick out of 'next'.  The test-only server is an eyesore.
 cf. <e57c1ca3-c21c-db41-a386-e5887f46055c@github.com>
 cf. <Y9JkMLueCwjkLHOr@coredump.intra.peff.net>
 source: <pull.1352.v7.git.1674252530.gitgitgadget@gmail.com>

--------------------------------------------------
[Discarded]

* jc/ci-deprecated-declarations-are-not-fatal (2023-01-14) 1 commit
  (merged to 'next' on 2023-01-14 at 5efb778ab0)
 + ci: do not die on deprecated-declarations warning

 CI build fix for overzealous -Werror.

 Reverted out of 'next'
 Preferring jk/curl-avoid-deprecated-api that fixes the code properly.
 source: <xmqq7cxpkpjp.fsf@gitster.g>

* po/pretty-hard-trunc (2022-11-13) 1 commit
 . pretty-formats: add hard truncation, without ellipsis, options

 Add a new pretty format which truncates without ellipsis.

 Superseded by the 'po/pretty-format-columns-doc' topic.
 source: <20221112143616.1429-1-philipoakley@iee.email>

* en/rebase-update-refs-needs-merge-backend (2023-01-22) 9 commits
  (merged to 'next' on 2023-01-23 at 1b65346647)
 + rebase: provide better error message for apply options vs. merge config
 + rebase: put rebase_options initialization in single place
 + rebase: fix formatting of rebase --reapply-cherry-picks option in docs
 + rebase: clarify the OPT_CMDMODE incompatibilities
 + rebase: add coverage of other incompatible options
 + rebase: fix docs about incompatibilities with --root
 + rebase: remove --allow-empty-message from incompatible opts
 + rebase: flag --apply and --merge as incompatible
 + rebase: mark --update-refs as requiring the merge backend

 The "--update-refs" feature of "git rebase" requires the use of the
 merge backend, while "--whitespace=fix" feature does not work with
 the said backend.  Notice the combination and error out, instead of
 silently ignoring one of the features requested.

 Reverted out of 'next' to be replaced with en/rebase-incompatible-opts
 source: <pull.1466.v4.git.1674367961.gitgitgadget@gmail.com>

* rs/tree-parse-mode-overflow-check (2023-01-21) 1 commit
 . tree-walk: disallow overflowing modes

 Reject tree objects with entries whose mode bits are overly wide.

 Retracted.
 cf. <b4b48877-5b80-e96f-d09f-2fe275f42950@web.de>
 source: <d673fde7-7eb2-6306-86b6-1c1a4c988ee8@web.de>

* cc/filtered-repack (2022-12-25) 3 commits
 . gc: add gc.repackFilter config option
 . repack: add --filter=<filter-spec> option
 . pack-objects: allow --filter without --stdout

 "git repack" learns to discard objects that ought to be retrievable
 again from the promisor remote.

 May want to discard.  Its jaggy edges may be a bit too sharp.
 cf. <Y7WTv19aqiFCU8au@ncase>
 source: <20221221040446.2860985-1-christian.couder@gmail.com>

^ permalink raw reply

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Ramsay Jones @ 2023-01-30 22:30 UTC (permalink / raw)
  To: Junio C Hamano, Mathias Krause
  Cc: git, Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón
In-Reply-To: <xmqqk0131zxi.fsf@gitster.g>



On 30/01/2023 21:21, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
>> Having said all that, I do not mind queuing v2 if the "use *NO_JIT
>> to disable" is added to the message to help users who are forced to
>> redo the query.
> 
> In the meantime, here is what I plan to apply on top of v2 while
> queuing it.  The message given to die() should lack the terminating
> LF, and the overlong line can and should be split at operator
> boundary.
> 
> Thanks.
> 
>  grep.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git c/grep.c w/grep.c
> index 59afc3f07f..42f184bd09 100644
> --- c/grep.c
> +++ w/grep.c
> @@ -357,7 +357,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
>  			p->pcre2_jit_on = 0;
>  			return;
>  		} else if (jitret) {
> -			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
> +			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'%s",
> +			    p->pattern, jitret,
> +			    pcre2_jit_functional() 
> +			    ? "\nPerhaps prefix (*NO_GIT) to your pattern?" 

s/NO_GIT/NO_JIT/ ? :)

ATB,
Ramsay Jones

> +			    : "");
>  		}
>  
>  		/*

^ permalink raw reply

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Junio C Hamano @ 2023-01-30 21:21 UTC (permalink / raw)
  To: Mathias Krause
  Cc: git, Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón
In-Reply-To: <xmqqlelj3hvk.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> Having said all that, I do not mind queuing v2 if the "use *NO_JIT
> to disable" is added to the message to help users who are forced to
> redo the query.

In the meantime, here is what I plan to apply on top of v2 while
queuing it.  The message given to die() should lack the terminating
LF, and the overlong line can and should be split at operator
boundary.

Thanks.

 grep.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git c/grep.c w/grep.c
index 59afc3f07f..42f184bd09 100644
--- c/grep.c
+++ w/grep.c
@@ -357,7 +357,11 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 			p->pcre2_jit_on = 0;
 			return;
 		} else if (jitret) {
-			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'\n", p->pattern, jitret);
+			die("Couldn't JIT the PCRE2 pattern '%s', got '%d'%s",
+			    p->pattern, jitret,
+			    pcre2_jit_functional() 
+			    ? "\nPerhaps prefix (*NO_GIT) to your pattern?" 
+			    : "");
 		}
 
 		/*

^ permalink raw reply related

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Junio C Hamano @ 2023-01-30 20:08 UTC (permalink / raw)
  To: Mathias Krause
  Cc: git, Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón
In-Reply-To: <xmqq357r4zvk.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> If we were to keep that "die", it is absolutely required, I would
> think.  Users who got their Git with JIT-enabled pcre2 may be
> viewing JIT merely as "a clever optimization the implementation is
> allowed to use when able", without knowing and more importantly
> without wanting to know how to disable it from within their
> patterns.
>
> But can't we drop that die() if we took the v1 route?

Having said all that, I do not mind queuing v2 if the "use *NO_JIT
to disable" is added to the message to help users who are forced to
redo the query.

And in practice, it shouldn't make that much difference, because the
only scenario (other than the SELinux-like situation where JIT is
compiled in but does not work at all) that the difference may matter
would happen when a non-trivial portion of the patterns users use
are not workable with JIT, but if that were the case, we would have
written JIT off as not mature enough and not yet usable long time
ago.  So, in practice, patterns refused by JIT would be a very tiny
minority to matter in real life, and "failing fast to inconvenience
users" would not be too bad.

So while I still think v1's simplicity is the right thing to have
here, I think it is waste of our braincell to compare v1 vs v2.  As
v2 gives smaller incremental behaviour change perceived by end
users, if somebody really wanted to, I'd expect that a low-hanging
fruit #leftoverbit on top of such a patch, after the dust settles,
would be to

 (1) rename pcre2_jit_functional() to fall_back_to_interpreter() or
     something,

 (2) add a configuration variable to tell fall_back_to_interpreter()
     that any form of JIT error is allowed to fall back to
     interpreter().

and such a patch will essentially give back the simplicity of v1 to
folks who opt into the configuration.

Thanks.

^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Derrick Stolee @ 2023-01-30 19:25 UTC (permalink / raw)
  To: Victoria Dye, Junio C Hamano, Derrick Stolee via GitGitGadget; +Cc: git
In-Reply-To: <3ade6d9f-8477-40c2-d683-d629e863c6ab@github.com>

On 1/27/2023 5:18 PM, Victoria Dye wrote:
> Junio C Hamano wrote:
>> "Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

This reply almost got lost in the shuffle, but mostly because it
wasn't super-relevant if we were going with the --no-maintenance
option. It's relevant again, so I wanted to point something out.

> I see Stolee's approach as more in line with how FSMonitor is treated by
> 'scalar', which is "only turn it on if it's supported, but otherwise do
> nothing" (the main difference here being that a warning is displayed if
> maintenance can't be turned on). That still fits the stated goal of 'scalar'
> ("configure all the niche large-repo settings for me when I
> clone/register"), but it makes 'scalar' more forgiving of system
> configurations that don't support maintenance.
> 
> That said, this doesn't distinguish between "maintenance couldn't be turned
> on because the system doesn't support it" and "maintenance couldn't be
> turned on because of an internal error". The latter might still be worth
> failing on, so maybe something like this would be more palatable?
> 
> --------8<--------8<--------8<--------
> diff --git a/scalar.c b/scalar.c
> index 6c52243cdf1..138780abe5f 100644
> --- a/scalar.c
> +++ b/scalar.c
> @@ -252,6 +252,10 @@ static int stop_fsmonitor_daemon(void)
>  	return 0;
>  }
>  
> +static int have_maintenance_support(void) {
> +	/* check whether at least one scheduler is supported on the system */
> +}
> +
>  static int register_dir(void)
>  {
>  	if (add_or_remove_enlistment(1))
> @@ -260,7 +264,7 @@ static int register_dir(void)
>  	if (set_recommended_config(0))
>  		return error(_("could not set recommended config"));
>  
> -	if (toggle_maintenance(1))
> +	if (have_maintenance_support() && toggle_maintenance(1))
>  		return error(_("could not turn on maintenance"));
>  
>  	if (have_fsmonitor_support() && start_fsmonitor_daemon()) {

The tricky thing about this is that have_fsmonitor_support() is
something we can detect at compile time, while have_maintenance_support()
would not (unless we start building for a new platform).

The case that brought this up is a platform that has both 'crontab'
and 'systemctl' on the PATH, but when executing those commands an
error occurs due to permissions.

So, if we wanted to distinguish between permissions issues and/or
other unrelated failures, we would need to be able to parse the
output of those commands. That seems fraught with potential errors,
so it seems like we should _attempt_ to enable maintenance and warn
with whatever failure is presented.

The only thing I could think is that we could define a custom exit
code within 'git maintenance start' that means "we were able to
start the scheduler process, but it failed" to differentiate from
other kinds of failures, such as failing to write to global config.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v2 05/10] bundle-uri: download in creationToken order
From: Derrick Stolee @ 2023-01-30 19:12 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Victoria Dye, Derrick Stolee via GitGitGadget, git, me, avarab,
	steadmon, chooglen
In-Reply-To: <xmqqtu073kx1.fsf@gitster.g>

On 1/30/2023 2:02 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>> I think there are two possible directions we can have when talking
>> about interrupted downloads:
>>
>> 1. The network connection was disconnected, and the client may want
>>    to respond to that with a retry and a ranged request.
>>
>> 2. The client process itself terminates for some reason, and a
>>    second process recognizes that some of the data already exists
>>    and could be used for a range request of the remainder.
... 
> Mostly.  We probably do not want / need to cater to "I killed it
> with ^C and would want to continue".

I mention it because it has been mentioned to me as an example use
case. This includes other failures, such as power outages.

I don't think we have range requests enabled for the longer-lived
packfile-uri, but we shall see if adoption of bundle URIs presents
more motivation to build that feature. Case (1) will be the easiest
to consider, and I agree that it would be more generally useful.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Derrick Stolee @ 2023-01-30 19:06 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye; +Cc: Derrick Stolee via GitGitGadget, git
In-Reply-To: <xmqqy1pj3l3k.fsf@gitster.g>

On 1/30/2023 1:58 PM, Junio C Hamano wrote:
> Victoria Dye <vdye@github.com> writes:
> 
>> I'm also still worried about cluttering scalar's UX with options that toggle
>> use of its internally-configured options and features. One of the big
>> selling points for including scalar in the upstream project ([2], [3]) was
>> its ability to "intelligently" configure all of the settings a user would
>> need to optimize a large repository *without* a user needing to know what
>> any of those options are/what they mean. These settings are inherently
>> subject to change (due to use of experimental features); exposing a feature
>> toggle entrenches that setting permanently within scalar and makes a user
>> aware of implementation details that were intended to be hidden. At a high
>> level, it pushes scalar towards simply being an "opinionated" 'git
>> config'-configurator, which was a model I explicitly tried to move away from
>> while upstreaming last year.  
> 
> I personally do not think "opinionated configurator" is a bad model
> at all.  And "this does not seem to work here, so let's silently
> disable it, as the user does not want to hear about minute details"
> is a valid opinion to have for such a tool.
> 
> I too share the aversion to command line option for this one.
> Disabled periodic task support is most likely system-wide, and
> passing --no-whatever every time you touch a new repository on the
> same system does not make much sense.

Thanks, both. v2 will include --no-src, but not --no-maintenance.

-Stolee

^ permalink raw reply

* Re: [PATCH v2 05/10] bundle-uri: download in creationToken order
From: Junio C Hamano @ 2023-01-30 19:02 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Victoria Dye, Derrick Stolee via GitGitGadget, git, me, avarab,
	steadmon, chooglen
In-Reply-To: <07c4658e-89dd-0f82-77e9-e7c443f747cd@github.com>

Derrick Stolee <derrickstolee@github.com> writes:

> I think there are two possible directions we can have when talking
> about interrupted downloads:
>
> 1. The network connection was disconnected, and the client may want
>    to respond to that with a retry and a ranged request.
>
> 2. The client process itself terminates for some reason, and a
>    second process recognizes that some of the data already exists
>    and could be used for a range request of the remainder.
>
> I think both of these would not be handled at this layer, but
> instead further down, inside fetch_bundle_uri_internal()
> (specifically further down in download_https_uri_to_file()).
>
> Any retry logic should happen there, closer to the connection,
> and at the layer of the current patch, we should assume that any
> retry logic that was attempted ended up failing in the end.
>
> Does that satisfy your concerns here?

Mostly.  We probably do not want / need to cater to "I killed it
with ^C and would want to continue".

Thanks.

^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Junio C Hamano @ 2023-01-30 18:58 UTC (permalink / raw)
  To: Victoria Dye; +Cc: Derrick Stolee, Derrick Stolee via GitGitGadget, git
In-Reply-To: <db04e31d-681f-5809-f51a-37b9c734b45a@github.com>

Victoria Dye <vdye@github.com> writes:

> I'm also still worried about cluttering scalar's UX with options that toggle
> use of its internally-configured options and features. One of the big
> selling points for including scalar in the upstream project ([2], [3]) was
> its ability to "intelligently" configure all of the settings a user would
> need to optimize a large repository *without* a user needing to know what
> any of those options are/what they mean. These settings are inherently
> subject to change (due to use of experimental features); exposing a feature
> toggle entrenches that setting permanently within scalar and makes a user
> aware of implementation details that were intended to be hidden. At a high
> level, it pushes scalar towards simply being an "opinionated" 'git
> config'-configurator, which was a model I explicitly tried to move away from
> while upstreaming last year.  

I personally do not think "opinionated configurator" is a bad model
at all.  And "this does not seem to work here, so let's silently
disable it, as the user does not want to hear about minute details"
is a valid opinion to have for such a tool.

I too share the aversion to command line option for this one.
Disabled periodic task support is most likely system-wide, and
passing --no-whatever every time you touch a new repository on the
same system does not make much sense.

Thanks.

^ permalink raw reply

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Junio C Hamano @ 2023-01-30 18:54 UTC (permalink / raw)
  To: Mathias Krause
  Cc: git, Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón
In-Reply-To: <9b5a1113-84f1-1651-bffc-6382462057dd@grsecurity.net>

Mathias Krause <minipli@grsecurity.net> writes:

> My patch doesn't make it worse than what 'git grep' would currently be
> doing. On the contrary, actually. It allows me to use PaX's MPROTECT and
> have a functional 'git grep' as well.

I know.  But then without the "why did it fail?" logic (i.e. the v1
patch), it does not make it worse than the current code, either, and
of course allows you to use JIT-enabled pcre2 even where JIT is
impossible due to MPROTECT and whatever reasson.

> Maybe the missing piece here is simply something like below to make
> users more aware of the possibility to disable the JIT for the more
> complex cases?:

If we were to keep that "die", it is absolutely required, I would
think.  Users who got their Git with JIT-enabled pcre2 may be
viewing JIT merely as "a clever optimization the implementation is
allowed to use when able", without knowing and more importantly
without wanting to know how to disable it from within their
patterns.

But can't we drop that die() if we took the v1 route?

> diff --git a/grep.c b/grep.c
> index 59afc3f07fc9..1422f168b087 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -357,7 +357,8 @@ static void compile_pcre2_pattern(struct grep_pat
> *p, const struct grep_opt *opt
>                         p->pcre2_jit_on = 0;
>                         return;
>                 } else if (jitret) {
> -                       die("Couldn't JIT the PCRE2 pattern '%s', got
> '%d'\n", p->pattern, jitret);
> +                       die("Couldn't JIT the PCRE2 pattern '%s', got
> '%d'%s\n", p->pattern, jitret,
> +                           pcre2_jit_functional() ? "\nYou might retry
> by prefixing the pattern with '(*NO_JIT)'" : "");
>                 }
>
>                 /*
>
> (Sorry about the wrapped lines, my mailer is just broken. I'll make it a
> proper patch, if such functionality is indeed wanted.)
>
> Thanks,
> Mathias

^ permalink raw reply

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Junio C Hamano @ 2023-01-30 18:49 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Mathias Krause, git, Carlo Marcelo Arenas Belón
In-Reply-To: <230130.867cx4s2j4.gmgdl@evledraar.gmail.com>

Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> If I compile libpcre2 with JIT support I'm expecting Git to use that,
> and not fall back in those cases where the JIT engine would give up.

The thing is, the reason why their Git has JIT enabled pcre2 for
many users is not because they choose to compile their own Git for
themselves because they wanted to play with JIT.  To them, their
distro and/or their employer gave a precompiled Git, in the hope
that with JIT would be faster than without JIT when JIT is usable.

In that context, "Speed is a feature in itself" is correct but
"failing fast, forcing the user to try different things" is not a
"Speed" feature at all.  It may be interesting only for those who
are curious to see what pattern was rejected by JIT.  It is
especially true as (1) we are willing to fall back to interpreter in
the SELinux senario, and (2) for normal users who want to use Git,
and not necessarily interested in playing with JIT, there is no
other recourse than prefixing "I do not want this JITted" to their
pattern ANYWAY.  Why fail fast and force the user to take the only
recourse manually, when the machinery already knows what the user's
only viable alternative is (i.e. falling back to the interpreter)?

> Pathological regexes are pretty much only interesting to anyone in the
> context of DoS attacks where they're being used to cause intentional
> slowdowns.

Exactly.

> Here we're discussing an orthagonal case where the "JIT fails", but
> rather than some pathological pattern it's because SELinux has made it
> not work at runtime, and we're trying to tease the two cases apart.

s/and we're/but you're/.  And I do not think you want to.

> I don't think this is plausible at all per the above, and that we
> shouldn't harm realistic use-cases to satisfy hypothetical ones.

To me, what you are advocating is exactly the hypothetical ones that
harm end-users who did not choose to enable JIT themselves.  When JIT
fails for whatever reason (including the SELinux senario) for them,
they do not need to be told by Git failing, when the interpreter can
give them the correct answer.  Wanting to see the result of the
operation they asked Git to do, while allowing Git to use clever
optimizations WHEN ABLE, is what I see as realistic use-cases.

^ permalink raw reply

* Re: [PATCH v2 10/10] bundle-uri: test missing bundles with heuristic
From: Derrick Stolee @ 2023-01-30 18:47 UTC (permalink / raw)
  To: Victoria Dye, Derrick Stolee via GitGitGadget, git
  Cc: gitster, me, avarab, steadmon, chooglen
In-Reply-To: <b3f2992d-f614-5cc7-f606-d3607a154685@github.com>

On 1/27/2023 2:21 PM, Victoria Dye wrote:
> Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>

>> +	# Only base bundle unbundled.
>> +	git -C download-2 for-each-ref --format="%(refname)" "refs/bundles/*" >refs &&
>> +	cat >expect <<-EOF &&
>> +	refs/bundles/base
>> +	refs/bundles/right
>> +	EOF
>> +	test_cmp expect refs &&
> 
> Maybe I'm misreading, but I don't think the comment ("Only base bundle
> unbundled") lines up with the expected bundle refs (both bundle-1
> ('refs/bundles/base') and bundle-3 ('refs/bundles/right') seem to be
> unbundled). 

>> +	# All bundles failed to unbundle
>> +	git -C download-3 for-each-ref --format="%(refname)" "refs/bundles/*" >refs &&
>> +	cat >expect <<-EOF &&
>> +	refs/bundles/base
>> +	refs/bundles/left
>> +	refs/bundles/right
>> +	EOF
>> +	test_cmp expect refs
> 
> Similar issue with the comment here - it says that all bundles *failed* to
> unbundle, but the test case description ("Case 3: top bundle does not exist,
> rest unbundle fine.") and the result show bundle-1, bundle-2, and bundle-3
> all unbundling successfully.

Thank you for reading carefully. I'm sorry about not updating
the comments after copying these checks around the test file.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v2 05/10] bundle-uri: download in creationToken order
From: Derrick Stolee @ 2023-01-30 18:43 UTC (permalink / raw)
  To: Junio C Hamano, Victoria Dye
  Cc: Derrick Stolee via GitGitGadget, git, me, avarab, steadmon,
	chooglen
In-Reply-To: <xmqqpmaz93k3.fsf@gitster.g>

On 1/27/2023 2:32 PM, Junio C Hamano wrote:
> Victoria Dye <vdye@github.com> writes:
> 
>>> +			/*
>>> +			 * Not downloaded yet. Try downloading.
>>> +			 *
>>> +			 * Note that bundle->file is non-NULL if a download
>>> +			 * was attempted, even if it failed to download.
>>> +			 */
>>> +			if (fetch_bundle_uri_internal(ctx.r, bundle, ctx.depth + 1, ctx.list)) {
>>> +				/* Mark as unbundled so we do not retry. */
>>> +				bundle->unbundled = 1;
>>
>> This implicitly shows that, unlike a failed unbundling, a failed download is
>> always erroneous behavior, with the added benefit of avoiding (potentially
>> expensive) download re-attempts.
> 
> Hmph, I somehow was hoping that we'd allow an option to use range
> requests to resume an interrupted download in the future, so
> outright "always avoid attempts to download again" may not be what
> we want in the longer run.  But being able to tell if download
> failed (and there will probably be more than "success/failure" bit,
> but something like "we got an explicit 401 not found" vs "we were
> disconnected after downloading a few megabytes"), and unbundling
> failed (where there is no point attempting) is a good idea.

I think there are two possible directions we can have when talking
about interrupted downloads:

1. The network connection was disconnected, and the client may want
   to respond to that with a retry and a ranged request.

2. The client process itself terminates for some reason, and a
   second process recognizes that some of the data already exists
   and could be used for a range request of the remainder.

I think both of these would not be handled at this layer, but
instead further down, inside fetch_bundle_uri_internal()
(specifically further down in download_https_uri_to_file()).

Any retry logic should happen there, closer to the connection,
and at the layer of the current patch, we should assume that any
retry logic that was attempted ended up failing in the end.

Does that satisfy your concerns here?

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Victoria Dye @ 2023-01-30 17:42 UTC (permalink / raw)
  To: Derrick Stolee, Junio C Hamano; +Cc: Derrick Stolee via GitGitGadget, git
In-Reply-To: <b63611dc-a889-8900-403a-ec7c42a89705@github.com>

Derrick Stolee wrote:
> On 1/27/2023 7:32 PM, Junio C Hamano wrote:
>> Derrick Stolee <derrickstolee@github.com> writes:
>>
>>>>  The "maintain
>>>> their clone" certainly should include running periodic maintenance
>>>> tasks without them having to worry about it.  It feels like this is
>>>> calling for an explicit "disable periodic maintenance tasks in this
>>>> repository" option to help these esoteric environments that disable
>>>> cron-like system services, while keeping the default safer,
>>>> i.e. fail loudly when the periodic maintenance tasks that the users
>>>> expect to happen cannot be enabled, or something.
>>>>
>>>> Perhaps I am not the primary audience, but hmph, I have a feeling
>>>> that this is not exactly going into a healthy direction.
>>>
>>> Here, we are in an environment where background maintenance is
>>> unavailable in an unexpected way. If that feature is not available
>>> to the user, should they not get the benefits of the others?
>>
>> That is not what I was saying.  I just have expected to see a way
>> for the user to give scalar an explicit "I understand that periodic
>> maintenance does not happen in this repository" consent, instead of
>> demoting an error detection for everybody to a warning that users
>> will just ignore.
> 
> Ah, so you'd prefer a --no-maintenance option for users who have
> this problem instead of just a warning. I'll do that in v2.

I mentioned this earlier [1], but I want to reiterate that I really don't
think a dedicated '--no-maintenance' option is a good approach to this
problem. I understand wanting more active user acknowledgement that "I
understand that periodic maintenance does not happen in this repository";
without that, users may (rightfully) be confused when they find their
scalar-cloned repository full of loose objects. But, in the use case you've
presented (where no scheduler is available), the user would need to -
somewhat redundantly, I feel - acknowledge that for *every* repository they
clone. 

I'm also still worried about cluttering scalar's UX with options that toggle
use of its internally-configured options and features. One of the big
selling points for including scalar in the upstream project ([2], [3]) was
its ability to "intelligently" configure all of the settings a user would
need to optimize a large repository *without* a user needing to know what
any of those options are/what they mean. These settings are inherently
subject to change (due to use of experimental features); exposing a feature
toggle entrenches that setting permanently within scalar and makes a user
aware of implementation details that were intended to be hidden. At a high
level, it pushes scalar towards simply being an "opinionated" 'git
config'-configurator, which was a model I explicitly tried to move away from
while upstreaming last year.  

I still believe treating maintenance like FSMonitor - pre-determining
whether the feature is available and only enabling it if possible - is the
most consistent and user-friendly solution to the given problem within the
context of scalar. But, if you feel that user acknowledgement is absolutely
critical, I'd strongly prefer a config setting like 'maintenance.enabled';
it could be set globally (the appropriate scope in the case of a system that
has no scheduler), or with '-c' with Scalar clone if it really needs to be
per-repo.

[1] https://lore.kernel.org/git/3ade6d9f-8477-40c2-d683-d629e863c6ab@github.com/
[2] https://lore.kernel.org/git/pull.1005.git.1630359290.gitgitgadget@gmail.com/
[3] https://lore.kernel.org/git/pull.1275.git.1656521925.gitgitgadget@gmail.com/

> 
> This could be a good time for me to upstream the --no-src option
> while I'm messing with arguments in 'scalar clone'.

For what it's worth, my concerns about option clutter don't really apply to
'--no-src' (cloning directly into a given directory, rather than
'<directory>/src'). Unlike features like FSMonitor and maintenance, the
'src/' directory is a user-facing Scalar design decision. It's also
something that seems to exist primarily for backward-compatibility reasons
(if I'm interpreting your earlier comments [4] correctly). This could be a
step on a deprecation path to make '--no-src' the default and remove the
legacy enlistment structure? At the very least, it's sufficiently outside
scalar's "configure for a large repo" scope for me to not worry about it
setting a bad precedent.

[4] https://lore.kernel.org/git/82716e5b-3522-68f5-7479-1b39811e0cb2@github.com/

> 
> Thanks,
> -Stolee

^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Junio C Hamano @ 2023-01-30 15:40 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Derrick Stolee via GitGitGadget, git, vdye
In-Reply-To: <b63611dc-a889-8900-403a-ec7c42a89705@github.com>

Derrick Stolee <derrickstolee@github.com> writes:

>>> Here, we are in an environment where background maintenance is
>>> unavailable in an unexpected way. If that feature is not available
>>> to the user, should they not get the benefits of the others?
>> 
>> That is not what I was saying.  I just have expected to see a way
>> for the user to give scalar an explicit "I understand that periodic
>> maintenance does not happen in this repository" consent, instead of
>> demoting an error detection for everybody to a warning that users
>> will just ignore.
>
> Ah, so you'd prefer a --no-maintenance option for users who have
> this problem instead of just a warning. I'll do that in v2.

Or a repository-local configuration to declare "no need to do the
maintenance stuff here", probably?  The expected use case you gave
does not match per-invocation command line option very well, right?

> This could be a good time for me to upstream the --no-src option
> while I'm messing with arguments in 'scalar clone'.

OK.  Thanks.


^ permalink raw reply

* Re: [PATCH v2] ls-tree: add --sparse-filter-oid argument
From: William Sprent @ 2023-01-30 15:28 UTC (permalink / raw)
  To: Elijah Newren
  Cc: William Sprent via GitGitGadget, git, Eric Sunshine,
	Ævar Arnfjörð Bjarmason, Derrick Stolee,
	Victoria Dye
In-Reply-To: <CABPp-BGs2wG6a3oR8dmT9dkeakoiZ+w-Tf=4A-GXeDVkJ9QNMA@mail.gmail.com>

On 28/01/2023 17.45, Elijah Newren wrote:
> On Fri, Jan 27, 2023 at 3:59 AM William Sprent <williams@unity3d.com> wrote:
>>
>> On 26/01/2023 04.25, Elijah Newren wrote:
>>> On Wed, Jan 25, 2023 at 8:16 AM William Sprent <williams@unity3d.com> wrote:
>>>>
>>>> On 25/01/2023 06.11, Elijah Newren wrote:
>>>>> It looks like Ævar and Victoria have both given really good reviews
>>>>> already, but I think I spotted some additional things to comment on.
>>>>>
>>>>> On Mon, Jan 23, 2023 at 3:46 AM William Sprent via GitGitGadget
>>>>> <gitgitgadget@gmail.com> wrote:
>>>>>>
>>>>>> From: William Sprent <williams@unity3d.com>
>>>>>>
>>>>>> There is currently no way to ask git the question "which files would be
>>>>>> part of a sparse checkout of commit X with sparse checkout patterns Y".
>>>>>> One use-case would be that tooling may want know whether sparse checkouts
>>>>>> of two commits contain the same content even if the full trees differ.
>>>>>
>>>>> Could you say more about this usecase?  Why does tooling need or want
>>>>> to know this; won't a checkout of the new commit end up being quick
>>>>> and simple?  (I'm not saying your usecase is bad, just curious that it
>>>>> had never occurred to me, and I'm afraid I'm still not sure what your
>>>>> purpose might be.)
>>>>>
>>>>
>>>> I'm thinking mainly about a monorepo context where there are a number of
>>>> distinct 'units' that can be described with sparse checkout patterns.
>>>> And perhaps there's some tooling that only wants to perform an action if
>>>> the content of a 'unit' changes.
>>>
>>> So, you're basically wanting to do something like
>>>      git ls-tree --paths-matching-sparsity-file=<pattern-file> $COMMIT1
>>>> sparse-files
>>>      git ls-tree --paths-matching-sparsity-file=<pattern-file> $COMMIT2
>>>>> sparse-files
>>>      sort sparse-files | uniq >relevant-files
>>>      git diff --name-only $COMMIT1 $COMMIT2 >changed-files
>>> and then checking whether relevant-files and changed-files have a
>>> non-empty intersection?
>>
>> Well, the concrete use-case I'm exploring is something along the lines
>> of using the content hashes of sparse checkouts as cache keys for resource
>> heavy jobs (builds/tests/etc).
>>
>> So, that would be something along the lines of,
>>
>>       git ls-tree -r --paths-matching-sparsity-file=<pattern-file> \
>>       | sha1sum > cache-key
>>
>> and then performing a lookup before performing an action (which would
>> then only be done in the context of the sparse checkout). My thinking
>> is that this only would require git and no additional tooling, which in
>> turn makes it very easy to reproduce the state where the job took place.
>>
>>
>>> Would that potentially be better handled by
>>>      git diff --name-only $COMMIT1 $COMMIT2 | git check-ignore
>>> --ignore-file=<pattern-file> --stdin
>>> and seeing whether the output is non-empty?  We'd have to add an
>>> "--ignore-file" option to check-ignore to override reading of
>>> .gitignore files and such, and it'd be slightly confusing because the
>>> documentation talks about "ignored" files rather than "selected"
>>> files, but that's a confusion point that has been with us ever since
>>> the gitignore mechanism was repurposed for sparse checkouts.  Or maybe
>>> we could just add a check-sparsity helper, and then allow it to take
>>> directories in-lieu of patterns.
>>
>> I don't think it necessarily would be better handled by that. But it would
>> be workable. It would be a matter of collating the output of
>>
>>     git ls-tree -r <commit>
>>
>> with
>>
>>     git ls-tree --name-only -r <commit> | git check-ignore ...
>>
>> Which is less ergonomic. But it is also a less intrusive change.
>>
>> Really, the main thing is to expose the sparse filtering logic somehow, and
>> allow for building tooling on top of it.
>>> This seems nicer than opening a can of worms about letting every git
>>> command specify a different set of sparsity rules.
>>
>> I think you are the better judge of how much of a can of worms that would
>> be. I don't think it would be too out of line with how git acts in general
>> though, as we have things like the the 'GIT_INDEX_FILE' env-var.
> 
> I agree you've got a reasonable usecase here.  The integration you've
> described sounds like something that could be independently composable
> with several other commands, and you alluded to that in your commit
> message.  But is adding it to ls-tree the best spot?  If we add it
> there, then folks who want to similarly integrate such capabilities
> with other commands (archive, diff, grep, rev-list --objects, etc.),
> seem more likely to do so via direct integration.  We already have a
> very large can of worms to work on to make commands behave in ways
> that are limited to sparse paths (see
> Documentation/techncial/sparse-checkout.txt, namely the "behavior A"
> stuff).  As can be seen in that document, what to do for limiting
> commands to sparse paths is obvious with some commands but has lots of
> interesting edge cases for others (even with years of experience with
> sparse checkouts, we had 3- and 4- way differing opinions on the
> intended behavior for some commands when we started composing that
> document a few months ago).  If we had jumped straight to
> implementation for some commands, we would have likely painted
> ourselves into a corner for other commands.  Adding another layer of
> specifying an alternate set of sparsity rules will likely have
> interesting knock-on effects that we should think through for all the
> commands to ensure we aren't painting ourselves into a similar corner,
> if we go down this route.
> 
> However, in the cases that sparse-checkout.txt document was
> addressing, the behavior fundamentally needs to be integrated into all
> the relevant commands to get user experience right.  In your case, you
> merely need a separate tool to be able to compose the output of
> different commands together.  So, exposing whether sparsity rules
> would select various paths in a single separate command (maybe git
> check-ignore, or a new sparse-checkout subcommand, or maybe just a new
> subcommand similar to check-ignore) would avoid a lot of these issues,
> and give us a single place to extend/improve as we learn about further
> usecases.

I do think that 'ls-tree' is ultimately (eventually?) the right place for
something like this. But you do make some good points about it perhaps being a
bit early.

> 
> [...]
>>>> I agree that it is a bit awkward to have to "translate" the directories
>>>> into patterns when wanting to use cone mode. I can try adding
>>>> '--[no]-cone' flags and see how that feels. Together with Victoria's
>>>> suggestions that would result in having the following flags:
>>>>
>>>> * --scope=(sparse|all)
>>>> * --sparse-patterns-file=<path>
>>>> * --[no]-cone: used together with --sparse-patterns-file to tell git
>>>>      whether to interpret the patterns given as directories (cone) or
>>>>      patterns (no-cone).
>>>>
>>>> Which seems like a lot at first glance. But it allows for passing
>>>> directories instead of patterns for cone mode, and is similar to the
>>>> behaviour of 'sparse-checkout set'.
>>>>
>>>> Does that seem like something that would make sense?
>>>
>>> --sparse-patterns-file still implies patterns; I think that would need
>>> some rewording.
>>
>> Yeah. After sleeping on it, I also think that it becomes a difficult
>> interface to work with, and you'll get different results with the same
>> patterns whether you pass --cone or --no-cone, which seems error prone
>> to me.
>>
>> For better or for worse, both cone and non-cone uses of sparse-checkouts
>> end up producing pattern files. And those pattern files do unambiguously
>> describe a filtering of the worktree whether it is in cone-mode or not.
> 
> Back when cone mode was introduced, I objected to reusing the existing
> pattern format and/or files...but was assuaged that it was just an
> implementation detail that wouldn't be exposed to users (since people
> would interact with the 'sparse-checkout' command instead of direct
> editing of $GIT_DIR/info/sparse-checkout).  It's still a performance
> penalty, because we waste time both turning directory names into
> patterns when we write them out, and when we read them in by parsing
> the patterns to see if they match cone mode rules and if so discard
> the patterns and just build up our hashes.  The patterns are nothing
> more than an intermediary format we waste time translating to and
> from, though once upon a time they existed so that if someone suddenly
> started using an older (now ancient) version of git on their current
> checkout then it could still hobble along with degraded performance
> instead of providing an error.  (We have introduced other changes to
> the config and git index which would cause older git clients to just
> fail to parse and throw an error, and even have defined mechanisms for
> such erroring out.  We could have taken advantage of that for this
> feature too.)
> 
> Anyway, long story short, if you're going to start exposing users to
> this implementation detail that was meant to be hidden (and do it in a
> way that may be copied into several commands to boot), then I
> definitely object.

Alright. Fair. I think with that additional context, I agree. I was coming from
a "this is a common format to both cone and non-cone modes" not a "this is a
leftover implementation detail from now-deprecated use cases" which is the vibe
I'm getting here.

I still think it would be unfortunate to have a single input parameter that
takes both kinds of specifications and interprets them either as cone or non-
cone depending on config and/or a flag. With cone mode specifications like
'a/b/c' being valid gitignore patterns there's no way of knowing whether the
user actually is supplying a pattern or accidentally supplied a directory in
non-cone mode.

With that in mind, I'd probably suggest something along the lines of having
'ls-tree' only accept cone-mode specifications (e.g. --cone-rules-file or so) to
avoid having the non-cone/cone distinction spread outside the 'sparse-checkout'
command. But that would be leaving non-cone users behind, and I guess we are not
quite there yet.
  
>> Given that 'ls-tree' is more of a plumbing command, I think it might still
>> make sense to use the patterns. That would also make the interaction
>> a bit more logical to me -- e.g. if you want to override the patterns
>> you have to pass them in the same format as the ones that would be read
>> by default.
> 
> No, sparsity specification should be provided by users the same way
> they normally specify it (i.e. the way they pass it to
> `sparse-checkout {add,set}`), not the way it's stored via some hidden
> implementation detail.
> 
> I'd make an exception if you ended up using `git check-ignore` because
> that command was specifically written about gitignore-style rules, and
> git-ignore-style rules just happen to have been reused as-is for
> non-cone-mode sparse-checkout rules.  But if you go that route:
>     (1) you have to frame any extensions to check-ignore as things that
> are useful for gitignore checking, and only incidentally useful to
> sparse-checkouts
>     (2) you'd have to forgo any cone-mode optimizations
> Going the check-ignore route seems like the easiest path to providing
> the basic functionality you need right now.  If your usecases remain
> niche, that might be good enough for all time too.  If your usecases
> become popular, though, I expect someone would eventually tire of
> using `check-ignore` and implement similar functionality along with
> supporting cone-mode in a `git check-sparsity` or `git sparse-checkout
> debug-rules` command or something like that.

I guess that would be workable. But I would only want to use the command with
cone-mode patterns, so losing the cone-mode optimisations would be suboptimal.

If the 'ls-tree' path isn't the best way forward right now, do you think it
would be viable to add a subcommand along the lines of 'debug-rules/check-rules'
to sparse-checkout?

Then

     git ls-tree -r HEAD | git sparse-checkout debug-rules

would output only the path that match the current patterns. And

     <...> | git sparse-checkout debug-rules --rules-file <file> --[no]-cone

would be equivalent to setting the patterns with

    cat <file> | git sparse-checkout set --stdin --[no]-cone

before running the command above.

Then all the cone/no-cone pattern specification stuff is still contained within
the 'sparse-checkout' command and the documentation for the sub-command would
reside in the sparse-checkout.tx file next to all the caveats and discussion about
sparse-checkout.



^ permalink raw reply

* Re: [PATCH 3/3] scalar: only warn when background maintenance fails
From: Derrick Stolee @ 2023-01-30 13:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Derrick Stolee via GitGitGadget, git, vdye
In-Reply-To: <xmqq357v8poc.fsf@gitster.g>

On 1/27/2023 7:32 PM, Junio C Hamano wrote:
> Derrick Stolee <derrickstolee@github.com> writes:
> 
>>>  The "maintain
>>> their clone" certainly should include running periodic maintenance
>>> tasks without them having to worry about it.  It feels like this is
>>> calling for an explicit "disable periodic maintenance tasks in this
>>> repository" option to help these esoteric environments that disable
>>> cron-like system services, while keeping the default safer,
>>> i.e. fail loudly when the periodic maintenance tasks that the users
>>> expect to happen cannot be enabled, or something.
>>>
>>> Perhaps I am not the primary audience, but hmph, I have a feeling
>>> that this is not exactly going into a healthy direction.
>>
>> Here, we are in an environment where background maintenance is
>> unavailable in an unexpected way. If that feature is not available
>> to the user, should they not get the benefits of the others?
> 
> That is not what I was saying.  I just have expected to see a way
> for the user to give scalar an explicit "I understand that periodic
> maintenance does not happen in this repository" consent, instead of
> demoting an error detection for everybody to a warning that users
> will just ignore.

Ah, so you'd prefer a --no-maintenance option for users who have
this problem instead of just a warning. I'll do that in v2.

This could be a good time for me to upstream the --no-src option
while I'm messing with arguments in 'scalar clone'.

Thanks,
-Stolee

^ permalink raw reply

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Mathias Krause @ 2023-01-30 11:08 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Ævar Arnfjörð Bjarmason,
	Carlo Marcelo Arenas Belón
In-Reply-To: <xmqqk0156z55.fsf@gitster.g>

On 29.01.23 18:15, Junio C Hamano wrote:
> Mathias Krause <minipli@grsecurity.net> writes:
> 
>> ... While we might be able to compile the pattern and run it in
>> interpreter mode, it'll likely have a *much* higher runtime.
>> ...
>> So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to
>> fall back to something like this for the pathological cases? ...Yeah, I
>> don't think so either.
> 
> You may not, but I do not agree with you at all.  The code should
> not outsmart the user in such a case.

It doesn't. My rhetoric question was just missing "automatically" to
state that I would dislike an *automatic* fallback to the interpreter
for *pathological cases.* But I'm fine with (and that's what this patch
is all about!) a fallback to the interpreter for patterns that simply
fail the JIT because it's broken. Sorry for the confusion.

> Even if the pattern the user came up with is impossible to grok for
> a working JIT compiler, and it might be hard to grok for the
> interpreter, what is the next step you recommend the user if you
> refuse to fall back on the interprete?  "Rewrite it to please the
> JIT compiler"?

Not at all. A user is still free to disable the JIT and enforce using
the interpreter by using the "(*NO_JIT)" prefix. My patch doesn't
disable this behavior. My patch only tries to avoid having to specify it
for "regular" patterns when the JIT is broken anyways.

The key here is that this would be a manual step (in contrast to an
automatic fallback), i.e. we require explicit user consent to accept the
worse runtime performance. And, IMHO, that should be acceptable from a
usability point of view as this would only be required for the
pathological cases an otherwise functional JIT simply cannot handle.

> If that is the best pattern the user can produce to solve the
> problem at hand, being able to give the user an answer in 9 hours is
> much better than not being able to give anything at all.

Sure, fully agree.

> Maybe after waiting for 5 minutes, the user gets bored and ^C, or
> without killing it, open another terminal and try a different
> patern, and in 9 hours, perhaps comes up with an equivalent (or
> different but close enough) pattern that happens to run much faster,
> at which time the user may kill the original one.  In any of these
> cases, by refusing to run, the code is not doing any service to the
> user.

My patch doesn't make it worse than what 'git grep' would currently be
doing. On the contrary, actually. It allows me to use PaX's MPROTECT and
have a functional 'git grep' as well.

Maybe the missing piece here is simply something like below to make
users more aware of the possibility to disable the JIT for the more
complex cases?:

diff --git a/grep.c b/grep.c
index 59afc3f07fc9..1422f168b087 100644
--- a/grep.c
+++ b/grep.c
@@ -357,7 +357,8 @@ static void compile_pcre2_pattern(struct grep_pat
*p, const struct grep_opt *opt
                        p->pcre2_jit_on = 0;
                        return;
                } else if (jitret) {
-                       die("Couldn't JIT the PCRE2 pattern '%s', got
'%d'\n", p->pattern, jitret);
+                       die("Couldn't JIT the PCRE2 pattern '%s', got
'%d'%s\n", p->pattern, jitret,
+                           pcre2_jit_functional() ? "\nYou might retry
by prefixing the pattern with '(*NO_JIT)'" : "");
                }

                /*

(Sorry about the wrapped lines, my mailer is just broken. I'll make it a
proper patch, if such functionality is indeed wanted.)

Thanks,
Mathias

^ permalink raw reply related

* Re: [PATCH v2] grep: fall back to interpreter if JIT memory allocation fails
From: Ævar Arnfjörð Bjarmason @ 2023-01-30 10:56 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Mathias Krause, git, Carlo Marcelo Arenas Belón
In-Reply-To: <xmqqk0156z55.fsf@gitster.g>

On Sun, Jan 29 2023, Junio C Hamano wrote:

> Mathias Krause <minipli@grsecurity.net> writes:
>
>> ... While we might be able to compile the pattern and run it in
>> interpreter mode, it'll likely have a *much* higher runtime.
>> ...
>> So this grep run eat up ~9.5 *hours* of CPU time. Do we really want to
>> fall back to something like this for the pathological cases? ...Yeah, I
>> don't think so either.
>
> You may not, but I do not agree with you at all.  The code should
> not outsmart the user in such a case.

It's the falling back in the nominal case that would be outsmarting the
user.

If I compile libpcre2 with JIT support I'm expecting Git to use that,
and not fall back in those cases where the JIT engine would give up.

> Even if the pattern the user came up with is impossible to grok for
> a working JIT compiler, and it might be hard to grok for the
> interpreter, what is the next step you recommend the user if you
> refuse to fall back on the interprete?  "Rewrite it to please the
> JIT compiler"?

I'd argue that it's pretty much impossible to unintentionally write such
pathological patterns, the edge cases where e.g. the JIT would run out
of resources v.s. the normal engine are a non-issue for any "normal"
use.

Pathological regexes are pretty much only interesting to anyone in the
context of DoS attacks where they're being used to cause intentional
slowdowns.

Here we're discussing an orthagonal case where the "JIT fails", but
rather than some pathological pattern it's because SELinux has made it
not work at runtime, and we're trying to tease the two cases apart.

> If that is the best pattern the user can produce to solve the
> problem at hand, being able to give the user an answer in 9 hours is
> much better than not being able to give anything at all.

Speed is a feature in itself, and in a lot of cases (e.g. user-supplied
patterns vulnerable to a DoS attack) continuing on the slow path is much
worse.

Even just using my terminal for ad-hoc "git grep", I'd *much* rather get
an early error about the pattern exceeding JIT resources than continuing
on the fallback path.

If I had somehow written one by accident (and this is stretching
credulity) you can usually apply some minor tweaks to the pattern, and
then execute it in seconds instead of minutes/hours.

> Maybe after waiting for 5 minutes, the user gets bored and ^C, or
> without killing it, open another terminal and try a different
> patern, and in 9 hours, perhaps comes up with an equivalent (or
> different but close enough) pattern that happens to run much faster,
> at which time the user may kill the original one.  In any of these
> cases, by refusing to run, the code is not doing any service to the
> user.

I don't think this is plausible at all per the above, and that we
shouldn't harm realistic use-cases to satisfy hypothetical ones.

^ permalink raw reply

* Re: [PATCH v9 0/2] send-email: expose header information to git-send-email's sendemail-validate hook
From: Ævar Arnfjörð Bjarmason @ 2023-01-30 10:40 UTC (permalink / raw)
  To: Michael Strawbridge; +Cc: git
In-Reply-To: <5758ffc7-eb8c-4c16-d226-dd882cb2406b@amd.com>


On Mon, Jan 23 2023, Michael Strawbridge wrote:

> On 2023-01-23 08:51, Ævar Arnfjörð Bjarmason wrote:
  * Aside from that, shouldn't we have a new "validate-headers" or
>>    whatever hook, instead of assuming that we can add another argument
>>    to existing users?...
>
> While it's true we could (and I don't have a super strong opinion here),
> I suppose I was foreseeing the potential that a user may want to have
> logic that requires both the email headers and contents.  For example,
> only checking contents for a specific mailing list.  If we split the
> hooks, a user would then need to figure out how to have them coordinate.

...

>>
>>  * ...except can we do it safely? Now, it seems to me like you have
>>    potential correctness issues here. We call format_2822_time() to make
>>    the headers, but that's based on "$time", which we save away earlier.
>>
>>    But for the rest (e.g. "Message-Id" are we sure that we're giving the
>>    hook the same headers as the one we actually end up sending?
>>
>>    But regardless of that, something that would bypass this entire
>>    stdin/potential correctness etc. problem is if we just pass an offset
>>    to the the, i.e. currently we have a "validate" which gets the
>>    contents, if we had a "validate-raw" or whatever we could just pass:
>
> I think there might be a part missing here: "problem is if we just pass
> an offset to the ___."  So there's a chance I may not fully grasp your
> suggestion.

Sorry, a byte offset into the file to indicate the boundary between the
headers and the content.

>
>> 	<headers>
>> 	\n\n
>> 	<content>
>>
>>    Where the current "validate" just gets "content", no? We could then
>>    either pass the offset to the "\n\n", or just trust that such a hook
>>    knows to find the "\n\n".
>>
>>    I also think that would be more generally usable, as the tiny
>>    addition of some exit code interpretation would allow us to say "I
>>    got this, and consider this sent", which would also satisfy some who
>>    have wanted e.g. a way to intrecept it before it invokes "sendmail"
>>    (I remember a recent thread about that in relation to using "mutt" to
>>    send it directly)
>>
>>    
>
> Are you suggesting to simply add the header to the current
> sendemail-validate hook?

No, I'm saying that we currently don't pass them at all, and your patch
adds another argument to a file with the headers.

That *may* break some existing users if they're only expecting the
current argument(s) (although that's probably unlikely), more
importantly we're now doing extra work for all existing hook users, for
the benefit of only some new users.

So I'm suggesting having some opt-in mechanism for the new semantics,
both to preserve the existing semantics for existing users, and for
current and new users avoid writing out the file etc. when we don't need
to.

Which we could do with a config variable,
e.g. hooks."sendemail-validate".includeHeaders=true, or just by having a
new "sendemail-validate-raw" (or whatever we'd call it).

I think it's fine to enforce that if such a new hook exists we'd take it
over the "sendemail-validate" (if any), i.e. we wouldn't need to support
both.


^ permalink raw reply

* Re: [PATCH v15 4/6] fsmonitor: deal with synthetic firmlinks on macOS
From: Ævar Arnfjörð Bjarmason @ 2023-01-30 10:08 UTC (permalink / raw)
  To: Eric DeCosta via GitGitGadget
  Cc: git, Jeff Hostetler, Eric Sunshine, Torsten Bögershausen,
	Ramsay Jones, Johannes Schindelin, Eric DeCosta
In-Reply-To: <863063aefeeecfd23bb50eb111fcfbf5879a8ee3.1664904752.git.gitgitgadget@gmail.com>

On Tue, Oct 04 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@mathworks.com>

> +	struct strbuf alias;

Rather than init-ing here...
> +	struct strbuf points_to = STRBUF_INIT;
> +
> +	dir = opendir(root);
> +	if (!dir)
> +		return error_errno(_("opendir('%s') failed"), root);
> +
> +	strbuf_init(&alias, 256);

...we pre-grow here...

> +	while ((de = readdir(dir)) != NULL) {

...but as shown here, we may not even use this at all, but even then is
this micro-optimization worth it? If it is a reader would be helped with
an explanation of what the 256 is, is this meant to be some OSX-specific
PATH_MAX, but hardcoded?

> +		strbuf_reset(&alias);
> +		strbuf_addf(&alias, "%s%s", root, de->d_name);
> +
> +		if (lstat(alias.buf, &st) < 0) {
> +			error_errno(_("lstat('%s') failed"), alias.buf);
> +			goto done;
> +		}
> +
> +		if (!S_ISLNK(st.st_mode))
> +			continue;
> +
> +		if (strbuf_readlink(&points_to, alias.buf, st.st_size) < 0) {
> +			error_errno(_("strbuf_readlink('%s') failed"), alias.buf);
> +			goto done;
> +		}
> +

Maybe this code would be simpler if you split it into a trivial static
function, so you could pass in the "alias" and "points_to", and just do
"return error...(...)" here and in the other places.

> +		if (!strncmp(points_to.buf, path, points_to.len) &&
> +			(path[points_to.len] == '/')) {
> +			strbuf_addbuf(&info->alias, &alias);
> +			strbuf_addbuf(&info->points_to, &points_to);

Earlier you use strbuf_addf() for a "append two strings", maybe we could
save ourselves a line and do the same here...

> +			trace_printf_key(&trace_fsmonitor,
> +				"Found alias for '%s' : '%s' -> '%s'",
> +				path, info->alias.buf, info->points_to.buf);

...except we're only doing this to emit this, and then we'll free() it?
Can't we just use %s%s here instead of %s, and e.g. pass
"info->alias.buf, alias" instead of the now-appende-to
"info->alias.buf"?

> +char *fsmonitor__resolve_alias(const char *path,
> +	const struct alias_info *info)

I commented on this in a few other places, and I'll stop noting these
now, but you're mis-indenting function decls consistently, also in a *.h
change later in this commit.

Please look through those for this series.

> +{
> +	if (!info->alias.len)
> +		return NULL;

Maybe "check if we have a zero-length string" should belong in the
caller, as "resolve it as an alias" for "\0" is nonsense?

> +
> +	if ((!strncmp(info->alias.buf, path, info->alias.len))
> +		&& path[info->alias.len] == '/') {
> +		struct strbuf tmp = STRBUF_INIT;
> +		const char *remainder = path + info->alias.len;
> +
> +		strbuf_addbuf(&tmp, &info->points_to);
> +		strbuf_add(&tmp, remainder, strlen(remainder));

There's no point in strbuf_add() if you have to dynamically use strlen()
over just using strbuf_addstr() (which inline resolves to the same),
let's just use that.

Or just a single strbuf_addf()...

> +		return strbuf_detach(&tmp, NULL);

...Or actually, these last 3 lines can be replaced by a mere:

	return xstrfmt("%s%s", info->points_to.buf, remainder);

Please just use that. It's not exactly the same due to the pre-sizing
with "addbuf", but again (I commented on a similar case in an earlier
commit), is such micro-optimization worth it here over brevity?

> +/*
> + * No-op for now.
> + */

Please just drop this comment, it doesn't add any information. It would
be useful to say why we're seemingly faking "no aliase on win32", or to
say that it does have them, but implementing them is a "TODO".

But we can see it's a "no-op for now" from the code....

> +int fsmonitor__get_alias(const char *path, struct alias_info *info)
> +{
> +	return 0;
> +}
> +
> +/*
> + * No-op for now.
> + */

...ditto.

> +/*
> + * Get the alias in given path, if any.
> + *
> + * Sets alias to the first alias that matches any part of the path.
> + *
> + * If an alias is found, info.alias and info.points_to are set to the
> + * found mapping.
> + *
> + * Returns -1 on error, 0 otherwise.
> + *
> + * The caller owns the storage that is occupied by info.alias and
> + * info.points_to and is responsible for releasing it.
> + */
> +int fsmonitor__get_alias(const char *path, struct alias_info *info);

I have not looked carefully at this interface, but instead of all of
this explanation & the caller needing to carefully reason about what
parts of this struct it can and can't peek into couldn't we just make it
take two "char **" arguments, one for "alias" and another for
"points_to".

It would then be obvious what the semantics are, and who owns the
memory.

But maybe retaining the strbuf-ness is important (but even then, we
could pass a "struct strbuf *" to populate.
> +
> +/*
> + * Resolve the path against the given alias.
> + *
> + * Returns the resolved path if there is one, NULL otherwise.
> + *
> + * The caller owns the storage that the returned string occupies and
> + * is responsible for releasing it.
> + */
> +char *fsmonitor__resolve_alias(const char *path,
> +	const struct alias_info *info);
> +

Here we don't say anything about the ownership & freeing of "info", does
the same apply? But the API design comment above also applies.

^ permalink raw reply

* Re: [PATCH v15 6/6] fsmonitor: add documentation for allowRemote and socketDir options
From: Ævar Arnfjörð Bjarmason @ 2023-01-30 10:04 UTC (permalink / raw)
  To: Eric DeCosta via GitGitGadget
  Cc: git, Jeff Hostetler, Eric Sunshine, Torsten Bögershausen,
	Ramsay Jones, Johannes Schindelin, Eric DeCosta
In-Reply-To: <af7309745f759532fdb79794289d9e02de0e035c.1664904752.git.gitgitgadget@gmail.com>


On Tue, Oct 04 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@mathworks.com>
> [...]
> +include::config/fsmonitor--daemon.txt[]

We tend to name these files after the config namespace, not the name of
the built-in that's (mostly?) using it.
> +
>  include::config/gc.txt[]
>  
>  include::config/gitcvs.txt[]
> diff --git a/Documentation/config/fsmonitor--daemon.txt b/Documentation/config/fsmonitor--daemon.txt
> new file mode 100644
> index 00000000000..c225c6c9e74
> --- /dev/null
> +++ b/Documentation/config/fsmonitor--daemon.txt
> @@ -0,0 +1,11 @@

So this should be just .../config/fsmonitor.txt


^ permalink raw reply

* Re: [PATCH v15 2/6] fsmonitor: relocate socket file if .git directory is remote
From: Ævar Arnfjörð Bjarmason @ 2023-01-30  9:58 UTC (permalink / raw)
  To: Eric DeCosta via GitGitGadget
  Cc: git, Jeff Hostetler, Eric Sunshine, Torsten Bögershausen,
	Ramsay Jones, Johannes Schindelin, Eric DeCosta
In-Reply-To: <7bf1cdfe3b259f7135f3a50dcdd653563d5e19f6.1664904751.git.gitgitgadget@gmail.com>


On Tue, Oct 04 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@mathworks.com>
>
> If the .git directory is on a remote filesystem, create the socket
> file in 'fsmonitor.socketDir' if it is defined, else create it in $HOME.
>
> Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
> ---
>  Makefile                               |  1 +
>  builtin/fsmonitor--daemon.c            |  3 +-
>  compat/fsmonitor/fsm-ipc-darwin.c      | 52 ++++++++++++++++++++++++++
>  compat/fsmonitor/fsm-ipc-win32.c       |  9 +++++
>  compat/fsmonitor/fsm-settings-darwin.c |  2 +-
>  contrib/buildsystems/CMakeLists.txt    |  2 +
>  fsmonitor-ipc.c                        | 18 ++++-----
>  fsmonitor-ipc.h                        |  4 +-
>  8 files changed, 78 insertions(+), 13 deletions(-)
>  create mode 100644 compat/fsmonitor/fsm-ipc-darwin.c
>  create mode 100644 compat/fsmonitor/fsm-ipc-win32.c
>
> diff --git a/Makefile b/Makefile
> index ffab427ea5b..feb675a6959 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2039,6 +2039,7 @@ ifdef FSMONITOR_DAEMON_BACKEND
>  	COMPAT_CFLAGS += -DHAVE_FSMONITOR_DAEMON_BACKEND
>  	COMPAT_OBJS += compat/fsmonitor/fsm-listen-$(FSMONITOR_DAEMON_BACKEND).o
>  	COMPAT_OBJS += compat/fsmonitor/fsm-health-$(FSMONITOR_DAEMON_BACKEND).o
> +	COMPAT_OBJS += compat/fsmonitor/fsm-ipc-$(FSMONITOR_DAEMON_BACKEND).o
>  endif
>  
>  ifdef FSMONITOR_OS_SETTINGS
> diff --git a/builtin/fsmonitor--daemon.c b/builtin/fsmonitor--daemon.c
> index 2c109cf8b37..0123fc33ed2 100644
> --- a/builtin/fsmonitor--daemon.c
> +++ b/builtin/fsmonitor--daemon.c
> @@ -1343,7 +1343,8 @@ static int fsmonitor_run_daemon(void)
>  	 * directory.)
>  	 */
>  	strbuf_init(&state.path_ipc, 0);
> -	strbuf_addstr(&state.path_ipc, absolute_path(fsmonitor_ipc__get_path()));
> +	strbuf_addstr(&state.path_ipc,
> +		absolute_path(fsmonitor_ipc__get_path(the_repository)));
>  
>  	/*
>  	 * Confirm that we can create platform-specific resources for the
> diff --git a/compat/fsmonitor/fsm-ipc-darwin.c b/compat/fsmonitor/fsm-ipc-darwin.c
> new file mode 100644
> index 00000000000..ce843d63348
> --- /dev/null
> +++ b/compat/fsmonitor/fsm-ipc-darwin.c
> @@ -0,0 +1,52 @@
> +#include "cache.h"
> +#include "config.h"
> +#include "strbuf.h"
> +#include "fsmonitor.h"
> +#include "fsmonitor-ipc.h"
> +#include "fsmonitor-path-utils.h"
> +
> +static GIT_PATH_FUNC(fsmonitor_ipc__get_default_path, "fsmonitor--daemon.ipc")
> +
> +const char *fsmonitor_ipc__get_path(struct repository *r)
> +{
> +	static const char *ipc_path = NULL;
> +	SHA_CTX sha1ctx;
> +	char *sock_dir = NULL;

...don't init this to NULL...

> +	struct strbuf ipc_file = STRBUF_INIT;
> +	unsigned char hash[SHA_DIGEST_LENGTH];
> +
> +	if (!r)
> +		BUG("No repository passed into fsmonitor_ipc__get_path");
> +
> +	if (ipc_path)
> +		return ipc_path;
> +
> +
> +	/* By default the socket file is created in the .git directory */
> +	if (fsmonitor__is_fs_remote(r->gitdir) < 1) {
> +		ipc_path = fsmonitor_ipc__get_default_path();
> +		return ipc_path;
> +	}
> +
> +	SHA1_Init(&sha1ctx);
> +	SHA1_Update(&sha1ctx, r->worktree, strlen(r->worktree));
> +	SHA1_Final(hash, &sha1ctx);
> +
> +	repo_config_get_string(r, "fsmonitor.socketdir", &sock_dir);

...instead check the return value here. So:

	if (!repo_config_get_string(..., &sock_dir))
        	...

> +	/* Create the socket file in either socketDir or $HOME */
> +	if (sock_dir && *sock_dir) {
> +		strbuf_addf(&ipc_file, "%s/.git-fsmonitor-%s",
> +					sock_dir, hash_to_hex(hash));

^ Add the body of this branch to the "..." above.

> +	} else {
> +		strbuf_addf(&ipc_file, "~/.git-fsmonitor-%s", hash_to_hex(hash));

...and keep this in the "else".

> +	}
> +	free(sock_dir);

You'd then add this free() to the first branch, but better yet in this
case avoid this allocation, use "const char *" and use
repo_config_get_string_tmp(). It's made for exactly this sort of use-case.

> +
> +	ipc_path = interpolate_path(ipc_file.buf, 1);
> +	if (!ipc_path)
> +		die(_("Invalid path: %s"), ipc_file.buf);
> +
> +	strbuf_release(&ipc_file);
> +	return ipc_path;
> +}
> diff --git a/compat/fsmonitor/fsm-ipc-win32.c b/compat/fsmonitor/fsm-ipc-win32.c
> new file mode 100644
> index 00000000000..e08c505c148
> --- /dev/null
> +++ b/compat/fsmonitor/fsm-ipc-win32.c
> @@ -0,0 +1,9 @@
> +#include "config.h"
> +#include "fsmonitor-ipc.h"
> +
> +const char *fsmonitor_ipc__get_path(struct repository *r) {
> +	static char *ret;

Missing \n here.

>  try_again:
> -	state = ipc_client_try_connect(fsmonitor_ipc__get_path(), &options,
> -				       &connection);
> +	state = ipc_client_try_connect(fsmonitor_ipc__get_path(the_repository),
> +						&options, &connection);

This post-image is mis-indented.

>  #include "simple-ipc.h"
>  
> +struct repository;
> +

I think we'd usually forard-declare such things, if needed...

>  /*
>   * Returns true if built-in file system monitor daemon is defined
>   * for this platform.
> @@ -16,7 +18,7 @@ int fsmonitor_ipc__is_supported(void);
>   *
>   * Returns NULL if the daemon is not supported on this platform.
>   */
> -const char *fsmonitor_ipc__get_path(void);

..right before they're needed, so before this line?

> +const char *fsmonitor_ipc__get_path(struct repository *r);
>  
>  /*
>   * Try to determine whether there is a `git-fsmonitor--daemon` process


^ permalink raw reply

* Re: [PATCH v15 1/6] fsmonitor: refactor filesystem checks to common interface
From: Ævar Arnfjörð Bjarmason @ 2023-01-30  9:37 UTC (permalink / raw)
  To: Eric DeCosta via GitGitGadget
  Cc: git, Jeff Hostetler, Eric Sunshine, Torsten Bögershausen,
	Ramsay Jones, Johannes Schindelin, Eric DeCosta
In-Reply-To: <ec49a74086dee4545a3bb7b24c8da22e77aef579.1664904751.git.gitgitgadget@gmail.com>


On Tue, Oct 04 2022, Eric DeCosta via GitGitGadget wrote:

> From: Eric DeCosta <edecosta@mathworks.com>
>
> Provide a common interface for getting basic filesystem information
> including filesystem type and whether the filesystem is remote.
>
> Refactor existing code for getting basic filesystem info and detecting
> remote file systems to the new interface.
>
> Refactor filesystem checks to leverage new interface. For macOS,
> error-out if the Unix Domain socket (UDS) file is on a remote
> filesystem.
>
> Signed-off-by: Eric DeCosta <edecosta@mathworks.com>
> ---
>  Makefile                                 |   1 +
>  compat/fsmonitor/fsm-path-utils-darwin.c |  43 ++++++
>  compat/fsmonitor/fsm-path-utils-win32.c  | 128 +++++++++++++++++
>  compat/fsmonitor/fsm-settings-darwin.c   |  69 +++------
>  compat/fsmonitor/fsm-settings-win32.c    | 172 +----------------------
>  contrib/buildsystems/CMakeLists.txt      |   2 +
>  fsmonitor-path-utils.h                   |  26 ++++
>  fsmonitor-settings.c                     |  50 +++++++
>  8 files changed, 272 insertions(+), 219 deletions(-)
>  create mode 100644 compat/fsmonitor/fsm-path-utils-darwin.c
>  create mode 100644 compat/fsmonitor/fsm-path-utils-win32.c
>  create mode 100644 fsmonitor-path-utils.h
>
> diff --git a/Makefile b/Makefile
> index cac3452edb9..ffab427ea5b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -2044,6 +2044,7 @@ endif
>  ifdef FSMONITOR_OS_SETTINGS
>  	COMPAT_CFLAGS += -DHAVE_FSMONITOR_OS_SETTINGS
>  	COMPAT_OBJS += compat/fsmonitor/fsm-settings-$(FSMONITOR_OS_SETTINGS).o
> +	COMPAT_OBJS += compat/fsmonitor/fsm-path-utils-$(FSMONITOR_OS_SETTINGS).o
>  endif
>  
>  ifeq ($(TCLTK_PATH),)
> diff --git a/compat/fsmonitor/fsm-path-utils-darwin.c b/compat/fsmonitor/fsm-path-utils-darwin.c
> new file mode 100644
> index 00000000000..d46d7f13538
> --- /dev/null
> +++ b/compat/fsmonitor/fsm-path-utils-darwin.c
> @@ -0,0 +1,43 @@
> +#include "fsmonitor.h"
> +#include "fsmonitor-path-utils.h"
> +#include <sys/param.h>
> +#include <sys/mount.h>
> +
> +int fsmonitor__get_fs_info(const char *path, struct fs_info *fs_info)
> +{
> +	struct statfs fs;

Should have a \n here after variable decls.

> +	if (statfs(path, &fs) == -1) {
> +		int saved_errno = errno;
> +		trace_printf_key(&trace_fsmonitor, "statfs('%s') failed: %s",
> +				 path, strerror(saved_errno));
> +		errno = saved_errno;
> +		return -1;
> +	}
> +
> +	trace_printf_key(&trace_fsmonitor,
> +			 "statfs('%s') [type 0x%08x][flags 0x%08x] '%s'",
> +			 path, fs.f_type, fs.f_flags, fs.f_fstypename);
> +
> +	if (!(fs.f_flags & MNT_LOCAL))
> +		fs_info->is_remote = 1;
> +	else
> +		fs_info->is_remote = 0;

Instead:

	fs_info->is_remote = !(fs.f_flags & MNT_LOCAL)

?

> +int fsmonitor__is_fs_remote(const char *path)
> +{
> +	struct fs_info fs;

Same style issue as above.

> +	if (fsmonitor__get_fs_info(path, &fs))
> +		return -1;
> +
> +	free(fs.typename);

Can we get a "free_fs_info()" function or something instead, this is one
of N codepaths where we now peek into that struct. If we ever add
another field that needs malloc'ing altering all callers will be
painful.


> +	if (xutftowcs_path(wpath, path) < 0) {
> +		return -1;
> +	}

Maybe drop the braces here as this code is being modified-while-moved
anyway?

> +
> +	/*
> +	 * GetDriveTypeW() requires a final slash.  We assume that the
> +	 * worktree pathname points to an actual directory.
> +	 */
> +	wlen = wcslen(wpath);
> +	if (wpath[wlen - 1] != L'\\' && wpath[wlen - 1] != L'/') {
> +		wpath[wlen++] = L'\\';
> +		wpath[wlen] = 0;
> +	}
> +
> +	/*
> +	 * Normalize the path.  If nothing else, this converts forward
> +	 * slashes to backslashes.  This is essential to get GetDriveTypeW()
> +	 * correctly handle some UNC "\\server\share\..." paths.
> +	 */
> +	if (!GetFullPathNameW(wpath, MAX_PATH, wfullpath, NULL)) {
> +		return -1;
> +	}

Here you have added needless braces while moving this, let's not do
that.

> +
> +	driveType = GetDriveTypeW(wfullpath);
> +	trace_printf_key(&trace_fsmonitor,
> +			 "DriveType '%s' L'%ls' (%u)",
> +			 path, wfullpath, driveType);
> +
> +	if (driveType == DRIVE_REMOTE) {
> +		fs_info->is_remote = 1;
> +		if (check_remote_protocol(wfullpath) < 0)
> +			return -1;

Maybe this code should be more careful not to modify the passed-in
struct if we're returning an error?

I.e. do this "return -1" before assigning "is_remote = 1"?

> +	} else {
> +		fs_info->is_remote = 0;
> +	}

Maybe just at the start: "struct fs_info = { 0 }" and skip this "else" ?

> +
> +	trace_printf_key(&trace_fsmonitor,
> +				"'%s' is_remote: %d",
> +				path, fs_info->is_remote);
> +
> +	return 0;
> +}
> +
> +int fsmonitor__is_fs_remote(const char *path)
> +{
> +	struct fs_info fs;
> +	if (fsmonitor__get_fs_info(path, &fs))
> +		return -1;
> +	return fs.is_remote;

I find this and the resulting "switch/case" caller rather un-idiomatic,
i.e. you end up checking 1/0/default.

Why not in your check_remote() instead:

	int is_remote;

	if (fsmonitor__check_fs_remote(r->worktree, &is_remote))
		return FSMONITOR_REASON_ERROR;
	else if (!is_remote)
		return FSMONITOR_REASON_OK;
	...

Where the "..." is the "repo_config_get_bool()" etc. code I suggest
below.

I.e. having an "is" function return non-boolean is somewhat confusing,
better to write to a variable (which the config API you're using does).

> +#ifdef HAVE_FSMONITOR_OS_SETTINGS
> +static enum fsmonitor_reason check_remote(struct repository *r)
> +{
> +	int allow_remote = -1; /* -1 unset, 0 not allowed, 1 allowed */

Let's not init this to -1, and instead...

> +	int is_remote = fsmonitor__is_fs_remote(r->worktree);
> +
> +	switch (is_remote) {
> +		case 0:

The usual coding style is to not indent the "case" further than the
"switch".

> +			return FSMONITOR_REASON_OK;
> +		case 1:
> +			repo_config_get_bool(r, "fsmonitor.allowremote", &allow_remote);
> +			if (allow_remote < 1)

...continued from above: We can scope this variable to this case
statement, as "case 1: { int v; ... }", but furthermore you don't need
to init it to -1 and check this -1 case if you just check the return
value of repo_config_get_bool(), which IMO is also more obvious:

	int v;

	if (!repo...(..., &v))
		return v ? FSMONITOR_REASON_OK : FSMONITOR_REASON_REMOTE:
	else
		return FSMONITOR_REASON_REMOTE;

I.e. you clearly separate the cases where it's un-init'd by checking the
return value.

Maybe better (but would result in more churn later if you ever want to
change the default):

	int v;

	return !repo...(..., &v) && v ? FSMONITOR_REASON_OK :
		FSMONITOR_REASON_REMOTE:

^ permalink raw reply

* Re: [PATCH v4 3/5] notes.c: drop unreachable code in 'append_edit()'
From: Eric Sunshine @ 2023-01-30  5:38 UTC (permalink / raw)
  To: Teng Long; +Cc: avarab, git, tenglong.tl
In-Reply-To: <20230128115027.57250-1-tenglong.tl@alibaba-inc.com>

On Sat, Jan 28, 2023 at 6:50 AM Teng Long <dyroneteng@gmail.com> wrote:
> Eric Sunshine writes:
> > > Situation of note "removing" shouldn't happen in 'append_edit()',
> > > unless it's a bug. So, let's drop the unreachable "else" code
> > > in "append_edit()".
> > >
> > > Signed-off-by: Teng Long <dyroneteng@gmail.com>
> > > ---
> > > -       } else {
> > > -               fprintf(stderr, _("Removing note for object %s\n"),
> > > -                       oid_to_hex(&object));
> > > -               remove_note(t, object.hash);
> > > -               logmsg = xstrfmt("Notes removed by 'git notes %s'", argv[0]);
> > > +               commit_notes(the_repository, t, logmsg);
> > >         }
> > > -       commit_notes(the_repository, t, logmsg);
> >
> > This change breaks removal of notes using "git notes edit". Prior to
> > this change, if you delete the content of a note using "git notes
> > edit", then the note is removed. Following this change, the note
> > remains, which contradicts documented[1] behavior.
>
> As the commit msg describes, the subcommands I understand should have clear
> responsibilities as possible (documentaion may have some effect in my mind). So,
> the removal opertion under "append subcommand" here is little wired to me, but
> your suggestion makes sense, this may have compatibility issues. Although I
> think it's weird that someone would use this in the presence of the remove
> subcommand, and my feeling is that this code is actually copied from the add
> method (introduced by 52694cdabbf68f19c8289416e7bb3bbef41d8d27), but I'm not
> sure.
>
> So, it's ok for me to drop this one.

It's unfortunate, perhaps, that the git-notes command-set is not
orthogonal, but it's another case of historic accretion. According to
the history, as originally implemented, git-notes only supported two
commands, "edit" and "show", and "edit" was responsible for _all_
mutation operations, including deletion. git-notes only grew a more
complete set of commands a couple years later.

At any rate, even though the original behaviors may not make as much
sense these days now that git-notes has a more complete set of
commands, we still need to be careful not to break existing workflows
and scripts (tooling). That's why it would be nice to beef up the
tests so that the existing behaviors don't get broken by changes such
as this patch wants to make. But, as mentioned earlier, the additional
tests probably shouldn't be part of this series, but rather can be
done as a separate series (by someone; it doesn't have to be you).

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox