* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Taylor Blau @ 2017-01-10 23:33 UTC
To: Jakub Narębski; +Cc: Junio C Hamano, Lars Schneider, git, Eric Wong
In-Reply-To: <ec8078ef-8ff2-d26f-ef73-5ef612737eee@gmail.com>
On Tue, Jan 10, 2017 at 11:11:01PM +0100, Jakub Narębski wrote:
> On 09.01.2017 at 00:42, Junio C Hamano wrote:
> > larsxschneider@gmail.com writes:
> >> From: Lars Schneider <larsxschneider@gmail.com>
> >>
> >> Some `clean` / `smudge` filters might require a significant amount of
> >> time to process a single blob. During this process the Git checkout
> >> operation is blocked and Git needs to wait until the filter is done to
> >> continue with the checkout.
>
> Lars, what is the expected use case for this feature; that is, when do
> you think this problem may happen? Is it something that happened IRL?
>
> >>
> >> Teach the filter process protocol (introduced in edcc858) to accept the
> >> status "delayed" as response to a filter request. Upon this response Git
> >> continues with the checkout operation and asks the filter to process the
> >> blob again after all other blobs have been processed.
> >
> > Hmm, I would have expected that the basic flow would become
> >
> >     for each paths to be processed:
> >         convert-to-worktree to buf
> >         if not delayed:
> >             do the caller's thing to use buf
> >         else:
> >             remember path
> >
> >     for each delayed paths:
> >         ensure filter process finished processing for path
> >         fetch the thing to buf from the process
> >         do the caller's thing to use buf
>
> I would expect here to have a kind of event loop, namely
>
>     while there are delayed paths:
>         get path that is ready from filter
>         fetch the thing to buf (supporting "delayed")
>         if path done
>             do the caller's thing to use buf
>             (e.g. finish checkout path, eof convert, etc.)
>
> We can either trust the filter process to tell us when it has finished
> sending delayed paths, or keep a list of delayed paths in Git.
This makes a lot of sense to me. The "get path that is ready from filter" should
block until the filter has data that it is ready to send. This way Git isn't
wasting time in a busy-loop asking whether the filter has data ready to be sent.
It also means that if the filter has one large chunk that it's ready to write,
Git can work on that while the filter continues to process more data,
theoretically improving the performance of checkouts with many large delayed
objects.
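A minimal sketch of that blocking loop, in Python rather than Git's C, and under loud assumptions: the filter is simulated by a thread, the "ready" channel is an in-process queue, and slow_filter(), checkout_delayed(), and the None "no more paths" marker are hypothetical names that do not exist in the patch or in the filter protocol.

```python
import queue
import threading
import time

def slow_filter(paths, ready):
    """Stand-in for a long-running filter process: converts each delayed
    path in the background and announces it on `ready` when done."""
    for path in paths:
        time.sleep(0.01)                      # pretend the conversion is slow
        ready.put((path, b"smudged:" + path.encode()))
    ready.put(None)                           # hypothetical "no more paths" marker

def checkout_delayed(paths, write_entry):
    """Hypothetical Git side: block until the filter reports a ready path,
    then finish the checkout of that path while the filter keeps working."""
    pending = set(paths)
    ready = queue.Queue()
    worker = threading.Thread(target=slow_filter, args=(paths, ready))
    worker.start()
    while pending:
        item = ready.get()                    # blocks; no busy-loop polling
        if item is None:
            break                             # filter says it has nothing more
        path, buf = item
        pending.discard(path)
        write_entry(path, buf)                # "do the caller's thing to use buf"
    worker.join()
    return pending                            # paths the filter never delivered

written = {}
leftover = checkout_delayed(["a.bin", "b.bin", "c.bin"], written.__setitem__)
```

The caller supplies write_entry, so in this sketch the converted buffer is handed back to the caller instead of being written straight to the filesystem.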
>
> >
> > and that would make quite a lot of sense. However, what is actually
> > implemented is a bit disappointing from that point of view. While
> > its first part is the same as above, the latter part instead does:
> >
> > for each delayed paths:
> > checkout the path
> >
> > Presumably, checkout_entry() does the "ensure that the process is
> > done converting" (otherwise the result is simply buggy), but what
> > disappoints me is that this does not allow callers that call
> > "convert-to-working-tree", whose interface is obtain the bytestream
> > in-core in the working tree representation, given an object in the
> > object-db representation in an in-core buffer, to _use_ the result
> > of the conversion. The caller does not have a chance to even see
> > the result as it is written straight to the filesystem, once it
> > calls checkout_delayed_entries().
> >
>
In addition to the above, I'd also like to investigate adding a "no more items"
message into the filter protocol. This would be useful for filters that
batch delayed items into groups. In particular, if the batch size is `N`, and Git
sends `2N-1` items, the second batch will be under-filled. The filter on the
other end needs some mechanism to send the second batch, even though it hasn't
hit max capacity.
Specifically, this is how Git LFS implements object transfers for data it does
not have locally, but I'm sure that this sort of functionality would be useful
for other filter implementations as well.
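To illustrate the under-filled batch, here is a rough Python sketch. It is not how Git LFS is implemented; BatchingFilter, delay(), and no_more_items() are hypothetical names, and the "no more items" message is only the proposal above, not part of the current protocol.

```python
class BatchingFilter:
    """Hypothetical filter that transfers delayed objects in groups."""

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.queued = []
        self.sent_batches = []

    def delay(self, path):
        # Git asked for `path`; answer "delayed" and queue it for transfer.
        self.queued.append(path)
        if len(self.queued) == self.batch_size:
            self._send()

    def no_more_items(self):
        # Hypothetical protocol message: Git will not request more paths,
        # so send whatever is queued even though the batch is not full.
        if self.queued:
            self._send()

    def _send(self):
        self.sent_batches.append(list(self.queued))
        self.queued.clear()

N = 3
f = BatchingFilter(N)
for p in [f"file{i}" for i in range(2 * N - 1)]:
    f.delay(p)
before_flush = [list(b) for b in f.sent_batches]   # only the first, full batch
f.no_more_items()                                   # flushes the remaining N-1 items
```

Without the final call, the last N-1 paths would sit in the queue until the filter guessed, for example with a timeout, that nothing else was coming.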
--
Thanks,
Taylor Blau
ttaylorr@github.com
* What's cooking in git.git (Jan 2017, #01; Tue, 10)
From: Junio C Hamano @ 2017-01-10 23:48 UTC
To: git
Here are the topics that have been cooking. Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'. The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.
You can find the changes described here in the integration branches
of the repositories listed at
http://git-blame.blogspot.com/p/git-public-repositories.html
--------------------------------------------------
[Graduated to "master"]
* bw/grep-recurse-submodules (2016-12-22) 12 commits
(merged to 'next' on 2016-12-22 at 1ede815b8d)
+ grep: search history of moved submodules
+ grep: enable recurse-submodules to work on <tree> objects
+ grep: optionally recurse into submodules
+ grep: add submodules as a grep source type
+ submodules: load gitmodules file from commit sha1
+ submodules: add helper to determine if a submodule is initialized
+ submodules: add helper to determine if a submodule is populated
(merged to 'next' on 2016-12-22 at fea8fa870f)
+ real_path: canonicalize directory separators in root parts
+ real_path: have callers use real_pathdup and strbuf_realpath
+ real_path: create real_pathdup
+ real_path: convert real_path_internal to strbuf_realpath
+ real_path: resolve symlinks by hand
(this branch is tangled with bw/realpath-wo-chdir.)
"git grep" learns to optionally recurse into submodules.
* dt/smart-http-detect-server-going-away (2016-11-18) 2 commits
(merged to 'next' on 2016-12-05 at 3ea70d01af)
+ upload-pack: optionally allow fetching any sha1
+ remote-curl: don't hang when a server dies before any output
Originally merged to 'next' on 2016-11-21
When the http server gives an incomplete response to a smart-http
rpc call, it could lead to client waiting for a full response that
will never come. Teach the client side to notice this condition
and abort the transfer.
An improvement counterproposal has failed.
cf. <20161114194049.mktpsvgdhex2f4zv@sigill.intra.peff.net>
* jc/abbrev-autoscale-config (2016-12-22) 1 commit
(merged to 'next' on 2016-12-27 at 631e4200e2)
+ config.abbrev: document the new default that auto-scales
A recent update made the default abbreviation length auto-scale but
did not update the documentation, which has been corrected.
* jc/compression-config (2016-11-15) 1 commit
(merged to 'next' on 2016-12-05 at 323769ca07)
+ compression: unify pack.compression configuration parsing
Originally merged to 'next' on 2016-11-23
Compression settings for producing packfiles were spread across
three codepaths, one of which did not honor any configuration.
Unify these so that all of them honor core.compression and
pack.compression variables the same way.
* jc/git-open-cloexec (2016-11-02) 3 commits
(merged to 'next' on 2016-12-27 at 487682eb6e)
+ sha1_file: stop opening files with O_NOATIME
+ git_open_cloexec(): use fcntl(2) w/ FD_CLOEXEC fallback
+ git_open(): untangle possible NOATIME and CLOEXEC interactions
The codeflow of setting NOATIME and CLOEXEC on file descriptors Git
opens has been simplified.
We may want to drop the tip one, but we'll see.
* jc/latin-1 (2016-09-26) 2 commits
(merged to 'next' on 2016-12-05 at fb549caa12)
+ utf8: accept "latin-1" as ISO-8859-1
+ utf8: refactor code to decide fallback encoding
Originally merged to 'next' on 2016-09-28
Some platforms no longer understand "latin-1" that is still seen in
the wild in e-mail headers; replace them with "iso-8859-1" that is
more widely known when conversion fails from/to it.
* jc/retire-compaction-heuristics (2016-12-23) 1 commit
(merged to 'next' on 2016-12-27 at c69c2f50cf)
+ diff: retire "compaction" heuristics
"git diff" and its family had two experimental heuristics to shift
the contents of a hunk to make the patch easier to read. One of
them turns out to be better than the other, so leave only the
"--indent-heuristic" option and remove the other one.
* jt/fetch-no-redundant-tag-fetch-map (2016-11-11) 1 commit
(merged to 'next' on 2016-12-05 at 432f9469a7)
+ fetch: do not redundantly calculate tag refmap
Originally merged to 'next' on 2016-11-16
Code cleanup to avoid using redundant refspecs while fetching with
the --tags option.
* mh/fast-import-notes-fix-new (2016-12-20) 1 commit
(merged to 'next' on 2016-12-27 at b63805e6f6)
+ fast-import: properly fanout notes when tree is imported
"git fast-import" sometimes mishandled the notes tree while
rebalancing it, which has been fixed.
* mm/gc-safety-doc (2016-11-16) 1 commit
(merged to 'next' on 2016-12-05 at 031ecc1886)
+ git-gc.txt: expand discussion of races with other processes
Originally merged to 'next' on 2016-11-17
Doc update.
* mm/push-social-engineering-attack-doc (2016-11-14) 1 commit
(merged to 'next' on 2016-12-05 at 9a2b5bd1a9)
+ doc: mention transfer data leaks in more places
Originally merged to 'next' on 2016-11-16
Doc update on fetching and pushing.
* nd/config-misc-fixes (2016-12-22) 3 commits
(merged to 'next' on 2016-12-27 at 6be64a8671)
+ config.c: handle lock file in error case in git_config_rename_...
+ config.c: rename label unlock_and_out
+ config.c: handle error case for fstat() calls
Leakage of lockfiles in the config subsystem has been fixed.
* sb/submodule-embed-gitdir (2016-12-27) 7 commits
(merged to 'next' on 2016-12-27 at 2b43c15479)
+ worktree: initialize return value for submodule_uses_worktrees
(merged to 'next' on 2016-12-21 at e6cdbcf013)
+ submodule: add absorb-git-dir function
+ move connect_work_tree_and_git_dir to dir.h
+ worktree: check if a submodule uses worktrees
+ test-lib-functions.sh: teach test_commit -C <dir>
+ submodule helper: support super prefix
+ submodule: use absolute path for computing relative path connecting
(this branch is used by sb/submodule-rm-absorb.)
A new submodule helper "git submodule absorbgitdirs" has been added
to make it easier to move the embedded .git/ directory of a submodule
into .git/modules/ of the superproject, leaving a "gitdir:" file in
its place that points at the new location.
--------------------------------------------------
[New Topics]
* ls/p4-retry-thrice (2016-12-29) 1 commit
(merged to 'next' on 2017-01-10 at c733e27410)
+ git-p4: do not pass '-r 0' to p4 commands
A recent update to "git p4" was not usable with older p4, but it
could be made to work with minimal changes. Do so.
Will merge to 'master'.
* mh/ref-remove-empty-directory (2017-01-07) 23 commits
- files_transaction_commit(): clean up empty directories
- try_remove_empty_parents(): teach to remove parents of reflogs, too
- try_remove_empty_parents(): don't trash argument contents
- try_remove_empty_parents(): rename parameter "name" -> "refname"
- delete_ref_loose(): inline function
- delete_ref_loose(): derive loose reference path from lock
- log_ref_write_1(): inline function
- log_ref_setup(): manage the name of the reflog file internally
- log_ref_write_1(): don't depend on logfile argument
- log_ref_setup(): pass the open file descriptor back to the caller
- log_ref_setup(): improve robustness against races
- log_ref_setup(): separate code for create vs non-create
- log_ref_write(): inline function
- rename_tmp_log(): improve error reporting
- rename_tmp_log(): use raceproof_create_file()
- lock_ref_sha1_basic(): use raceproof_create_file()
- lock_ref_sha1_basic(): inline constant
- raceproof_create_file(): new function
- safe_create_leading_directories(): set errno on SCLD_EXISTS
- safe_create_leading_directories_const(): preserve errno
- t5505: use "for-each-ref" to test for the non-existence of references
- refname_is_safe(): correct docstring
- files_rename_ref(): tidy up whitespace
Deletion of a branch "foo/bar" could remove .git/refs/heads/foo
once there no longer is any other branch whose name begins with
"foo/", but we didn't do so so far. Now we do.
Expecting a reroll.
cf. <5051c78e-51f9-becd-e1a6-9c0b781d6912@alum.mit.edu>
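As a rough illustration of the behaviour described for this topic (not the series' C implementation), the Python sketch below deletes a loose branch ref and then prunes parent directories that became empty. delete_loose_ref() is a hypothetical helper; it assumes loose refs only and ignores packed refs, reflogs, locking, and the races the series takes care of.

```python
import os
import tempfile

def delete_loose_ref(git_dir, refname):
    """Remove a loose ref file, then prune parent directories that became
    empty, never going above refs/heads itself."""
    path = os.path.join(git_dir, *refname.split("/"))
    os.remove(path)
    stop = os.path.join(git_dir, "refs", "heads")
    parent = os.path.dirname(path)
    while parent.startswith(stop + os.sep):
        try:
            os.rmdir(parent)                  # fails if the directory is not empty
        except OSError:
            break                             # another branch still lives here
        parent = os.path.dirname(parent)

# Build a throwaway layout with three loose branch refs.
git_dir = tempfile.mkdtemp()
for ref in ("refs/heads/foo/bar", "refs/heads/baz/one", "refs/heads/baz/two"):
    full = os.path.join(git_dir, *ref.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    open(full, "w").close()

delete_loose_ref(git_dir, "refs/heads/foo/bar")   # refs/heads/foo is now empty
delete_loose_ref(git_dir, "refs/heads/baz/one")   # refs/heads/baz still has "two"
```

After the two calls, refs/heads/foo is gone while refs/heads/baz remains, because "baz/two" still exists.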
* pb/test-must-fail-is-for-git (2017-01-09) 2 commits
(merged to 'next' on 2017-01-10 at 5f24a98779)
+ t9813: avoid using pipes
+ don't use test_must_fail with grep
Test cleanup.
Will merge to 'master'.
* jk/archive-zip-userdiff-config (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at ac42e4958c)
+ archive-zip: load userdiff config
"git archive" did not read the standard configuration files, and
failed to notice a file that is marked as binary via the userdiff
driver configuration.
Will merge to 'master'.
* jk/blame-fixes (2017-01-07) 3 commits
(merged to 'next' on 2017-01-10 at 18f909da61)
+ blame: output porcelain "previous" header for each file
+ blame: handle --no-abbrev
+ blame: fix alignment with --abbrev=40
"git blame --porcelain" misidentified the "previous" <commit, path>
pair (aka "source") when contents came from two or more files.
Will merge to 'master'.
* jk/rebase-i-squash-count-fix (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at d6cfc6ace2)
+ rebase--interactive: count squash commits above 10 correctly
"git rebase -i" with a recent update started showing an incorrect
count when squashing more than 10 commits.
Will merge to 'master'.
* js/asciidoctor-tweaks (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at 087da7b7c1)
+ giteveryday: unbreak rendering with AsciiDoctor
Adjust documentation to help AsciiDoctor render better while not
breaking the rendering done by AsciiDoc.
Will merge to 'master'.
* km/branch-get-push-while-detached (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at a7f8af8c55)
+ branch_get_push: do not segfault when HEAD is detached
"git <cmd> @{push}" on a detached HEAD used to segfault; it has
been corrected to error out with a message.
Will merge to 'master'.
* sb/remove-gitview (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at dcb3abd146)
+ contrib: remove gitview
Will merge to 'master'.
* sb/submodule-cleanup-export-git-dir-env (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at 2d5db6821e)
+ submodule.c: use GIT_DIR_ENVIRONMENT consistently
Code cleanup.
Will merge to 'master'.
* sb/pathspec-errors (2017-01-09) 1 commit
(merged to 'next' on 2017-01-10 at 432375cb62)
+ pathspec: give better message for submodule related pathspec error
(this branch uses bw/pathspec-cleanup.)
Running "git add a/b" when "a" is a submodule correctly errored
out, but without a meaningful error message.
Will merge to 'master'.
* ls/filter-process-delayed (2017-01-08) 1 commit
. convert: add "status=delayed" to filter process protocol
Ejected, as it does not build when merged to 'pu'.
* sp/cygwin-build-fixes (2017-01-09) 2 commits
(merged to 'next' on 2017-01-10 at 2010fb6c03)
+ Makefile: put LIBS after LDFLAGS for imap-send
+ Makefile: POSIX windres
Build updates for Cygwin.
Will merge to 'master'.
* jk/execv-dashed-external (2017-01-09) 3 commits
(merged to 'next' on 2017-01-10 at 117b506cb0)
+ execv_dashed_external: wait for child on signal death
+ execv_dashed_external: stop exiting with negative code
+ execv_dashed_external: use child_process struct
Typing ^C to the pager, which usually does not kill it, killed Git and
took the pager down as collateral damage in certain process-tree
structures. This has been fixed.
Will merge to 'master'.
* rh/mergetool-regression-fix (2017-01-10) 14 commits
(merged to 'next' on 2017-01-10 at e8e00c798b)
+ mergetool: fix running in subdir when rerere enabled
+ mergetool: take the "-O" out of $orderfile
+ t7610: add test case for rerere+mergetool+subdir bug
+ t7610: spell 'git reset --hard' consistently
+ t7610: don't assume the checked-out commit
+ t7610: always work on a test-specific branch
+ t7610: delete some now-unnecessary 'git reset --hard' lines
+ t7610: run 'git reset --hard' after each test to clean up
+ t7610: don't rely on state from previous test
+ t7610: use test_when_finished for cleanup tasks
+ t7610: move setup code to the 'setup' test case
+ t7610: update branch names to match test number
+ rev-parse doc: pass "--" to rev-parse in the --prefix example
+ .mailmap: record canonical email for Richard Hansen
"git mergetool" without any pathspec on the command line that is
run from a subdirectory became a no-op in Git v2.11 by mistake, which
has been fixed.
Will merge to 'master'.
* sb/unpack-trees-cleanup (2017-01-10) 3 commits
(merged to 'next' on 2017-01-10 at 95a5f3127c)
+ unpack-trees: factor progress setup out of check_updates
+ unpack-trees: remove unneeded continue
+ unpack-trees: move checkout state into check_updates
Code cleanup.
Will merge to 'master'.
--------------------------------------------------
[Stalled]
* jk/nofollow-attr-ignore (2016-11-02) 5 commits
- exclude: do not respect symlinks for in-tree .gitignore
- attr: do not respect symlinks for in-tree .gitattributes
- exclude: convert "check_index" into a flags field
- attr: convert "macro_ok" into a flags field
- add open_nofollow() helper
As we do not follow symbolic links when reading control files like
.gitignore and .gitattributes from the index, match the behaviour
and not follow symbolic links when reading them from the working
tree. This also tightens security a bit by not leaking contents of
an unrelated file in the error messages when it is pointed at by
one of these files that is a symbolic link.
Perhaps we want to cover .gitmodules too with the same mechanism?
* jc/bundle (2016-03-03) 6 commits
- index-pack: --clone-bundle option
- Merge branch 'jc/index-pack' into jc/bundle
- bundle v3: the beginning
- bundle: keep a copy of bundle file name in the in-core bundle header
- bundle: plug resource leak
- bundle doc: 'verify' is not about verifying the bundle
The beginning of "split bundle", which could be one of the
ingredients to allow "git clone" traffic off of the core server
network to CDN.
While I think it would make it easier for people to experiment and
build on if the topic is merged to 'next', I am at the same time a
bit reluctant to merge an unproven new topic that introduces a new
file format, which we may end up having to support til the end of
time. It is likely that to support a "prime clone from CDN", it
would need a lot more than just "these are the heads and the pack
data is over there", so this may not be sufficient.
Will discard.
* jc/diff-b-m (2015-02-23) 5 commits
. WIPWIP
. WIP: diff-b-m
- diffcore-rename: allow easier debugging
- diffcore-rename.c: add locate_rename_src()
- diffcore-break: allow debugging
"git diff -B -M" produced incorrect patch when the postimage of a
completely rewritten file is similar to the preimage of a removed
file; such a resulting file must not be expressed as a rename from
other place.
The fix in this patch is broken, unfortunately.
Will discard.
--------------------------------------------------
[Cooking]
* nd/worktree-move (2017-01-09) 6 commits
- worktree remove: new command
- worktree move: refuse to move worktrees with submodules
- worktree move: accept destination as directory
- worktree move: new command
- worktree.c: add update_worktree_location()
- worktree.c: add validate_worktree()
"git worktree" learned move and remove subcommands.
* dt/disable-bitmap-in-auto-gc (2016-12-29) 2 commits
(merged to 'next' on 2017-01-10 at 9f4e89e15d)
+ repack: die on incremental + write-bitmap-index
+ auto gc: don't write bitmaps for incremental repacks
It is natural that "git gc --auto" may not attempt to pack
everything into a single pack, so there is no point in warning about
that when the user has configured the system to use the pack bitmap;
the warning led to further "gc" being disabled.
Will merge to 'master'.
* js/mingw-test-push-unc-path (2017-01-07) 1 commit
(merged to 'next' on 2017-01-10 at 249d9f26f3)
+ mingw: add a regression test for pushing to UNC paths
"git push \\server\share\dir" recently regressed and was then
fixed. A test has retroactively been added for this breakage.
Will merge to 'master'.
* nd/log-graph-configurable-colors (2017-01-08) 1 commit
- log --graph: customize the graph lines with config log.graphColors
Some people find the default set of colors used by "git log --graph"
rather limiting. A mechanism to customize the set of colors has
been introduced.
Waiting for review comments to be addressed.
cf. <20170109103258.25341-1-pclouds@gmail.com>
* sb/submodule-rm-absorb (2016-12-27) 4 commits
(merged to 'next' on 2017-01-10 at 1fc2000a92)
+ rm: absorb a submodules git dir before deletion
+ submodule: rename and add flags to ok_to_remove_submodule
+ submodule: modernize ok_to_remove_submodule to use argv_array
+ submodule.h: add extern keyword to functions
"git rm" used to refuse to remove a submodule when it has its own
git repository embedded in its working tree. It learned to move
the repository away to $GIT_DIR/modules/ of the superproject
instead, and allow the submodule to be deleted (as long as there
will be no loss of local modifications, that is).
Will merge to 'master'.
* cc/split-index-config (2016-12-26) 21 commits
- Documentation/git-update-index: explain splitIndex.*
- Documentation/config: add splitIndex.sharedIndexExpire
- read-cache: use freshen_shared_index() in read_index_from()
- read-cache: refactor read_index_from()
- t1700: test shared index file expiration
- read-cache: unlink old sharedindex files
- config: add git_config_get_expiry() from gc.c
- read-cache: touch shared index files when used
- sha1_file: make check_and_freshen_file() non static
- Documentation/config: add splitIndex.maxPercentChange
- t1700: add tests for splitIndex.maxPercentChange
- read-cache: regenerate shared index if necessary
- config: add git_config_get_max_percent_split_change()
- Documentation/git-update-index: talk about core.splitIndex config var
- Documentation/config: add information for core.splitIndex
- t1700: add tests for core.splitIndex
- update-index: warn in case of split-index incoherency
- read-cache: add and then use tweak_split_index()
- split-index: add {add,remove}_split_index() functions
- config: add git_config_get_split_index()
- config: mark an error message up for translation
The experimental "split index" feature has gained a few
configuration variables to make it easier to use.
Waiting for review comments to be addressed.
cf. <20161226102222.17150-1-chriscool@tuxfamily.org>
cf. <a1a44640-ff6c-2294-72ac-46322eff8505@ramsayjones.plus.com>
* bw/push-submodule-only (2016-12-20) 3 commits
- push: add option to push only submodules
- submodules: add RECURSE_SUBMODULES_ONLY value
- transport: reformat flag #defines to be more readable
"git push" learned the "--recurse-submodules=only" option to
push submodules out without pushing the top-level superproject.
* ls/p4-path-encoding (2016-12-18) 1 commit
- git-p4: fix git-p4.pathEncoding for removed files
When "git p4" imported a changelist that removes paths, it failed to
convert pathnames if p4 used an encoding different from the one
used on the Git side. This has been corrected.
Will be rerolled.
cf. <7E1C7387-4F37-423F-803D-3B5690B49D40@gmail.com>
* bw/pathspec-cleanup (2017-01-08) 16 commits
(merged to 'next' on 2017-01-10 at 79291ff506)
+ pathspec: rename prefix_pathspec to init_pathspec_item
+ pathspec: small readability changes
+ pathspec: create strip submodule slash helpers
+ pathspec: create parse_element_magic helper
+ pathspec: create parse_long_magic function
+ pathspec: create parse_short_magic function
+ pathspec: factor global magic into its own function
+ pathspec: simpler logic to prefix original pathspec elements
+ pathspec: always show mnemonic and name in unsupported_magic
+ pathspec: remove unused variable from unsupported_magic
+ pathspec: copy and free owned memory
+ pathspec: remove the deprecated get_pathspec function
+ ls-tree: convert show_recursive to use the pathspec struct interface
+ dir: convert fill_directory to use the pathspec struct interface
+ dir: remove struct path_simplify
+ mv: remove use of deprecated 'get_pathspec()'
(this branch is used by sb/pathspec-errors.)
Code clean-up in the pathspec API.
Will merge to 'master'.
* js/prepare-sequencer-more (2017-01-09) 38 commits
- sequencer (rebase -i): write out the final message
- sequencer (rebase -i): write the progress into files
- sequencer (rebase -i): show the progress
- sequencer (rebase -i): suggest --edit-todo upon unknown command
- sequencer (rebase -i): show only failed cherry-picks' output
- sequencer (rebase -i): show only failed `git commit`'s output
- sequencer: use run_command() directly
- sequencer: make reading author-script more elegant
- sequencer (rebase -i): differentiate between comments and 'noop'
- sequencer (rebase -i): implement the 'drop' command
- sequencer (rebase -i): allow rescheduling commands
- sequencer (rebase -i): respect strategy/strategy_opts settings
- sequencer (rebase -i): respect the rebase.autostash setting
- sequencer (rebase -i): run the post-rewrite hook, if needed
- sequencer (rebase -i): record interrupted commits in rewritten, too
- sequencer (rebase -i): copy commit notes at end
- sequencer (rebase -i): set the reflog message consistently
- sequencer (rebase -i): refactor setting the reflog message
- sequencer (rebase -i): allow fast-forwarding for edit/reword
- sequencer (rebase -i): implement the 'reword' command
- sequencer (rebase -i): leave a patch upon error
- sequencer (rebase -i): update refs after a successful rebase
- sequencer (rebase -i): the todo can be empty when continuing
- sequencer (rebase -i): skip some revert/cherry-pick specific code path
- sequencer (rebase -i): remove CHERRY_PICK_HEAD when no longer needed
- sequencer (rebase -i): allow continuing with staged changes
- sequencer (rebase -i): write an author-script file
- sequencer (rebase -i): implement the short commands
- sequencer (rebase -i): add support for the 'fixup' and 'squash' commands
- sequencer (rebase -i): write the 'done' file
- sequencer (rebase -i): learn about the 'verbose' mode
- sequencer (rebase -i): implement the 'exec' command
- sequencer (rebase -i): implement the 'edit' command
- sequencer (rebase -i): implement the 'noop' command
- sequencer: support a new action: 'interactive rebase'
- sequencer: use a helper to find the commit message
- sequencer: move "else" keyword onto the same line as preceding brace
- sequencer: avoid unnecessary curly braces
The sequencer has further been extended in preparation to act as a
back-end for "rebase -i".
Waiting for review comments to be addressed.
* bw/realpath-wo-chdir (2017-01-09) 7 commits
(merged to 'next' on 2017-01-10 at ed315a40c8)
+ real_path: set errno when max number of symlinks is exceeded
+ real_path: prevent redefinition of MAXSYMLINKS
(merged to 'next' on 2016-12-22 at fea8fa870f)
+ real_path: canonicalize directory separators in root parts
+ real_path: have callers use real_pathdup and strbuf_realpath
+ real_path: create real_pathdup
+ real_path: convert real_path_internal to strbuf_realpath
+ real_path: resolve symlinks by hand
(this branch is tangled with bw/grep-recurse-submodules.)
The implementation of "real_path()" was to go there with chdir(2)
and call getcwd(3), but this obviously wouldn't be usable in a
threaded environment. Rewrite it to manually resolve relative
paths including symbolic links in path components.
Will merge to 'master'.
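For illustration only, resolving a path "by hand" can be sketched in Python as below. This is not the strbuf_realpath() code from the series; resolve() and the MAX_SYMLINKS value are assumptions made for the sketch, and it only handles POSIX-style paths.

```python
import os
import tempfile

MAX_SYMLINKS = 32   # assumed limit for this sketch

def resolve(path):
    """Resolve `path` component by component, following symlinks by hand
    instead of chdir()-ing into the directory and calling getcwd()."""
    if not os.path.isabs(path):
        path = os.path.join(os.getcwd(), path)
    remaining = [c for c in path.split(os.sep) if c]
    resolved = os.sep
    links_followed = 0
    while remaining:
        component = remaining.pop(0)
        if component == ".":
            continue
        if component == "..":
            resolved = os.path.dirname(resolved)
            continue
        candidate = os.path.join(resolved, component)
        if os.path.islink(candidate):
            links_followed += 1
            if links_followed > MAX_SYMLINKS:
                raise OSError("too many levels of symbolic links")
            target = os.readlink(candidate)
            if os.path.isabs(target):
                resolved = os.sep             # absolute target: restart from the root
            # Queue the target's components ahead of what is left to resolve.
            remaining = [c for c in target.split(os.sep) if c] + remaining
        else:
            resolved = candidate
    return resolved

# Throwaway layout: "link" is a symlink to the directory "real".
base = tempfile.mkdtemp()
os.mkdir(os.path.join(base, "real"))
os.symlink("real", os.path.join(base, "link"))
cwd_before = os.getcwd()
got = resolve(os.path.join(base, "link", "..", "link", "file.txt"))
```

Because the process never changes its working directory, a resolver of this shape does not disturb other threads, which is the property the topic description is after.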
* js/difftool-builtin (2017-01-09) 5 commits
- t7800: run both builtin and scripted difftool, for now
- difftool: implement the functionality in the builtin
- difftool: add a skeleton for the upcoming builtin
- git_exec_path: do not return the result of getenv()
- git_exec_path: avoid Coverity warning about unfree()d result
Rewrite a scripted porcelain "git difftool" in C.
Expecting a reroll.
cf. <alpine.DEB.2.20.1701091228460.3469@virtualbox>
* sb/push-make-submodule-check-the-default (2016-11-29) 2 commits
(merged to 'next' on 2016-12-12 at 1863e05af5)
+ push: change submodule default to check when submodules exist
+ submodule add: extend force flag to add existing repos
Turn the default of "push.recurseSubmodules" to "check" when
submodules seem to be in use.
Will cook in 'next'.
* kn/ref-filter-branch-list (2017-01-10) 21 commits
- SQUASH???
- branch: implement '--format' option
- branch: use ref-filter printing APIs
- branch, tag: use porcelain output
- ref-filter: allow porcelain to translate messages in the output
- ref-filter: add an 'rstrip=<N>' option to atoms which deal with refnames
- ref-filter: modify the 'lstrip=<N>' option to work with negative '<N>'
- ref-filter: Do not abruptly die when using the 'lstrip=<N>' option
- ref-filter: rename the 'strip' option to 'lstrip'
- ref-filter: make remote_ref_atom_parser() use refname_atom_parser_internal()
- ref-filter: introduce refname_atom_parser()
- ref-filter: introduce refname_atom_parser_internal()
- ref-filter: make "%(symref)" atom work with the ':short' modifier
- ref-filter: add support for %(upstream:track,nobracket)
- ref-filter: make %(upstream:track) prints "[gone]" for invalid upstreams
- ref-filter: introduce format_ref_array_item()
- ref-filter: move get_head_description() from branch.c
- ref-filter: modify "%(objectname:short)" to take length
- ref-filter: implement %(if:equals=<string>) and %(if:notequals=<string>)
- ref-filter: include reference to 'used_atom' within 'atom_value'
- ref-filter: implement %(if), %(then), and %(else) atoms
The code to list branches in "git branch" has been consolidated
with the more generic ref-filter API.
I think this is almost ready. Will wait for a few days, squash
fixes in if needed and merge to 'next'.
* jk/no-looking-at-dotgit-outside-repo-final (2016-10-26) 1 commit
(merged to 'next' on 2016-12-05 at 0c77e39cd5)
+ setup_git_env: avoid blind fall-back to ".git"
Originally merged to 'next' on 2016-10-26
This is the endgame of the topic to avoid blindly falling back to
".git" when the setup sequence said we are _not_ in a Git repository.
A corner case that happens to work right now may be broken by a
call to die("BUG").
Will cook in 'next'.
* pb/bisect (2016-10-18) 27 commits
- bisect--helper: remove the dequote in bisect_start()
- bisect--helper: retire `--bisect-auto-next` subcommand
- bisect--helper: retire `--bisect-autostart` subcommand
- bisect--helper: retire `--bisect-write` subcommand
- bisect--helper: `bisect_replay` shell function in C
- bisect--helper: `bisect_log` shell function in C
- bisect--helper: retire `--write-terms` subcommand
- bisect--helper: retire `--check-expected-revs` subcommand
- bisect--helper: `bisect_state` & `bisect_head` shell function in C
- bisect--helper: `bisect_autostart` shell function in C
- bisect--helper: retire `--next-all` subcommand
- bisect--helper: retire `--bisect-clean-state` subcommand
- bisect--helper: `bisect_next` and `bisect_auto_next` shell function in C
- t6030: no cleanup with bad merge base
- bisect--helper: `bisect_start` shell function partially in C
- bisect--helper: `get_terms` & `bisect_terms` shell function in C
- bisect--helper: `bisect_next_check` & bisect_voc shell function in C
- bisect--helper: `check_and_set_terms` shell function in C
- bisect--helper: `bisect_write` shell function in C
- bisect--helper: `is_expected_rev` & `check_expected_revs` shell function in C
- bisect--helper: `bisect_reset` shell function in C
- wrapper: move is_empty_file() and rename it as is_empty_or_missing_file()
- t6030: explicitly test for bisection cleanup
- bisect--helper: `bisect_clean_state` shell function in C
- bisect--helper: `write_terms` shell function in C
- bisect: rewrite `check_term_format` shell function in C
- bisect--helper: use OPT_CMDMODE instead of OPT_BOOL
Move more parts of "git bisect" to C.
Expecting a reroll.
cf. <CAFZEwPPXPPHi8KiEGS9ggzNHDCGhuqMgH9Z8-Pf9GLshg8+LPA@mail.gmail.com>
cf. <CAFZEwPM9RSTGN54dzaw9gO9iZmsYjJ_d1SjUD4EzSDDbmh-XuA@mail.gmail.com>
* st/verify-tag (2016-10-10) 7 commits
- t/t7004-tag: Add --format specifier tests
- t/t7030-verify-tag: Add --format specifier tests
- builtin/tag: add --format argument for tag -v
- builtin/verify-tag: add --format to verify-tag
- tag: add format specifier to gpg_verify_tag
- ref-filter: add function to print single ref_array_item
- gpg-interface, tag: add GPG_VERIFY_QUIET flag
"git tag" and "git verify-tag" learned to put GPG verification
status in their "--format=<placeholders>" output format.
Waiting for a reroll.
cf. <20161007210721.20437-1-santiago@nyu.edu>
* sb/attr (2016-11-11) 35 commits
. completion: clone can initialize specific submodules
. clone: add --init-submodule=<pathspec> switch
. submodule update: add `--init-default-path` switch
. pathspec: allow escaped query values
. pathspec: allow querying for attributes
. pathspec: move prefix check out of the inner loop
. pathspec: move long magic parsing out of prefix_pathspec
- Documentation: fix a typo
- attr: keep attr stack for each check
- attr: convert to new threadsafe API
- attr: make git_check_attr_counted static
- attr.c: outline the future plans by heavily commenting
- attr.c: always pass check[] to collect_some_attrs()
- attr.c: introduce empty_attr_check_elems()
- attr.c: correct ugly hack for git_all_attrs()
- attr.c: rename a local variable check
- attr.c: pass struct git_attr_check down the callchain
- attr.c: add push_stack() helper
- attr: support quoting pathname patterns in C style
- attr: expose validity check for attribute names
- attr: add counted string version of git_check_attr()
- attr: retire git_check_attrs() API
- attr: convert git_check_attrs() callers to use the new API
- attr: convert git_all_attrs() to use "struct git_attr_check"
- attr: (re)introduce git_check_attr() and struct git_attr_check
- attr: rename function and struct related to checking attributes
- attr.c: plug small leak in parse_attr_line()
- attr.c: tighten constness around "git_attr" structure
- attr.c: simplify macroexpand_one()
- attr.c: mark where #if DEBUG ends more clearly
- attr.c: complete a sentence in a comment
- attr.c: explain the lack of attr-name syntax check in parse_attr()
- attr.c: update a stale comment on "struct match_attr"
- attr.c: use strchrnul() to scan for one line
- commit.c: use strchrnul() to scan for one line
The attributes API has been updated so that it can later be
optimized using the knowledge of which attributes are queried.
Building on top of the updated API, the pathspec machinery learned
to select only paths with given attributes set.
The parts near the tip about pathspec would need to work better
with bw/pathspec-cleanup topic and have been dropped for now.
* sg/fix-versioncmp-with-common-suffix (2016-12-08) 8 commits
- versioncmp: generalize version sort suffix reordering
- squash! versioncmp: use earliest-longest contained suffix to determine sorting order
- versioncmp: use earliest-longest contained suffix to determine sorting order
- versioncmp: cope with common part overlapping with prerelease suffix
- versioncmp: pass full tagnames to swap_prereleases()
- t7004-tag: add version sort tests to show prerelease reordering issues
- t7004-tag: use test_config helper
- t7004-tag: delete unnecessary tags with test_when_finished
The prereleaseSuffix feature of version comparison that is used in
"git tag -l" did not work correctly when two or more prereleases for the
same release were present (e.g. when 2.0, 2.0-beta1, and 2.0-beta2
are there and the code needs to compare 2.0-beta1 and 2.0-beta2).
Will merge to 'next' after squashing.
cf. <20161208142401.1329-1-szeder.dev@gmail.com>
* jc/merge-drop-old-syntax (2015-04-29) 1 commit
(merged to 'next' on 2016-12-05 at 041946dae0)
+ merge: drop 'git merge <message> HEAD <commit>' syntax
Originally merged to 'next' on 2016-10-11
Stop supporting "git merge <message> HEAD <commit>" syntax that has
been deprecated since October 2007, and has issued a deprecation
warning message since v2.5.0.
Will cook in 'next'.
--------------------------------------------------
[Discarded]
* rs/unpack-trees-reduce-file-scope-global (2016-12-31) 1 commit
. unpack-trees: move checkout state into check_updates
Code cleanup.
Superseded by sb/unpack-trees-cleanup
* jc/reset-unmerge (2016-10-24) 1 commit
. reset: --unmerge
After "git add" is run prematurely during a conflict resolution,
"git diff" can no longer be used as a way to sanity check by
looking at the combined diff. "git reset" learned a new
"--unmerge" option to recover from this situation.
This may not be needed, given that update-index has a similar
option.
* jc/merge-base-fp-only (2016-10-19) 8 commits
. merge-base: fp experiment
. merge: allow to use only the fp-only merge bases
. merge-base: limit the output to bases that are on first-parent chain
. merge-base: mark bases that are on first-parent chain
. merge-base: expose get_merge_bases_many_0() a bit more
. merge-base: stop moving commits around in remove_redundant()
. sha1_name: remove ONELINE_SEEN bit
. commit: simplify fastpath of merge-base
An experiment of merge-base that ignores common ancestors that are
not on the first parent chain.
The whole premise feels wrong.
* tb/convert-stream-check (2016-10-27) 2 commits
. convert.c: stream and fast search for binary
. read-cache: factor out get_sha1_from_index() helper
End-of-line conversion sometimes needs to see if the current blob
in the index has NULs and CRs to base its decision. We used to
always get full statistics over the blob, but in many cases we
can return early when we have seen "enough" (e.g. if we see a
single NUL, the blob will be handled as binary). The codepaths
have been optimized by using streaming interface.
Retracted.
cf. <20161102071646.GA5094@tb-raspi>
* mh/connect (2016-06-06) 10 commits
. connect: [host:port] is legacy for ssh
. connect: move ssh command line preparation to a separate function
. connect: actively reject git:// urls with a user part
. connect: change the --diag-url output to separate user and host
. connect: make parse_connect_url() return the user part of the url as a separate value
. connect: group CONNECT_DIAG_URL handling code
. connect: make parse_connect_url() return separated host and port
. connect: re-derive a host:port string from the separate host and port variables
. connect: call get_host_and_port() earlier
. connect: document why we sometimes call get_port after get_host_and_port
Rewrite Git-URL parsing routine (hopefully) without changing any
behaviour.
It has been months without any support.
* ec/annotate-deleted (2015-11-20) 1 commit
. annotate: skip checking working tree if a revision is provided
Usability fix for annotate-specific "<file> <rev>" syntax with deleted
files.
Has been waiting for a review for too long without seeing anything.
* dk/gc-more-wo-pack (2016-01-13) 4 commits
. gc: clean garbage .bitmap files from pack dir
. t5304: ensure non-garbage files are not deleted
. t5304: test .bitmap garbage files
. prepare_packed_git(): find more garbage
Follow-on to dk/gc-idx-wo-pack topic, to clean up stale
.bitmap and .keep files.
Has been waiting for a reroll for too long.
cf. <xmqq60ypbeng.fsf@gitster.mtv.corp.google.com>
^ permalink raw reply
* git cat-file on a submodule
From: David Turner @ 2017-01-11 0:11 UTC (permalink / raw)
To: git
Why does git cat-file -t $sha:foo, where foo is a submodule, not work?
git rev-parse $sha:foo works.
By "why", I mean "would anyone complain if I fixed it?" FWIW, I think
-p should just return the submodule's sha.
^ permalink raw reply
* Re: git cat-file on a submodule
From: Stefan Beller @ 2017-01-11 0:25 UTC (permalink / raw)
To: David Turner; +Cc: git@vger.kernel.org
In-Reply-To: <1484093500.17967.6.camel@frank>
On Tue, Jan 10, 2017 at 4:11 PM, David Turner <novalis@novalis.org> wrote:
> Why does git cat-file -t $sha:foo, where foo is a submodule, not work?
>
> git rev-parse $sha:foo works.
>
> By "why", I mean "would anyone complain if I fixed it?"
$ git log -- builtin/cat-file.c |grep -i -e gitlink -e submodule
$ # no result
I think nobody cared so far. Go for it!
> FWIW, I think
> -p should just return the submodule's sha.
That sounds right as the sha1 is also printed for the tree already, i.e.
in Gerrit you can get the submodules via
$ git cat-file -p HEAD:plugins/
100644 blob c6bb7f182440d6ab860bbcfadc9901b0d94d1ee3 BUCK
160000 commit 9b163e113de9f3a49219a02d388f7f46ea2559d3 commit-message-length-validator
160000 commit 69b8f9f413ce83a71593a4068a3b8e81f684cbad cookbook-plugin
160000 commit 7b41f3a413b46140b050ae5324cbbcdd467d2b3a download-commands
160000 commit 3acc14d10d26678eae6489038fe0d4dad644a9b4 hooks
160000 commit c5123d6a5604cc740d6f42485235c0d3ec141c4e replication
160000 commit 3f3d572e9618f268b19cc54856deee4c96180e4c reviewnotes
160000 commit 3ca1167edda713f4bfdcecd9c0e2626797d7027f singleusergroup
"commit <sha1>" is the correct answer already :)
Thanks,
Stefan
^ permalink raw reply
* Re: What's cooking in git.git (Jan 2017, #01; Tue, 10)
From: Stefan Beller @ 2017-01-11 0:55 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git@vger.kernel.org
In-Reply-To: <xmqqd1fupjbs.fsf@gitster.mtv.corp.google.com>
On Tue, Jan 10, 2017 at 3:48 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Here are the topics that have been cooking.
These two are not included:
A bug fix (regression from rewriting submodule stuff in C)
http://public-inbox.org/git/20170107001953.3196-1-sbeller@google.com/
And another cleanup series
http://public-inbox.org/git/20161227193605.12413-1-sbeller@google.com
I just assume you're still back-logged due to your travel around new year.
Thanks,
Stefan
^ permalink raw reply
* Re: [PATCH 2/2] diff: document the pattern format for diff.orderFile
From: Richard Hansen @ 2017-01-11 1:14 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
In-Reply-To: <xmqq8tqismdx.fsf@gitster.mtv.corp.google.com>
On 2017-01-10 15:14, Junio C Hamano wrote:
> Richard Hansen <hansenr@google.com> writes:
>
>> Document the format of the patterns used for the diff.orderFile
>> setting and diff's '-O' option by referring the reader to the
>> gitignore[5] page.
>>
>> Signed-off-by: Richard Hansen <hansenr@google.com>
>> ---
>> Documentation/diff-config.txt | 3 ++-
>> Documentation/diff-options.txt | 3 ++-
>> 2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
>> index 875212045..a35ecdd6b 100644
>> --- a/Documentation/diff-config.txt
>> +++ b/Documentation/diff-config.txt
>> @@ -100,7 +100,8 @@ diff.noprefix::
>>
>> diff.orderFile::
>> File indicating how to order files within a diff, using
>> - one shell glob pattern per line.
>> + one glob pattern per line.
>> + See linkgit:gitignore[5] for the pattern format.
>
>
> I do not think it is wise to suggest referring to gitignore, as the
> logic of matching is quite different, other than the fact that they
> both use wildmatch() internally. Also, unlike gitignore, orderfile
> does not allow any negative matching i.e. "!<pattern>".
I was looking at the code to see how the two file formats differed and
noticed that match_order() doesn't set the WM_PATHNAME flag when it
calls wildmatch(). That's unintentional (a bug), right?
-Richard
>
>> If `diff.orderFile` is a relative pathname, it is treated as
>> relative to the top of the work tree.
>> Can be overridden by the '-O' option to linkgit:git-diff[1].
>> diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
>> index e6215c372..dc6b1af71 100644
>> --- a/Documentation/diff-options.txt
>> +++ b/Documentation/diff-options.txt
>> @@ -467,7 +467,8 @@ endif::git-format-patch[]
>>
>> -O<orderfile>::
>> Output the patch in the order specified in the
>> - <orderfile>, which has one shell glob pattern per line.
>> + <orderfile>, which has one glob pattern per line.
>> + See linkgit:gitignore[5] for the pattern format.
>> This overrides the `diff.orderFile` configuration variable
>> (see linkgit:git-config[1]). To cancel `diff.orderFile`,
>> use `-O/dev/null`.
^ permalink raw reply
* [PATCH v2 0/2] diff orderfile documentation improvements
From: Richard Hansen @ 2017-01-11 1:57 UTC (permalink / raw)
To: git; +Cc: gitster
In-Reply-To: <20170110004031.57985-1-hansenr@google.com>
Changes from v1:
* Don't reference gitignore for the file format because they're not
quite the same.
Richard Hansen (2):
diff: document behavior of relative diff.orderFile
diff: document the format of the -O (diff.orderFile) file
Documentation/diff-config.txt | 7 +++---
Documentation/diff-options.txt | 54 ++++++++++++++++++++++++++++++++++++++++--
2 files changed, 56 insertions(+), 5 deletions(-)
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply
* [PATCH v2 1/2] diff: document behavior of relative diff.orderFile
From: Richard Hansen @ 2017-01-11 1:57 UTC (permalink / raw)
To: git; +Cc: gitster
In-Reply-To: <20170111015720.111223-1-hansenr@google.com>
Document that a relative pathname for diff.orderFile is interpreted as
relative to the top-level work directory.
Signed-off-by: Richard Hansen <hansenr@google.com>
---
Documentation/diff-config.txt | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 58f4bd6af..875212045 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -101,6 +101,8 @@ diff.noprefix::
diff.orderFile::
File indicating how to order files within a diff, using
one shell glob pattern per line.
+ If `diff.orderFile` is a relative pathname, it is treated as
+ relative to the top of the work tree.
Can be overridden by the '-O' option to linkgit:git-diff[1].
diff.renameLimit::
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply related
* [PATCH v2 2/2] diff: document the format of the -O (diff.orderFile) file
From: Richard Hansen @ 2017-01-11 1:57 UTC (permalink / raw)
To: git; +Cc: gitster
In-Reply-To: <20170111015720.111223-1-hansenr@google.com>
Signed-off-by: Richard Hansen <hansenr@google.com>
---
Documentation/diff-config.txt | 5 ++--
Documentation/diff-options.txt | 54 ++++++++++++++++++++++++++++++++++++++++--
2 files changed, 54 insertions(+), 5 deletions(-)
diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 875212045..9e4111320 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -99,11 +99,10 @@ diff.noprefix::
If set, 'git diff' does not show any source or destination prefix.
diff.orderFile::
- File indicating how to order files within a diff, using
- one shell glob pattern per line.
+ File indicating how to order files within a diff.
+ See the '-O' option to linkgit:git-diff[1] for details.
If `diff.orderFile` is a relative pathname, it is treated as
relative to the top of the work tree.
- Can be overridden by the '-O' option to linkgit:git-diff[1].
diff.renameLimit::
The number of files to consider when performing the copy/rename
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index e6215c372..e57e9f810 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -466,11 +466,61 @@ information.
endif::git-format-patch[]
-O<orderfile>::
- Output the patch in the order specified in the
- <orderfile>, which has one shell glob pattern per line.
+ Control the order in which files appear in the output.
This overrides the `diff.orderFile` configuration variable
(see linkgit:git-config[1]). To cancel `diff.orderFile`,
use `-O/dev/null`.
++
+The output order is determined by the order of glob patterns in
+<orderfile>.
+All files with pathnames that match the first pattern are output
+first, all files with pathnames that match the second pattern (but not
+the first) are output next, and so on.
+All files with pathnames that do not match any pattern are output
+last, as if there was an implicit match-all pattern at the end of the
+file.
+If multiple pathnames have the same rank, their output order relative
+to each other is the normal order.
++
+<orderfile> is parsed as follows:
++
+--
+ - Blank lines are ignored, so they can be used as separators for
+ readability.
+
+ - Lines starting with a hash ("`#`") are ignored, so they can be used
+ for comments. Add a backslash ("`\`") to the beginning of the
+ pattern if it starts with a hash.
+
+ - Each other line contains a single pattern.
+--
++
+Patterns have the same syntax and semantics as patterns used for
+fnmatch(3) with the FNM_PATHNAME flag, except multiple consecutive
+unescaped asterisks (e.g., "`**`") have a special meaning:
++
+--
+ - A pattern beginning with "`**/`" means match in all directories.
+ For example, "`**/foo`" matches filename "`foo`" anywhere, and
+ "`**/foo/bar`" matches filename "`bar`" anywhere that is directly
+ under directory "`foo`".
+
+ - A pattern ending with "`/**`" matches everything inside a
+ directory, with infinite depth. For example, "`abc/**`" matches
+ "`abc/def/ghi`" but not "`foo/abc/def`".
+
+ - A slash followed by two consecutive asterisks then a slash
+ ("`/**/`") matches zero or more directory components. For example,
+ "`a/**/b`" matches "`a/b`", "`a/x/b`", "`a/x/y/b`" and so on.
+
+ - A pattern with more than one consecutive unescaped asterisk is
+ invalid.
+--
++
+In addition, a pathname matches a pattern if the pathname with any
+number of its final pathname components removed matches the pattern.
+For example, the pattern "`foo/*bar`" matches "`foo/asdfbar`" and
+"`foo/bar/baz`" but not "`foo/barx`".
ifndef::git-format-patch[]
-R::
--
2.11.0.390.gc69c2f50cf-goog
^ permalink raw reply related
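The ranking rule described in the proposed -O documentation above can be sketched in a few lines of C. This is only an illustration of the documented wording, not Git's match_order(): the helper name order_rank() is hypothetical, it uses plain fnmatch(3) with FNM_PATHNAME as the text says, and it does not implement the special "`**`" forms. Whether Git's own code actually matches with pathname semantics is questioned elsewhere in this thread, so treat that flag as an assumption.

```c
#include <fnmatch.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's code: return the index of the first
 * pattern that matches `path` or `path` with some of its final
 * components removed. Return `npatterns` when nothing matches, which
 * models the implicit match-all pattern at the end of the file.
 */
static int order_rank(const char **patterns, int npatterns, const char *path)
{
	char *buf = malloc(strlen(path) + 1);
	int i, rank = npatterns;

	if (!buf)
		return npatterns;
	for (i = 0; i < npatterns && rank == npatterns; i++) {
		/* Work on a copy so trailing components can be cut off. */
		strcpy(buf, path);
		while (buf[0]) {
			char *slash;

			if (!fnmatch(patterns[i], buf, FNM_PATHNAME)) {
				rank = i;
				break;
			}
			slash = strrchr(buf, '/');
			if (!slash)
				break;
			*slash = '\0'; /* drop the final component, retry */
		}
	}
	free(buf);
	return rank;
}
```

Sorting the changed paths by this rank with a stable sort would keep the normal order among paths of equal rank, as the documentation text requires.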
* Re: [PATCH 2/2] diff: document the pattern format for diff.orderFile
From: Junio C Hamano @ 2017-01-11 2:46 UTC (permalink / raw)
To: Richard Hansen; +Cc: git
In-Reply-To: <17d48ccd-fd19-3922-8ee8-af6558d22632@google.com>
Richard Hansen <hansenr@google.com> writes:
> I was looking at the code to see how the two file formats differed and
> noticed that match_order() doesn't set the WM_PATHNAME flag when it
> calls wildmatch(). That's unintentional (a bug), right?
It has been that way from day one IIRC even before we introduced
wildmatch()---IOW it may be intentional that the current code that
uses wildmatch() does not use WM_PATHNAME.
^ permalink raw reply
* [PATCHv2 2/2] builtin/commit.c: drop use snprintf via dynamic allocation
From: Elia Pinto @ 2017-01-11 7:10 UTC (permalink / raw)
To: git; +Cc: Elia Pinto
In-Reply-To: <20170111071032.27797-1-gitter.spiros@gmail.com>
In general snprintf is bad because it may silently truncate results
if we're wrong. In this patch, instead of using xnprintf, which asserts
that we don't truncate, we are switching to dynamic allocation, so we can
avoid dealing with magic numbers in the code.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
---
This is the second version of the patch.
I have split the original commit in two, as discussed here
http://public-inbox.org/git/20161213132717.42965-1-gitter.spiros@gmail.com/.
builtin/commit.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 09bcc0f13..37228330c 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1526,12 +1526,10 @@ static int git_commit_config(const char *k, const char *v, void *cb)
static int run_rewrite_hook(const unsigned char *oldsha1,
const unsigned char *newsha1)
{
- /* oldsha1 SP newsha1 LF NUL */
- static char buf[2*40 + 3];
+ char *buf;
struct child_process proc = CHILD_PROCESS_INIT;
const char *argv[3];
int code;
- size_t n;
argv[0] = find_hook("post-rewrite");
if (!argv[0])
@@ -1547,11 +1545,11 @@ static int run_rewrite_hook(const unsigned char *oldsha1,
code = start_command(&proc);
if (code)
return code;
- n = snprintf(buf, sizeof(buf), "%s %s\n",
- sha1_to_hex(oldsha1), sha1_to_hex(newsha1));
+ buf = xstrfmt("%s %s\n", sha1_to_hex(oldsha1), sha1_to_hex(newsha1));
sigchain_push(SIGPIPE, SIG_IGN);
- write_in_full(proc.in, buf, n);
+ write_in_full(proc.in, buf, strlen(buf));
close(proc.in);
+ free(buf);
sigchain_pop(SIGPIPE);
return finish_command(&proc);
}
--
2.11.0.154.g5f5f154
^ permalink raw reply related
* [PATCHv2 1/2] builtin/commit.c: drop use snprintf via dynamic allocation
From: Elia Pinto @ 2017-01-11 7:10 UTC (permalink / raw)
To: git; +Cc: Elia Pinto
In general snprintf is bad because it may silently truncate results if we're
wrong. In this patch where we use PATH_MAX, we'd want to handle larger
paths anyway, so we switch to dynamic allocation.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
---
This is the second version of the patch.
I have split the original commit in two, as discussed here
http://public-inbox.org/git/20161213132717.42965-1-gitter.spiros@gmail.com/.
builtin/commit.c | 22 +++++++++++-----------
1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 0ed634b26..09bcc0f13 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -960,15 +960,16 @@ static int prepare_to_commit(const char *index_file, const char *prefix,
return 0;
if (use_editor) {
- char index[PATH_MAX];
- const char *env[2] = { NULL };
- env[0] = index;
- snprintf(index, sizeof(index), "GIT_INDEX_FILE=%s", index_file);
- if (launch_editor(git_path_commit_editmsg(), NULL, env)) {
+ struct argv_array env = ARGV_ARRAY_INIT;
+
+ argv_array_pushf(&env, "GIT_INDEX_FILE=%s", index_file);
+ if (launch_editor(git_path_commit_editmsg(), NULL, env.argv)) {
fprintf(stderr,
_("Please supply the message using either -m or -F option.\n"));
+ argv_array_clear(&env);
exit(1);
}
+ argv_array_clear(&env);
}
if (!no_verify &&
@@ -1557,23 +1558,22 @@ static int run_rewrite_hook(const unsigned char *oldsha1,
int run_commit_hook(int editor_is_used, const char *index_file, const char *name, ...)
{
- const char *hook_env[3] = { NULL };
- char index[PATH_MAX];
+ struct argv_array hook_env = ARGV_ARRAY_INIT;
va_list args;
int ret;
- snprintf(index, sizeof(index), "GIT_INDEX_FILE=%s", index_file);
- hook_env[0] = index;
+ argv_array_pushf(&hook_env, "GIT_INDEX_FILE=%s", index_file);
/*
* Let the hook know that no editor will be launched.
*/
if (!editor_is_used)
- hook_env[1] = "GIT_EDITOR=:";
+ argv_array_push(&hook_env, "GIT_EDITOR=:");
va_start(args, name);
- ret = run_hook_ve(hook_env, name, args);
+ ret = run_hook_ve(hook_env.argv, name, args);
va_end(args);
+ argv_array_clear(&hook_env);
return ret;
}
--
2.11.0.154.g5f5f154
^ permalink raw reply related
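For readers unfamiliar with the truncation problem these two patches address, here is a minimal standalone sketch. It assumes nothing from Git: format_fixed() and format_alloc() are hypothetical names, and format_alloc() is only a rough stand-in for what a helper like xstrfmt() provides (the real helper dies on allocation failure instead of returning NULL).

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Fixed-buffer variant: returns 1 if "GIT_INDEX_FILE=<path>" did not
 * fit into dst and was silently cut short, 0 otherwise.
 */
static int format_fixed(char *dst, size_t size, const char *path)
{
	int n = snprintf(dst, size, "GIT_INDEX_FILE=%s", path);

	/* snprintf() reports the length it wanted, not what it wrote. */
	return n < 0 || (size_t)n >= size;
}

/*
 * Hypothetical dynamic variant: measure first, then allocate exactly
 * enough. Returns NULL on failure; the caller frees the result.
 */
static char *format_alloc(const char *fmt, ...)
{
	va_list ap, ap2;
	int len;
	char *buf;

	va_start(ap, fmt);
	va_copy(ap2, ap);
	len = vsnprintf(NULL, 0, fmt, ap); /* measure only */
	va_end(ap);
	if (len < 0) {
		va_end(ap2);
		return NULL;
	}
	buf = malloc((size_t)len + 1);
	if (buf)
		vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);
	return buf;
}
```

With a 16-byte buffer, format_fixed() leaves only "GIT_INDEX_FILE=" behind for any non-empty path, which is exactly the kind of silent damage the commit messages warn about. The dynamic variant has no magic size to get wrong.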
* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Lars Schneider @ 2017-01-11 9:43 UTC (permalink / raw)
To: Junio C Hamano
Cc: Git mailing list, Eric Wong, Jakub Narębski,
Torsten Bögershausen, Taylor Blau
In-Reply-To: <xmqqa8b115ll.fsf@gitster.mtv.corp.google.com>
> On 09 Jan 2017, at 00:42, Junio C Hamano <gitster@pobox.com> wrote:
>
> larsxschneider@gmail.com writes:
>
>> From: Lars Schneider <larsxschneider@gmail.com>
>>
>> Some `clean` / `smudge` filters might require a significant amount of
>> time to process a single blob. During this process the Git checkout
>> operation is blocked and Git needs to wait until the filter is done to
>> continue with the checkout.
>>
>> Teach the filter process protocol (introduced in edcc858) to accept the
>> status "delayed" as response to a filter request. Upon this response Git
>> continues with the checkout operation and asks the filter to process the
>> blob again after all other blobs have been processed.
>
> Hmm, I would have expected that the basic flow would become
>
> for each paths to be processed:
> convert-to-worktree to buf
> if not delayed:
> do the caller's thing to use buf
> else:
> remember path
>
> for each delayed paths:
> ensure filter process finished processing for path
> fetch the thing to buf from the process
> do the caller's thing to use buf
>
> and that would make quite a lot of sense. However, what is actually
> implemented is a bit disappointing from that point of view. While
> its first part is the same as above, the latter part instead does:
>
> for each delayed paths:
> checkout the path
>
> Presumably, checkout_entry() does the "ensure that the process is
> done converting" (otherwise the result is simply buggy), but what
> disappoints me is that this does not allow callers that call
> "convert-to-working-tree", whose interface is obtain the bytestream
> in-core in the working tree representation, given an object in the
> object-db representation in an in-core buffer, to _use_ the result
> of the conversion. The caller does not have a chance to even see
> the result as it is written straight to the filesystem, once it
> calls checkout_delayed_entries().
I am not sure I can follow you here. A caller of "convert_to_working_tree"
would indeed see the filtered result. Consider the following example. The
filter delays the conversion twice and responds with the filtered results
on the third call:
CALL: int convert_to_working_tree(*src=='CONTENT', *dst, *delayed==0)
RESPONSE: return == 1; *delayed == 1, *dst==''
CALL: int convert_to_working_tree(*src=='CONTENT', *dst, *delayed==0)
RESPONSE: return == 1; *delayed == 1, *dst==''
CALL: int convert_to_working_tree(*src=='CONTENT', *dst, *delayed==0)
RESPONSE: return == 1; *delayed == 0, *dst=='FILTERED_CONTENT'
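The call sequence above can be simulated with a small standalone sketch. Everything here is hypothetical: fake_convert() only mimics the shape of a convert_to_working_tree() call with a delayed flag, and struct fake_filter stands in for a filter process that answers "delayed" a fixed number of times.

```c
#include <stdio.h>

/* Hypothetical filter state: how many more times to answer "delayed". */
struct fake_filter {
	int delays_left;
};

/*
 * Hypothetical stand-in for the conversion call. Always returns 1
 * (success); sets *delayed to 1 and leaves dst empty while the filter
 * is not ready, otherwise fills dst with the filtered content.
 */
static int fake_convert(struct fake_filter *f, const char *src,
			char *dst, size_t dst_size, int *delayed)
{
	(void)src; /* a real filter would transform src */
	dst[0] = '\0';
	if (f->delays_left > 0) {
		f->delays_left--;
		*delayed = 1;
		return 1;
	}
	*delayed = 0;
	snprintf(dst, dst_size, "%s", "FILTERED_CONTENT");
	return 1;
}

/* Ask again until the filter stops delaying; returns the call count. */
static int convert_until_ready(struct fake_filter *f, const char *src,
			       char *dst, size_t dst_size)
{
	int calls = 0, delayed;

	do {
		delayed = 0;
		fake_convert(f, src, dst, dst_size, &delayed);
		calls++;
	} while (delayed);
	return calls;
}
```

A filter initialized with delays_left = 2 reproduces the three calls listed above. A real caller would process other paths between attempts rather than retry in a tight loop.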
I implemented the "checkout_delayed_entries" function in v1 because
it solved the problem with minimal changes in the existing code. Our previous
discussion made me think that this is the preferred way:
I do not think we want to see such a rewrite all over the
codepaths. It might be OK to add such a "these entries are known
to be delayed" list in struct checkout so that the above becomes
more like this:
for (i = 0; i < active_nr; i++)
checkout_entry(active_cache[i], state, NULL);
+ checkout_entry_finish(state);
That is, addition of a single "some of the checkout_entry() calls
done so far might have been lazy, and I'll give them a chance to
clean up" might be palatable. Anything more than that on the
caller side is not.
cf. http://public-inbox.org/git/xmqqvavotych.fsf@gitster.mtv.corp.google.com/
Thanks,
Lars
^ permalink raw reply
* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Lars Schneider @ 2017-01-11 9:48 UTC (permalink / raw)
To: Torsten Bögershausen
Cc: Git mailing list, Junio C Hamano, Eric Wong, Jakub Narębski,
Taylor Blau
In-Reply-To: <20170108201415.GA3569@tb-raspi>
> On 08 Jan 2017, at 21:14, Torsten Bögershausen <tboegi@web.de> wrote:
>
> On Sun, Jan 08, 2017 at 08:17:36PM +0100, larsxschneider@gmail.com wrote:
>> From: Lars Schneider <larsxschneider@gmail.com>
>>
>> Some `clean` / `smudge` filters might require a significant amount of
>> time to process a single blob. During this process the Git checkout
>> operation is blocked and Git needs to wait until the filter is done to
>> continue with the checkout.
>>
>> Teach the filter process protocol (introduced in edcc858) to accept the
>> status "delayed" as response to a filter request. Upon this response Git
>> continues with the checkout operation and asks the filter to process the
>> blob again after all other blobs have been processed.
>>
>> Git has multiple code paths that check out a blob. Support delayed
>> checkouts only in `clone` (in unpack-trees.c) and `checkout` operations.
>>
>> Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
>> ---
>>
>
> Some feeling tells me that it may be better to leave convert_to_working_tree() as it is.
> And change convert_to_working_tree_internal as suggested:
>
> int convert_to_working_tree(const char *path, const char *src, size_t len, struct strbuf *dst)
> {
> - return convert_to_working_tree_internal(path, src, len, dst, 0);
> + return convert_to_working_tree_internal(path, src, len, dst, NULL, 0);
> }
If I do this then I would have no way to communicate to the caller that the
processing is delayed. Consequently the caller would not know that an additional
call is necessary to fetch the result.
Thanks,
Lars
^ permalink raw reply
* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Lars Schneider @ 2017-01-11 9:51 UTC (permalink / raw)
To: Eric Wong; +Cc: git, gitster, jnareb
In-Reply-To: <20170108204517.GA13779@starla>
> On 08 Jan 2017, at 21:45, Eric Wong <e@80x24.org> wrote:
>
> larsxschneider@gmail.com wrote:
>> +++ b/t/t0021/rot13-filter.pl
>
>> +$DELAY{'test-delay1.r'} = 1;
>> +$DELAY{'test-delay3.r'} = 3;
>>
>> open my $debug, ">>", "rot13-filter.log" or die "cannot open log file: $!";
>>
>> @@ -166,6 +176,15 @@ while (1) {
>> packet_txt_write("status=abort");
>> packet_flush();
>> }
>> + elsif ( $command eq "smudge" and
>> + exists $DELAY{$pathname} and
>> + $DELAY{$pathname} gt 0 ) {
>
> Use '>' for numeric comparisons. 'gt' is for strings (man perlop)
Still learning Perl :-)
> Sidenote, staying <= 80 columns for the rest of the changes is
> strongly preferred, some of us need giant fonts. I think what
> Torsten said about introducing a new *_internal function can
> also help with that.
OK!
Thank you,
Lars
^ permalink raw reply
* Re: [musl] Re: Test failures when Git is built with libpcre and grep is built without it
From: Jeff King @ 2017-01-11 10:04 UTC (permalink / raw)
To: git; +Cc: musl, Andreas Schwab, A. Wilcox
In-Reply-To: <20170110113959.GL17692@port70.net>
On Tue, Jan 10, 2017 at 12:40:00PM +0100, Szabolcs Nagy wrote:
> > > I'm not sure if musl is wrong for failing to complain about a
> > > bogus regex. Generally making something that would break into
> > > something that works is an OK way to extend the standard. So our
> > > test is at fault for assuming that the regex will fail. I guess
>
> \x is undefined in posix and musl is based on tre which
> supports \x{hexdigits} in ere.
Thanks for confirming; I figured it was something like that.
> > > we'd need to find some more exotic syntax that pcre supports, but
> > > that ERE doesn't. Maybe "(?:)" or something.
>
> i think you would have to use something that's invalid
> in posix ere, ? after empty expression is undefined,
> not an error so "(?:)" is a valid ere extension.
Reading through POSIX[1], hardly anything is explicitly labeled as
"invalid". Most things are just "undefined", which leaves room for
implementations to do what they like.
That's a good thing for a standard to do, but a bad thing when you are
trying to find behavior that differs reliably between PCRE and ERE. :)
In most cases, PCRE constructs could be viable extensions to ERE.
> since most syntax is either defined or undefined in ere
> instead of being invalid, distinguishing pcre using
> syntax is not easy.
>
> there are semantic differences in subexpression matching:
> leftmost match has higher priority in pcre, longest match
> has higher priority in ere.
>
> $ echo ab | grep -o -E '(a|ab)'
> ab
> $ echo ab | grep -o -P '(a|ab)'
> a
>
> unfortunately grep -o is not portable.
In this case we're testing whether Git has internally fed the regex to
pcre or to regcomp(), not a system grep. So we'd need something like
"-o" for "git grep", which I don't think exists.
Another difference I found is that "[\d]" matches a literal "\" or "d"
in ERE, but behaves like "[0-9]" in PCRE. I'll work up a patch based on
that.
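The ERE half of that difference can be checked directly against regcomp(3). This is a minimal sketch, assuming a C library that treats a backslash inside a bracket expression as a literal character, as POSIX describes; ere_matches() is a hypothetical helper, not something from Git's test suite.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Returns 1 if `text` matches `pattern` as a POSIX ERE, 0 if it does
 * not, and -1 if the pattern fails to compile.
 */
static int ere_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;
	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);
	return rc == 0;
}
```

Under that assumption, ere_matches("[\\d]", "d") and ere_matches("[\\d]", "\\") succeed while ere_matches("[\\d]", "5") does not. PCRE would give the opposite answers, which is what makes the pattern usable for telling the two engines apart.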
Thanks for your answer. I'll drop the musl list from the cc when I
follow-up, as this is most definitely not a musl problem, but a git one.
-Peff
^ permalink raw reply
* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Lars Schneider @ 2017-01-11 10:13 UTC (permalink / raw)
To: Taylor Blau; +Cc: Junio C Hamano, git, e, jnareb
In-Reply-To: <20170109233816.GA70151@Ida>
> On 10 Jan 2017, at 00:38, Taylor Blau <ttaylorr@github.com> wrote:
>
> I've been considering some alternative approaches in order to make the
> communication between Git and any extension that implements this protocol more
> intuitive.
>
> In particular, I'm considering alternatives to:
>
>> for each delayed paths:
>> ensure filter process finished processing for path
>> fetch the thing to buf from the process
>> do the caller's thing to use buf
>
> As I understand it, the above sequence of steps would force Git to either:
>
> a) loop over all delayed paths and ask the filter if it's done processing,
> creating a busy-loop between the filter and Git, or...
> b) loop over all delayed paths sequentially, checking out each path in sequence
>
> I would like to avoid both of those situations, and instead opt for an
> asynchronous approach. In (a), the protocol is far too chatty. In (b), the
> protocol is much less chatty, but forces the checkout to be the very last step,
> which has negative performance implications on checkouts with many large files.
>
> For instance, checking out several multi-gigabyte files one after the other
> means that a significant amount of time is lost while the filter has some of the
> items ready. Instead of checking them out as they become available, Git waits
> until the very end when they are all available.
>
> I think it would be preferable for the protocol to specify a sort of "done"
> signal against each path such that Git could check out delayed paths as they
> become available. If implemented this way, Git could checkout files
> asynchronously, while the filter continues to do work on the other end.
In v1 I implemented a) with the busy-loop problem in mind.
My thinking was this:
If the filter sees at least one filter request twice then the filter knows that
Git has already requested all files that require filtering. At that point the
filter could just block the "delayed" answer to the latest filter request until
at least one of the previously delayed requests can be fulfilled. Then the filter
answers "delayed" to Git until Git requests the blob that can be fulfilled. This
process cycles until all requests can be fulfilled. Wouldn't that work?
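The blocking strategy described above can be sketched as a small simulation. Everything in it is hypothetical: struct fake_filter, filter_request() and checkout_all() are names invented for illustration, the "clock" is a plain counter, and none of this is the code from the v1 patch. The key assumption is that a repeated request for a path tells the filter that Git has already asked for every path, so the filter may block (here: advance the clock) until some pending blob is ready.

```c
#define NPATHS 3

enum status { STATUS_SUCCESS, STATUS_DELAYED };

/* Hypothetical stand-in for a long-running filter process. */
struct fake_filter {
	int ready_at[NPATHS]; /* tick at which each blob is finished */
	int seen[NPATHS];     /* has Git already asked for this path? */
	int done[NPATHS];     /* has the blob been handed to Git? */
	int now;              /* simulated clock */
};

/* Filter side: answer one request for the path with index i. */
enum status filter_request(struct fake_filter *f, int i)
{
	if (f->seen[i]) {
		/*
		 * A repeated request means Git has sent every path once.
		 * "Block" until the earliest pending blob is ready.
		 */
		int j, earliest = -1;

		for (j = 0; j < NPATHS; j++)
			if (!f->done[j] &&
			    (earliest < 0 || f->ready_at[j] < earliest))
				earliest = f->ready_at[j];
		if (f->now < earliest)
			f->now = earliest;
	}
	f->seen[i] = 1;

	if (f->now >= f->ready_at[i]) {
		f->done[i] = 1;
		return STATUS_SUCCESS;
	}
	return STATUS_DELAYED;
}

/* Git side: keep cycling over delayed paths; return the request count. */
int checkout_all(struct fake_filter *f)
{
	int i, pending = NPATHS, requests = 0;

	while (pending > 0) {
		for (i = 0; i < NPATHS; i++) {
			if (f->done[i])
				continue;
			requests++;
			if (filter_request(f, i) == STATUS_SUCCESS)
				pending--;
		}
	}
	return requests;
}
```

In this model every pass over the delayed paths completes at least one of them, so the number of requests stays bounded instead of growing with the time the filter needs. For example, with ready_at set to {5, 0, 2} the simulation finishes after 6 requests.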
I think a "done" message by the filter is not easy. Right now the protocol works
in a mode where Git always asks and the filter always answers. I believe changing
the filter to be able to initiate a "done" message would complicate the protocol.
> Additionally, the protocol should specify a sentinel "no more entries" value
> that could be sent from Git to the filter to signal that there are no more files
> to checkout. Some filters may implement mechanisms for converting files that
> require a signal to know when all files have been sent. Specifically, Git LFS
> (https://git-lfs.github.com) batches files to be transferred together, and needs
> to know when all files have been announced to truncate and send the last batch,
> if it is not yet full. I'm sure other filter implementations use a similar
> mechanism and would benefit from this as well.
I agree. I think the filter already has this info implicitly as explained above
but an explicit message would be better!
Thanks,
Lars
^ permalink raw reply
* Re: [PATCH v1] convert: add "status=delayed" to filter process protocol
From: Lars Schneider @ 2017-01-11 10:20 UTC (permalink / raw)
To: Jakub Narębski
Cc: Junio C Hamano, Git mailing list, Eric Wong, Taylor Blau
In-Reply-To: <ec8078ef-8ff2-d26f-ef73-5ef612737eee@gmail.com>
> On 10 Jan 2017, at 23:11, Jakub Narębski <jnareb@gmail.com> wrote:
>
> W dniu 09.01.2017 o 00:42, Junio C Hamano pisze:
>> larsxschneider@gmail.com writes:
>>> From: Lars Schneider <larsxschneider@gmail.com>
>>>
>>> Some `clean` / `smudge` filters might require a significant amount of
>>> time to process a single blob. During this process the Git checkout
>>> operation is blocked and Git needs to wait until the filter is done to
>>> continue with the checkout.
>
> Lars, what is expected use case for this feature; that is when do you
> think this problem may happen? Is it something that happened IRL?
Yes, this problem happens every day with filters that perform network
requests (e.g. GitLFS). In GitLFS we even implemented Git wrapper
commands to address the problem: https://github.com/git-lfs/git-lfs/pull/988
The ultimate goal of this patch is to be able to get rid of the wrapper
commands.
>>> Teach the filter process protocol (introduced in edcc858) to accept the
>>> status "delayed" as response to a filter request. Upon this response Git
>>> continues with the checkout operation and asks the filter to process the
>>> blob again after all other blobs have been processed.
>>
>> Hmm, I would have expected that the basic flow would become
>>
>> for each paths to be processed:
>> convert-to-worktree to buf
>> if not delayed:
>> do the caller's thing to use buf
>> else:
>> remember path
>>
>> for each delayed paths:
>> ensure filter process finished processing for path
>> fetch the thing to buf from the process
>> do the caller's thing to use buf
>
> I would expect here to have a kind of event loop, namely
>
> while there are delayed paths:
> get path that is ready from filter
> fetch the thing to buf (supporting "delayed")
> if path done
> do the caller's thing to use buf
> (e.g. finish checkout path, eof convert, etc.)
>
> We can either trust filter process to tell us when it finished sending
> delayed paths, or keep list of paths that are being delayed in Git.
I could implement "get path that is ready from filter" but wouldn't
that complicate the filter protocol? I think we can use the protocol pretty
much as is with the strategy outlined here:
http://public-inbox.org/git/F533857D-9B51-44C1-8889-AA0542AD8250@gmail.com/
Thanks,
Lars
^ permalink raw reply
* [PATCH] t7810: avoid assumption about invalid regex syntax
From: Jeff King @ 2017-01-11 11:10 UTC (permalink / raw)
To: A. Wilcox; +Cc: git, Andreas Schwab
In-Reply-To: <20170111100400.vhd5ytarqpujigbn@sigill.intra.peff.net>
A few of the tests want to check that "git grep -P -E" will
override -P with -E, and vice versa. To do so, we use a
regex with "\x{..}", which is valid in PCRE but not defined
by POSIX (for basic or extended regular expressions).
However, POSIX declares quite a lot of syntax, including
"\x", as "undefined". That leaves implementations free to
extend the standard if they choose. At least one, musl libc,
implements "\x" in the same way as PCRE. Our tests check
that "-E" complains about "\x", which fails with musl.
We can fix this by finding some construct which behaves
reliably on both PCRE and POSIX, but differently in each
system.
One such construct is the use of backslash inside brackets.
In PCRE, "[\d]" interprets "\d" as it would outside the
brackets, matching a digit. Whereas in POSIX, the backslash
must be treated literally, and we match either it or a
literal "d". Moreover, implementations are not free to
change this according to POSIX, so we should be able to rely
on it.
Signed-off-by: Jeff King <peff@peff.net>
---
I've tested this with glibc, but I wasn't able to do so with musl. The
two complications are:
1. Recent versions of git won't build with musl's regex at all,
because it doesn't support the non-standard REG_STARTEND that we
rely on since b7d36ffca (regex: use regexec_buf(), 2016-09-21).
So if applied on an older git, this patch should help, but newer
versions need NO_REGEX (to use the fallback glibc regex code)
either way, which would also make the problem go away.
Still, I think it's the right thing to do, since we are relying on
something that POSIX clearly leaves up to the implementation. It
may also help on other systems, or if musl ends up supporting
REG_STARTEND in the future.
2. I tried to cherry-pick to v2.7.x and test it with musl. Debian
ships with a "musl-gcc" wrapper, but it doesn't work out of the
box. Both zlib and pcre are compiled against glibc, so I'd have to
rebuild those, too. At which point I gave up and decided to just
let you test it on your musl-based system. :)
t/t7810-grep.sh | 26 +++++++++++++++-----------
1 file changed, 15 insertions(+), 11 deletions(-)
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index de2405ccb..19f0108f8 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -39,6 +39,10 @@ test_expect_success setup '
echo "a+bc"
echo "abc"
} >ab &&
+ {
+ echo d &&
+ echo 0
+ } >d0 &&
echo vvv >v &&
echo ww w >w &&
echo x x xx x >x &&
@@ -1105,36 +1109,36 @@ test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =extended
'
test_expect_success 'grep -G -F -P -E pattern' '
- >empty &&
- test_must_fail git grep -G -F -P -E "a\x{2b}b\x{2a}c" ab >actual &&
- test_cmp empty actual
+ echo "d0:d" >expected &&
+ git grep -G -F -P -E "[\d]" d0 >actual &&
+ test_cmp expected actual
'
test_expect_success 'grep pattern with grep.patternType=fixed, =basic, =perl, =extended' '
- >empty &&
- test_must_fail git \
+ echo "d0:d" >expected &&
+ git \
-c grep.patterntype=fixed \
-c grep.patterntype=basic \
-c grep.patterntype=perl \
-c grep.patterntype=extended \
- grep "a\x{2b}b\x{2a}c" ab >actual &&
- test_cmp empty actual
+ grep "[\d]" d0 >actual &&
+ test_cmp expected actual
'
test_expect_success LIBPCRE 'grep -G -F -E -P pattern' '
- echo "ab:a+b*c" >expected &&
- git grep -G -F -E -P "a\x{2b}b\x{2a}c" ab >actual &&
+ echo "d0:0" >expected &&
+ git grep -G -F -E -P "[\d]" d0 >actual &&
test_cmp expected actual
'
test_expect_success LIBPCRE 'grep pattern with grep.patternType=fixed, =basic, =extended, =perl' '
- echo "ab:a+b*c" >expected &&
+ echo "d0:0" >expected &&
git \
-c grep.patterntype=fixed \
-c grep.patterntype=basic \
-c grep.patterntype=extended \
-c grep.patterntype=perl \
- grep "a\x{2b}b\x{2a}c" ab >actual &&
+ grep "[\d]" d0 >actual &&
test_cmp expected actual
'
--
2.11.0.627.gfa6151259
^ permalink raw reply related
* Re: [RFC PATCH 0/5] Localise error headers
From: Jeff King @ 2017-01-11 11:37 UTC (permalink / raw)
To: Stefan Beller; +Cc: Michael J Gruber, git@vger.kernel.org
In-Reply-To: <CAGZ79kYVc0YQ4okrTHGiYQzPqfiVAm_f7orXdkhwgf5kMPXj-w@mail.gmail.com>
On Tue, Jan 10, 2017 at 10:28:42AM -0800, Stefan Beller wrote:
> > And then presumably that mix would gradually move to 100% consistency as
> > more messages are translated. But the implicit question is: are there
> > die() messages that should never be translated? I'm not sure.
>
> I would assume any plumbing command is not localizing?
> Because in plumbing land, (easily scriptable) you may find
> a grep on the output/stderr for a certain condition?
That's the assumption I'm challenging. Certainly the behavior and
certain aspects of the output of a plumbing command should remain the
same over time. But error messages to stderr?
It seems like they should be translated, because plumbing invoked on
behalf of porcelain scripts is going to send its stderr directly to the
user.
> To find a good example, "git grep die" gave me some food for thought:
>
> die_errno(..) should always take a string marked up for translation,
> because the errno string is translated?
Yes, I would think die_errno() is a no-brainer for translation, since
the strerror() will be translated.
> apply.c: die(_("internal error"));
>
> That is funny, too. I think we should substitute that with
>
> die("BUG: untranslated, but what went wrong instead")
Yep. We did not consistently use "BUG:" in the early days. I would say
that "BUG" lines do not need to be translated. The point is that nobody
should ever see them, so it seems like there is little point in giving
extra work to translators.
-Peff
^ permalink raw reply
* Re: git cat-file on a submodule
From: Jeff King @ 2017-01-11 12:53 UTC (permalink / raw)
To: David Turner; +Cc: git
In-Reply-To: <1484093500.17967.6.camel@frank>
On Tue, Jan 10, 2017 at 07:11:40PM -0500, David Turner wrote:
> Why does git cat-file -t $sha:foo, where foo is a submodule, not work?
Because "cat-file" is about inspecting items in the object database, and
typically the submodule commit is not present in the superproject's
database. So we cannot know its type. You can infer what it _should_ be
from the surrounding tree, but you cannot actually do the object lookup.
Likewise, "git cat-file -t $sha1:Makefile" is not just telling you that
we found a 100644 entry in the tree, so we expect a blob. It's resolving
to a sha1, and then checking the type of that sha1 in the database. It
_should_ be a blob, but if it isn't, then cat-file is the tool that
should tell you that it is not.
> git rev-parse $sha:foo works.
Right. Because that command is about resolving a name to a sha1, which
we can do even without the object.
> By "why", I mean "would anyone complain if I fixed it?" FWIW, I think
> -p should just return the submodule's sha.
I'm not sure if I'm complaining or not. I can't immediately think of
something that would be horribly broken. But it really feels like you
are using the wrong tool, and patching the tool to handle this case will
probably lead to weird cognitive dissonance down the road.
Maybe it would help to describe your use case more fully. If what you
care about is the presumed type based on the surrounding tree, then
maybe:
git --literal-pathspecs ls-tree $sha -- foo
would be a better match.
-Peff
^ permalink raw reply
* Re: RFC: Enable delayed responses to Git clean/smudge filter requests
From: Lars Schneider @ 2017-01-11 12:57 UTC (permalink / raw)
To: Stefan Beller; +Cc: Git Mailing List
In-Reply-To: <CAGZ79kYDPLDU5Dg_CTnpEX+D9bs6BUSSNTHkqpW2nY-b=e9+SQ@mail.gmail.com>
> On 09 Jan 2017, at 21:44, Stefan Beller <sbeller@google.com> wrote:
>
> On Mon, Nov 14, 2016 at 1:09 PM, Lars Schneider
> <larsxschneider@gmail.com> wrote:
>> Hi,
>>
>> Git always performs a clean/smudge filter on files in sequential order.
>> Sometimes a filter operation can take a noticeable amount of time.
>> This blocks the entire Git process.
>>
>> I would like to give a filter process the possibility to answer Git with
>> "I got your request, I am processing it, ask me for the result later!".
>>
>> I see the following way to realize this:
>>
>> In unpack-trees.c:check_updates() [1] we loop through the cache
>> entries and "ask me later" could be an acceptable return value of the
>> checkout_entry() call. The loop could run until all entries returned
>> success or error.
>
> Late to this thread, but here is an answer nevertheless.
>
> I am currently working on getting submodules working
> for working tree modifying commands (prominently checkout, but
> also read-tree -u and any other caller that uses the code in
> unpack-trees.)
>
> Once the submodules are supported and used, I anticipate that
> putting the files in the working tree on disk will become a bottle neck,
> i.e. the checkout taking way too long for an oversized project.
>
> So in the future we have to do something to make checkout fast
> again, which IMHO is threading. My current vision is to have checkout
> automatically choose a number of threads based on expected workload,
> c.f. preload-index.c, line 18-25.
That sounds interesting! We are using "submodule.fetchjobs=0" to process
submodules in parallel already and it works great! Thanks a lot for
implementing this!
>> The filter machinery is triggered in various other places in Git and
>> all places that want to support "ask me later" would need to be patched
>> accordingly.
>
> I think this makes sense, even in a threaded git-checkout.
> I assume this idea is implemented before threading hits checkout,
> so a question on the design:
>
> Who determines the workload that is acceptable?
> From reading this email, it seems to be solely the filter that uses
> as many threads/processes as it thinks is ok.
Correct.
> Would it be possible to enhance the protocol further to have
> Git also mingle with the workload, i.e. tell the filter it is
> allowed to use up (N-M) threads, as it itself already uses
> M out of N configured threads?
>
> (I do not want to discuss the details here, but only if such a thing
> is viable with this approach as well)
Yes, I think we could give a filter this kind of hint. However, it
would, of course, be up to the filter implementation to follow the hints.
In case you are curious, here is the discussion on v1 of the delay implementation:
http://public-inbox.org/git/20170108191736.47359-1-larsxschneider@gmail.com/
Cheers,
Lars
^ permalink raw reply
* [PATCH 0/2] sanitizing error message contents
From: Jeff King @ 2017-01-11 14:01 UTC (permalink / raw)
To: git
When adding a warning() call in 50d341374 (http: make redirects more
obvious, 2016-12-06), somebody brought up that evil servers can redirect
you to something like:
https://evil.example.com/some/repo?unused=\rwarning:+rainbows+and_unicorns_ahead
(where "\r" is a literal CR), and instead of seeing:
warning: redirecting to https://evil.example.com/...
you just get:
warning: rainbows and unicorns ahead
or whatever innocuous looking line they prefer (probably just ANSI
"clear to beginning of line" would be even more effective).
Since it's hard to figure out which error messages could potentially
contain malicious contents, and since spewing control characters to the
terminal is generally bad anyway, this series sanitizes at the lowest
level.
Note that this doesn't cover "remote:" lines coming over the sideband.
Those are already covered for "\r", as we have to parse it to handle
printing "remote:" consistently. But you can play tricks like putting:
printf '\0331K\033[0Efatal: this looks local\n'
into a pre-receive hook. I'm not sure if we would want to do more
sanitizing there. The goal of this series is not so much that a remote
can't send funny strings that may look local, but that they can't
prevent local strings from being displayed. OTOH, I suspect clever use
of ANSI codes (moving the cursor, clearing lines, etc) could get you
pretty far.
I'd be hesitant to disallow control codes entirely, though, as I suspect
some servers do send colors over the sideband. So I punted on that here,
but I think this is at least an incremental improvement.
[1/2]: Revert "vreportf: avoid intermediate buffer"
[2/2]: vreport: sanitize ASCII control chars
usage.c | 17 +++++++----------
1 file changed, 7 insertions(+), 10 deletions(-)
-Peff
^ permalink raw reply
* [PATCH 1/2] Revert "vreportf: avoid intermediate buffer"
From: Jeff King @ 2017-01-11 14:02 UTC (permalink / raw)
To: git
In-Reply-To: <20170111140138.5p647xuqpqrej63b@sigill.intra.peff.net>
This reverts commit f4c3edc0b156362a92bf9de4f0ec794e90a757fc.
The purpose of that commit was to let us write errors of
arbitrary length to stderr by skipping the intermediate
buffer and sending our varargs straight to fprintf. That
works, but it comes with a downside: we do not get access to
the varargs before they are sent to stderr.
On balance, it's not a good tradeoff. Error messages larger
than our 4K buffer are quite uncommon, and we've lost the
ability to make any modifications to the output (e.g., to
remove non-printable characters).
The only way to have both would be one of:
1. Write into a dynamic buffer. But this is a bad idea for
a low-level function that may be called when malloc()
has failed.
2. Do our own printf-format varargs parsing. This is too
complex to be worth the trouble.
Let's just revert that change and go back to a fixed buffer.
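To illustrate the tradeoff, here is a small sketch of formatting into a fixed-size buffer. format_fixed() is a hypothetical helper and not a function in usage.c. The point it shows is that vsnprintf() never writes past the buffer, silently truncates an overlong message, and returns the length the full message would have had, so the caller holds the formatted text in memory before anything reaches stderr.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format into a caller-supplied fixed buffer (hypothetical helper).
 * Returns what vsnprintf() returns: the length the complete message
 * would have had, which may exceed `size - 1` when truncation occurs.
 */
int format_fixed(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int full_len;

	va_start(ap, fmt);
	full_len = vsnprintf(buf, size, fmt, ap);
	va_end(ap);
	return full_len;
}
```

Formatting the ten-character string "0123456789" into an 8-byte buffer this way leaves "0123456" in the buffer (seven characters plus the terminating NUL) and returns 10.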
Signed-off-by: Jeff King <peff@peff.net>
---
usage.c | 15 +++------------
1 file changed, 3 insertions(+), 12 deletions(-)
diff --git a/usage.c b/usage.c
index 17f52c1b5..b1cbe6799 100644
--- a/usage.c
+++ b/usage.c
@@ -7,21 +7,13 @@
#include "cache.h"
static FILE *error_handle;
-static int tweaked_error_buffering;
void vreportf(const char *prefix, const char *err, va_list params)
{
+ char msg[4096];
FILE *fh = error_handle ? error_handle : stderr;
-
- fflush(fh);
- if (!tweaked_error_buffering) {
- setvbuf(fh, NULL, _IOLBF, 0);
- tweaked_error_buffering = 1;
- }
-
- fputs(prefix, fh);
- vfprintf(fh, err, params);
- fputc('\n', fh);
+ vsnprintf(msg, sizeof(msg), err, params);
+ fprintf(fh, "%s%s\n", prefix, msg);
}
static NORETURN void usage_builtin(const char *err, va_list params)
@@ -93,7 +85,6 @@ void set_die_is_recursing_routine(int (*routine)(void))
void set_error_handle(FILE *fh)
{
error_handle = fh;
- tweaked_error_buffering = 0;
}
void NORETURN usagef(const char *err, ...)
--
2.11.0.627.gfa6151259
^ permalink raw reply related
* [PATCH 2/2] vreport: sanitize ASCII control chars
From: Jeff King @ 2017-01-11 14:02 UTC (permalink / raw)
To: git
In-Reply-To: <20170111140138.5p647xuqpqrej63b@sigill.intra.peff.net>
Our error() and die() calls may report messages with
arbitrary data (e.g., filenames or even data from a remote
server). Let's make it harder to cause confusion with
mischievous filenames. E.g., try:
git rev-parse "$(printf "\rfatal: this argument is too sneaky")" --
or
git rev-parse "$(printf "\x1b[5mblinky\x1b[0m")" --
Let's block all ASCII control characters, with the exception
of TAB and LF. We use both in our own messages (and we are
necessarily sanitizing the complete output of snprintf here,
as we do not have access to the individual varargs). And TAB
and LF are unlikely to cause confusion (you could put
"\nfatal: sneaky\n" in your filename, but it would at least
not _cover up_ the message leading to it, unlike "\r").
We'll replace the characters with a "?", which is similar to
how "ls" behaves. It might be nice to do something less
lossy, like converting them to "\x" hex codes. But replacing
with a single character makes it easy to do in-place and
without worrying about length limitations. This feature
should kick in rarely enough that the "?" marks are almost
never seen.
We'll leave high-bit characters as-is, as they are likely to
be UTF-8 (though there may be some Unicode mischief you
could cause, which may require further patches).
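For experimenting outside of Git, the replacement loop can be sketched as a standalone function. sanitize_msg() is a made-up name, and one detail is an assumption that differs from the patch: this version calls the C library's iscntrl(), so it converts each byte to unsigned char first, whereas Git (as far as I know) supplies its own locale-independent ctype macros.

```c
#include <ctype.h>

/*
 * Replace ASCII control characters in `msg` with '?', in place,
 * leaving TAB and LF alone. (Hypothetical standalone version.)
 */
void sanitize_msg(char *msg)
{
	char *p;

	for (p = msg; *p; p++) {
		/* the C library's iscntrl() needs an unsigned char value */
		unsigned char c = (unsigned char)*p;

		if (iscntrl(c) && c != '\t' && c != '\n')
			*p = '?';
	}
}
```

In the default "C" locale, a CR or ESC byte becomes "?", while TAB, LF and high-bit bytes (such as the two bytes of a UTF-8 encoded "é") pass through unchanged. Because each control character is replaced by exactly one byte, the message length never changes.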
Signed-off-by: Jeff King <peff@peff.net>
---
usage.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/usage.c b/usage.c
index b1cbe6799..ad6d2910f 100644
--- a/usage.c
+++ b/usage.c
@@ -12,7 +12,13 @@ void vreportf(const char *prefix, const char *err, va_list params)
{
char msg[4096];
FILE *fh = error_handle ? error_handle : stderr;
+ char *p;
+
vsnprintf(msg, sizeof(msg), err, params);
+ for (p = msg; *p; p++) {
+ if (iscntrl(*p) && *p != '\t' && *p != '\n')
+ *p = '?';
+ }
fprintf(fh, "%s%s\n", prefix, msg);
}
--
2.11.0.627.gfa6151259
^ permalink raw reply related