Git development
* Re: non-US-ASCII file names (e.g. Hiragana) on Windows
From: Robin Rosenberg @ 2009-12-01 22:11 UTC
  To: Shawn O. Pearce; +Cc: Johannes Sixt, Thomas Singer, git
In-Reply-To: <20091201162627.GE21299@spearce.org>

On Tuesday 01 December 2009 17:26:27, you wrote:
> Johannes Sixt <j.sixt@viscovery.net> wrote:
> > Thomas Singer schrieb:
> > > To be more precise: Who is interpreting the bytes in the file names as
> > > characters? Windows, Git or Java?
> >
> > In the case of git: Windows does it, using the console's codepage to
> > convert between bytes and Unicode.
> >
> > I don't know about Java, but I guess that no conversion is necessary
> > because Java is Unicode-aware.
>
> Actually, conversion is necessary, and it's something that is proving
> to be really painful within JGit.
>
> The Java IO APIs use UTF-16 for file names.  However we are reading
> a stream of unknown bytes from the index file and tree objects.
> Thus JGit must convert a stream of bytes into UTF-16 just to get
> to the OS.
>
> The JVM then turns around and converts from UTF-16 to some other
> encoding for the filesystem.
>
> On Win32 I suspect the JVM uses the native UTF-16 file APIs, so
> this translation is lossless.
>
> On POSIX, I suspect the JVM uses $LANG or some other related
> environment variable to guess the user's preferred encoding, and
> then converts from UTF-16 to bytes in that encoding.  And I have
> no idea how they handle normalization of composed code points.
>
> All of these layers make for a *very* confusing situation for us
> within JGit:
>
>   git tree
>   +---------+
>   | bytes   | -+
>   +---------+   \
>                  \             +--------+            +---------+
>                   +-- JGit --> | UTF-16 | -- JVM --> | OS call |
>   .git/index     /             +--------+            +---------+
>   +---------+   /
>   | bytes   | -+
>   +---------+
>
> It's impossible for us to do what C git does, which is just use the
> bytes used by the OS call within the git data structure.  Which of
> course also isn't always portable, e.g. the Mac OS X HFS+ mess.

We can decode the index any way we like, but not file names coming from
the file system. On Windows, any sane name (it does allow invalid UTF-16 too,
but...) will be readable by JGit, but on a UTF-8 POSIX system that may not be
so if the filename is actually Latin-1 encoded. In that case the Java runtime
will return a decoded filename containing an "invalid" code point, and any
attempt to access the file from Java will fail. I can see some horribly
expensive ways to work around that but...
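
The failure mode described above can be reproduced outside the JVM. The
following is a minimal Python sketch, not JGit or JVM code: it assumes a
hypothetical file name "café.txt" stored as Latin-1 bytes and decodes it as
UTF-8 with replacement, roughly what a UTF-8 locale would do. The exact
substitution the Java runtime performs may differ.

```python
# Hypothetical file name "café.txt" as stored on disk in Latin-1.
raw_name = "caf\u00e9.txt".encode("latin-1")   # b'caf\xe9.txt'

# Decoding those bytes as UTF-8 cannot succeed: 0xE9 starts a multi-byte
# sequence that is never completed, so it is replaced by U+FFFD.
decoded = raw_name.decode("utf-8", errors="replace")

# Re-encoding the decoded name does not give back the original bytes,
# so a lookup by the decoded name would not find the file on disk.
round_trip = decoded.encode("utf-8")

print(repr(decoded))
print(round_trip == raw_name)
```

Once the original byte has been replaced, nothing in the decoded string says
which byte it was, which is why a workaround has to go back to the raw bytes.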

As for the more sane cases, I have a compare routine that works on mixed
encodings and may help to solve some of the problems. Ideally it would not
only compare filenames with unknown encodings but also handle case folding
and composed characters in one go. I guess one could make it fall back to
an encoding other than Latin-1, though with less certainty, and it will
certainly not work with any arbitrary set of encodings. You'll have to
choose, so it's only a legacy workaround, as opposed to a solution.
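
One way such a comparison could look is sketched below in Python. This is
not the routine mentioned above, only an illustration under stated
assumptions: names are tried as UTF-8 first, then a single fallback encoding
(Latin-1 here); `decode_name` and `names_equal` are hypothetical helper
names. Byte sequences that happen to be valid in both encodings are silently
treated as UTF-8, which is the "less certainty" part.

```python
import unicodedata

def decode_name(raw: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8 if possible, otherwise use the one fallback encoding."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)

def names_equal(a: bytes, b: bytes, ignore_case: bool = False) -> bool:
    """Compare two file names whose encodings are not known in advance."""
    # NFC makes composed and decomposed forms of a character compare equal.
    left = unicodedata.normalize("NFC", decode_name(a))
    right = unicodedata.normalize("NFC", decode_name(b))
    if ignore_case:
        left, right = left.casefold(), right.casefold()
    return left == right

# Same name, once as UTF-8 and once as Latin-1.
print(names_equal("caf\u00e9".encode("utf-8"), "caf\u00e9".encode("latin-1")))
```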

-- robin


* What's cooking in git.git (Dec 2009, #01; Tue, 01)
From: Junio C Hamano @ 2009-12-01 22:10 UTC
  To: git

I am this close to actually tagging 1.6.6-rc1, but I am reasonably sure
that I missed and did not pick up a few important fixes that should go
into it, so here is the current status.



What's cooking in git.git (Dec 2009, #01; Tue, 01)
--------------------------------------------------

Here are the topics that have been cooking.  Commits prefixed with '-' are
only in 'pu' while commits prefixed with '+' are in 'next'.  The ones
marked with '.' do not appear in any of the integration branches, but I am
still holding onto them.
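
The prefix convention can be read mechanically. The following Python sketch
is only an illustration; `classify` is a hypothetical helper, and the labels
are paraphrases of the legend above.

```python
# Marker characters from the legend, mapped to where the commit lives.
MARKERS = {
    "+": "in 'next'",
    "-": "only in 'pu'",
    ".": "held, in no integration branch",
}

def classify(line: str):
    """Return (status, commit title) for a commit line, else None."""
    stripped = line.strip()
    if len(stripped) > 2 and stripped[0] in MARKERS and stripped[1] == " ":
        return MARKERS[stripped[0]], stripped[2:]
    return None  # topic headers, "(merged to ...)" notes, commentary

print(classify(" + send-email: automatic envelope sender"))
print(classify(" - grep: --full-tree"))
```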

In 1.7.0, we plan to correct a handful of warts in the interfaces that
everybody agrees were mistakes.  The resulting system may not be strictly
backward compatible.  Currently planned changes are:

 * refuse push to update the checked out branch in a non-bare repo by
   default

   Make "git push" into a repository to update the branch that is checked
   out fail by default.  You can countermand this default by setting a
   configuration variable in the receiving repository.

   http://thread.gmane.org/gmane.comp.version-control.git/107758/focus=108007

 * refuse push to delete the current branch by default

   Make "git push $there :$killed" to delete the branch that is pointed at
   by its HEAD fail by default.  You can countermand this default by
   setting a configuration variable in the receiving repository.

   http://thread.gmane.org/gmane.comp.version-control.git/108862/focus=108936

 * "git send-email" won't make deep threads by default

   Many people said that by default when sending more than 2 patches the
   threading git-send-email makes by default is hard to read, and they
   prefer the default be one cover letter and each patch as a direct
   follow-up to the cover letter.  You can countermand this by setting a
   configuration variable.

   http://article.gmane.org/gmane.comp.version-control.git/109790

 * "git status" won't be "git-commit --dry-run" anymore

   http://thread.gmane.org/gmane.comp.version-control.git/125989/focus=125993

 * "git diff -w --exit-code" will exit with success if the only differences
   it found are whitespace changes that are stripped away from the output.

   http://thread.gmane.org/gmane.comp.version-control.git/119731/focus=119751

 * "git diff -w/-b" won't even produce "diff --git" header when all changes
   are about whitespaces.

   http://thread.gmane.org/gmane.comp.version-control.git/133256

--------------------------------------------------
[Graduated to "master"]

* fc/maint-format-patch-pathspec-dashes (2009-11-26) 2 commits.
 + format-patch: add test for parsing of "--"
 + format-patch: fix parsing of "--" on the command line

* bw/diff-color-hunk-header (2009-11-27) 2 commits
  (merged to 'next' on 2009-11-29 at c446977)
 + Give the hunk comment its own color
  (merged to 'next' on 2009-11-27 at 42ab131)
 + emit_line(): don't emit an empty <SET><RESET> followed by a newline

* jc/maint-am-keep (2009-11-27) 1 commit.
  (merged to 'next' on 2009-11-27 at 7663874)
 + Remove dead code from "git am"

* ns/send-email-no-chain-reply-to (2009-11-29) 1 commit
 + prepare send-email for smoother change of --chain-reply-to default
  (this branch is used by ns/1.7.0-send-email-no-chain-reply-to.)

This starts warning about the change to --no-chain-reply-to
in 1.7.0 for smoother transition.

* uk/maint-shortlog-encoding (2009-11-25) 1 commit.
 - shortlog: respect commit encoding

* fc/send-email-envelope (2009-11-26) 2 commits.
  (merged to 'next' on 2009-11-27 at 2d0257d)
 + send-email: automatic envelope sender
 + t9001: test --envelope-sender option of send-email

* jc/mailinfo-remove-brackets (2009-07-15) 1 commit.
  (merged to 'next' on 2009-11-25 at 09d498f)
 + mailinfo: -b option keeps [bracketed] strings that is not a [PATCH] marker

Jim Meyering sent a patch to do a subset of what this does; to allow
keeping '[SECURITY]' when the subject says '[SECURITY][PATCH]', you need
to also teach "am" to pass the new -b option, but that is independent of
the real-world need Jim showed, so I think this can go in as-is.

* jn/gitweb-blame (2009-11-24) 8 commits.
  (merged to 'next' on 2009-11-25 at 0a5b649)
 + gitweb.js: fix padLeftStr() and its usage
 + gitweb.js: Harden setting blamed commit info in incremental blame
 + gitweb.js: fix null object exception in initials calculation
 + gitweb: Minify gitweb.js if JSMIN is defined
 + gitweb: Create links leading to 'blame_incremental' using JavaScript
  (merged to 'next' on 2009-10-11 at 73c4a83)
 + gitweb: Colorize 'blame_incremental' view during processing
 + gitweb: Incremental blame (using JavaScript)
 + gitweb: Add optional "time to generate page" info in footer

With two more changes to disable this by default to make it
suitable as "new feature with known breakages" for 1.6.6

* em/commit-claim (2009-11-04) 1 commit
  (merged to 'next' on 2009-11-23 at b5df6fd)
 + commit -c/-C/--amend: reset timestamp and authorship to committer with --reset-author

* cc/bisect-doc (2009-11-08) 1 commit
  (merged to 'next' on 2009-11-27 at c46d648)
 + Documentation: add "Fighting regressions with git bisect" article

* jc/pretty-lf (2009-10-04) 1 commit.
  (merged to 'next' on 2009-11-27 at 73651c4)
 + Pretty-format: %[+-]x to tweak inter-item newlines

--------------------------------------------------
[New Topics]

* ap/merge-backend-opts (2008-07-18) 6 commits
 - Document that merge strategies can now take their own options
 - Extend merge-subtree tests to test -Xsubtree=dir.
 - Make "subtree" part more orthogonal to the rest of merge-recursive.
 - Teach git-pull to pass -X<option> to git-merge
 - git merge -X<option>
 - git-merge-file --ours, --theirs

"git pull" patch needs sq-then-eval fix but otherwise seemed good.

* mo/bin-wrappers (2009-11-29) 3 commits
 - INSTALL: document a simpler way to run uninstalled builds
 - run test suite without dashed git-commands in PATH
 - build dashless "bin-wrappers" directory similar to installed bindir

--------------------------------------------------
[Stalled]

* je/send-email-no-subject (2009-08-05) 1 commit.
  (merged to 'next' on 2009-10-11 at 1b99c56)
 + send-email: confirm on empty mail subjects

The existing tests cover the positive case (i.e. as long as the user says
"yes" to the "do you really want to send this message that lacks subject",
the message is sent) of this feature, but the feature itself needs its own
test to verify the negative case (i.e. does it correctly stop if the user
says "no"?)

* jn/rfc-pull-rebase-error-message (2009-11-12) 1 commit
 - git-pull.sh --rebase: overhaul error handling when no candidates are found

I heard this needs at least retitling among other changes?

* jh/notes (2009-11-20) 10 commits
 - Add more testcases to test fast-import of notes
 - Rename t9301 to t9350, to make room for more fast-import tests
 - fast-import: Proper notes tree manipulation using the notes API
 - Refactor notes concatenation into a flexible interface for combining notes
 - Notes API: Allow multiple concurrent notes trees with new struct notes_tree
 - Notes API: for_each_note(): Traverse the entire notes tree with a callback
 - Notes API: get_note(): Return the note annotating the given object
 - Notes API: add_note(): Add note objects to the internal notes tree structure
 - Notes API: init_notes(): Initialize the notes tree from the given notes ref
 - Notes API: get_commit_notes() -> format_note() + remove the commit restriction

Johan waits for an Ack from Shawn on "fast-import" one.

* tr/maint-merge-ours-clarification (2009-11-15) 1 commit
  (merged to 'next' on 2009-11-21 at fadaf7b)
 + rebase: refuse to rebase with -s ours

I do not think we reached a consensus for solving conflicts between "give
them rope" and "protect users from clearly meaningless combinations".  The
author obviously is for the latter (and I am inclined to agree); Dscho
seems to think otherwise.

* jc/fix-tree-walk (2009-10-22) 8 commits
  (merged to 'next' on 2009-10-22 at 10c0c8f)
 + Revert failed attempt since 353c5ee
 + read-tree --debug-unpack
  (merged to 'next' on 2009-10-11 at 0b058e2)
 + unpack-trees.c: look ahead in the index
 + unpack-trees.c: prepare for looking ahead in the index
 + Aggressive three-way merge: fix D/F case
 + traverse_trees(): handle D/F conflict case sanely
 + more D/F conflict tests
 + tests: move convenience regexp to match object names to test-lib.sh

This has some stupid bugs and was reverted from 'next' until I can fix it,
but the "temporarily" turned out to be very loooong.  Sigh.  We won't have a
proper fix in 1.6.6.

* sr/gfi-options (2009-09-06) 6 commits.
 - fast-import: test the new option command
 - fast-import: add option command
 - fast-import: test the new feature command
 - fast-import: add feature command
 - fast-import: put marks reading in it's own function
 - fast-import: put option parsing code in separate functions

Sverre is working on a re-roll to address comments from Shawn.

--------------------------------------------------
[Cooking]

* tr/http-updates (2009-11-27) 2 commits
 - Add an option for using any HTTP authentication scheme, not only basic
 - http: maintain curl sessions

It seems that this is still under discussion...

* jc/diff-whitespace-prepare (2009-11-28) 2 commits
 - diff: flip the default diff.bwoutputonly to true
 - diff: optionally allow traditional "-b/-w affects only output" semantics
 (this branch uses gb/1.7.0-diff-whitespace-only-output and jc/1.7.0-diff-whitespace-only-status; is used by jc/1.7.0-diff-whitespace-prepare.)

This is to redo the two -b/-w semantic changes to prepare the migration of
existing users before 1.7.0 happens.

* sr/vcs-helper (2009-11-18) 12 commits
  (merged to 'next' on 2009-11-27 at 83268ab)
 + Add Python support library for remote helpers
 + Basic build infrastructure for Python scripts
 + Allow helpers to report in "list" command that the ref is unchanged
 + Fix various memory leaks in transport-helper.c
 + Allow helper to map private ref names into normal names
 + Add support for "import" helper command
 + Allow specifying the remote helper in the url
 + Add a config option for remotes to specify a foreign vcs
 + Allow fetch to modify refs
 + Use a function to determine whether a remote is valid
 + Allow programs to not depend on remotes having urls
 + Fix memory leak in helper method for disconnect

Should be among the first to graduate after 1.6.6 final.

* jc/grep-full-tree (2009-11-24) 1 commit.
 - grep: --full-tree

The interaction between this option and pathspecs needs to be worked out
better.  I _think_ "grep --full-tree -e pattern -- '*.h'" should search
all the header files in the tree, for example.

* jc/checkout-merge-base (2009-11-20) 2 commits
 - "rebase --onto A...B" replays history on the merge base between A and B
 - "checkout A...B" switches to the merge base between A and B

I've been using the first one for a while myself but do not see many users
want this (yet); the new feature is not urgent anyway.

* tr/reset-checkout-patch (2009-11-19) 1 commit.
  (merged to 'next' on 2009-11-22 at b224950)
 + {checkout,reset} -p: make patch direction configurable

I do not particularly like a configuration like this that changes the
behaviour of a command in a drastic way---it will make helping others
much harder.

* nd/sparse (2009-11-25) 20 commits.
  (merged to 'next' on 2009-11-25 at 71380f5)
 + tests: rename duplicate t1009
  (merged to 'next' on 2009-11-23 at f712a41)
 + sparse checkout: inhibit empty worktree
 + Add tests for sparse checkout
 + read-tree: add --no-sparse-checkout to disable sparse checkout support
 + unpack-trees(): ignore worktree check outside checkout area
 + unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
 + unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
 + unpack-trees.c: generalize verify_* functions
 + unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
 + Introduce "sparse checkout"
 + dir.c: export excluded_1() and add_excludes_from_file_1()
 + excluded_1(): support exclude files in index
 + unpack-trees(): carry skip-worktree bit over in merged_entry()
 + Read .gitignore from index if it is skip-worktree
 + Avoid writing to buffer in add_excludes_from_file_1()
 + Teach Git to respect skip-worktree bit (writing part)
 + Teach Git to respect skip-worktree bit (reading part)
 + Introduce "skip-worktree" bit in index, teach Git to get/set this bit
 + Add test-index-version
 + update-index: refactor mark_valid() in preparation for new options

There were some test glitches reported and at least one test seems to 
be broken in the sense that it is not testing what it is trying to.
Fix-up expected.

--------------------------------------------------
[For 1.7.0]

* jk/1.7.0-status (2009-11-27) 7 commits.
  (merged to 'next' on 2009-11-27 at 91691ec)
 + t7508-status.sh: Add tests for status -s
 + status -s: respect the status.relativePaths option
  (merged to 'next' on 2009-11-21 at 884bb56)
 + docs: note that status configuration affects only long format
  (merged to 'next' on 2009-10-11 at 65c8513)
 + commit: support alternate status formats
 + status: add --porcelain output format
 + status: refactor format option parsing
 + status: refactor short-mode printing to its own function
 (this branch uses jc/1.7.0-status.)

Gives the --short output format to post 1.7.0 "git commit --dry-run" that
is similar to that of post 1.7.0 "git status".

Immediately after 1.6.6 while rebuilding 'next', we may want to reorder a
few commits at the tip, as "docs: affects only long format" describes a
limitation that will disappear soon.

* jc/1.7.0-status (2009-09-05) 4 commits.
  (merged to 'next' on 2009-10-11 at 9558627)
 + status: typo fix in usage
 + git status: not "commit --dry-run" anymore
 + git stat -s: short status output
 + git stat: the beginning of "status that is not a dry-run of commit"
 (this branch is used by jk/1.7.0-status.)

With this, "git status" is no longer "git commit --dry-run".

* jc/1.7.0-send-email-no-thread-default (2009-08-22) 1 commit.
  (merged to 'next' on 2009-10-11 at 043acdf)
 + send-email: make --no-chain-reply-to the default

As the title says.

* jc/1.7.0-push-safety (2009-02-09) 2 commits.
  (merged to 'next' on 2009-10-11 at 81b8128)
 + Refuse deleting the current branch via push
 + Refuse updating the current branch in a non-bare repository via push

* jc/1.7.0-diff-whitespace-only-status (2009-08-30) 4 commits.
  (merged to 'next' on 2009-10-11 at 546c74d)
 + diff.c: fix typoes in comments
 + Make test case number unique
 + diff: Rename QUIET internal option to QUICK
 + diff: change semantics of "ignore whitespace" options
 (this branch is used by jc/1.7.0-diff-whitespace-prepare and jc/diff-whitespace-prepare.)

This changes exit code from "git diff --ignore-whitespace" and friends
when there is no actual output.  It is a backward incompatible change,
and jc/diff-whitespace-prepare topic is meant to ease the transition.

* gb/1.7.0-diff-whitespace-only-output (2009-11-19) 1 commit
  (merged to 'next' on 2009-11-21 at 3375bf4)
 + No diff -b/-w output for all-whitespace changes
 (this branch is used by jc/1.7.0-diff-whitespace-prepare and jc/diff-whitespace-prepare.)

Likewise but for the output of "diff --git" headers.

* jc/1.7.0-diff-whitespace-prepare (2009-11-28) 2 commits
 - diff: disable diff.bwoutputonly warning
 - diff: flip the diff.bwoutputonly default to false
 (this branch uses gb/1.7.0-diff-whitespace-only-output, jc/1.7.0-diff-whitespace-only-status and jc/diff-whitespace-prepare.)

And this is to actually flip the default and eventually remove the warning.

* ns/1.7.0-send-email-no-chain-reply-to (2009-08-22) 1 commit
 - send-email: make --no-chain-reply-to the default

And this is to actually flip the default in 1.7.0.


--------------------------------------------------
[Reverted from 'next']

* jc/botched-maint-cygwin-count-objects (2009-11-24) 2 commits
  (merged to 'next' on 2009-11-25 at 8aa62a0)
 + Revert "ST_BLOCKS_COUNTS_IN_BLKSIZE to say on-disk size is (st_blksize * st_blocks)"
  (merged to 'next' on 2009-11-22 at 4ba5880)
 + ST_BLOCKS_COUNTS_IN_BLKSIZE to say on-disk size is (st_blksize * st_blocks)

This is a revert of the tip one I merged prematurely to 'next'.  The real
fix from Ramsay is already in 'master'.

* ks/precompute-completion (2009-11-15) 4 commits.
  (merged to 'next' on 2009-11-15 at 23cdb96)
 + Revert ks/precompute-completion series
  (merged to 'next' on 2009-10-28 at cd5177f)
 + completion: ignore custom merge strategies when pre-generating
  (merged to 'next' on 2009-10-22 at f46a28a)
 + bug: precomputed completion includes scripts sources
  (merged to 'next' on 2009-10-14 at adf722a)
 + Speedup bash completion loading

Reverted out of 'next', to be replaced with jn/faster-completion-startup
topic.

--------------------------------------------------
[I have been too busy to purge these]

* jc/log-tz (2009-03-03) 1 commit.
 - Allow --date=local --date=other-format to work as expected

Maybe some people care about this.  I dunno.

* jc/1.7.0-no-commit-no-ff-2 (2009-10-22) 1 commit.
 . git-merge: forbid fast-forward and up-to-date when --no-commit is given

This makes "git merge --no-commit" fail when it results in fast-forward or
up-to-date.  It appears nobody wants to have this, so I dropped it.

* ne/rev-cache (2009-10-19) 7 commits.
 . support for commit grafts, slight change to general mechanism
 . support for path name caching in rev-cache
 . full integration of rev-cache into git, completed test suite
 . administrative functions for rev-cache, start of integration into git
 . support for non-commit object caching in rev-cache
 . basic revision cache system, no integration or features
 . man page and technical discussion for rev-cache

The author indicated that there is another round coming.  Does not seem to
pass the tests when merged to 'pu', so it has been ejected for now.

* pb/gitweb-no-project-list (2009-11-06) 3 commits.
 . gitweb: Polish the content tags support
 . gitweb: Support for no project list on gitweb front page
 . gitweb: Refactor project list routines

I picked these up but didn't queue them, as Warthog9's comments made a
certain amount of sense to me.


* Re: "git merge" merges too much!
From: Greg A. Woods @ 2009-12-01 21:58 UTC
  To: The Git Mailing List
In-Reply-To: <20091201205057.GD11235@dpotapov.dyndns.org>


At Tue, 1 Dec 2009 23:50:57 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
Subject: Re: "git merge" merges too much!
> 
> > > 
> > > $ git branch new-foo foo
> > > 
> > > $ git rebase --onto newbase oldbase new-foo
> > 
> > Hmmm.... I'll have to think about that.  It makes some sense, but I
> > don't intuitively read the command-line parameters well enough to
> > predict the outcome in all of the scenarios I'm interested in.
> > 
> > what is "oldbase" there?  I'm guessing it means "base of foo" (and for
> > the moment, "new-foo" too)?
> 
> You have:
> 
>  o---o---o---o---o  newbase
>        \
>         o---o---o---o---o  oldbase
>                          \
>                           o---o---o  foo

Yes, sort of -- in the ideal situation, but not in my particular example
where "oldbase" is just a tag, not a real branch.

So yes, "oldbase" is in fact "base of foo".  Trickier still is when the
"oldbase" branch has one or more commits newer than "base of foo".  Does
Git not have a symbolic name for the true base of a branch?  I.e. is
there not some form of symbolic name for "N" in the following?

   o---o---o---o---o---o---o---o  master
            \
             o---o---N---o---o  release-1
                      \
                       o---o---o  local-release-1

(now of course if it is discovered that "release-1" has progressed since
the base of "foo" then "foo" should be rebased first, but perhaps there
is not time to do this before the other release has to be supported)
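
Git does not store a name for such a fork point, but it can compute the
best common ancestor of two tips with "git merge-base release-1
local-release-1". What that computation amounts to is sketched below in
Python on a toy history. This is an assumption-laden model, not Git's
implementation: every commit has a single parent, and the commit names
(m1, r1, l1, ...) are made up to match the diagram.

```python
# Toy single-parent history matching the diagram; commit names are made up.
parent = {
    "m1": None, "m2": "m1", "m3": "m2", "m4": "m3",
    "m5": "m4", "m6": "m5", "m7": "m6", "m8": "m7",            # master
    "r1": "m3", "r2": "r1", "N": "r2", "r4": "N", "r5": "r4",  # release-1
    "l1": "N", "l2": "l1", "l3": "l2",                         # local-release-1
}
branch = {"master": "m8", "release-1": "r5", "local-release-1": "l3"}

def history(commit):
    """Return the commit and all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parent[commit]
    return chain

def merge_base(a, b):
    """Newest commit reachable from the tips of both branches."""
    reachable_from_b = set(history(branch[b]))
    for commit in history(branch[a]):
        if commit in reachable_from_b:
            return commit
    return None

print(merge_base("release-1", "local-release-1"))
```

The result depends only on the two tips, so it keeps pointing at the fork
commit even after "release-1" gains newer commits.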


> and you want this:
> 
>  o---o---o---o---o  newbase
>      |            \
>      |             o'--o'--o'  new-foo
>       \
>        o---o---o---o---o  oldbase
>                          \
>                           o---o---o  foo

Yes, sort of I suppose, if you trim all the non-relevant branches.

What I really want, I think, is something like this where at least the
non-relevant "master" branch is still shown:

                   1'--2'--3'  new-foo
                  /
         o---o---o  newbase
        /
   o---o---o---o---o---o---o---o  master
                \
                 o---o---o  oldbase
                          \
                           1---2---3  foo

Here's part of my confusion -- "newbase" as used above is actually older
than "oldbase".  :-) so ideally "oldbase" should always be described in
terms of "foo", not just given an arbitrary unrelated name.

Of course that doesn't rule out the following scenario either where
"newbase" really is newer than "oldbase" -- in my world a given project
might become locally supported first on either a newer release, or an
older release, so both above and below might happen:

                           1'--2'--3'  new-foo
                          /
                 o---o---o  newbase
                /
   o---o---o---o---o---o---o---o  master
        \
         o---o---o  oldbase
                  \
                   1---2---3  foo

And eventually I want to also merge whatever is still relevant from foo
to a "local" branch off master so that those changes can be sent
(usually as patches) upstream.

Sometimes I want to do development on a topic branch as close to the tip
of "master" so that it can most easily be pushed upstream, and then
back-port those changes to older release branches.

In fact the latter is exactly how I picture release branches to work in
normal development, and this is how several of the big projects I'd like
to get using Git are doing development (now usually with CVS).

Note too that in these kinds of projects "topic" branches are _always_
forked from the current tip of "master", long-running ones sometimes
rebased to keep up with "master", small fixes and changes are made
directly to the master branch; and small fixes, as well as relevant
features, sometimes those developed on "topic" branches, are back-ported
to release branches.

Note I'm not talking about ideals of best practises specific for Git
here -- I'm talking about actual working operational practises that
people are _very_ familiar with and which have been well proven using a
vast wide variety of different VCS's in the past.  For example I
seriously doubt any of the developers of the projects I'm thinking of
that I'd like to switch to using Git are ever going to want to fork
their topic branches from the oldest release branch base that they
intend to support, and many such projects will necessarily always have
at least a few long-running topic branches that will have to be
frequently rebased to keep up with the trunk so that their eventual
merging will go as smoothly as possible, and yet once any of these topic
branches is finally "closed" their changes may also have to be
back-ported to release branches.

To me the natural way to do these kinds of back-porting "merges" is to
restrict the merge to select only the commits on the branch, i.e. from
its base to its tip, thus the motivation for the topic of my thread (and
I think the motivation for the "What is the best way to backport a
feature?" thread as well).  I think if Git could do this kind of
"partial" merging directly without having to "copy" deltas with "rebase"
or "cherry-pick" or "am" or whatever, and thus create separate histories
for them, then it would be much better at supporting this traditional
practice of using branches to manage releases.  Without such ability it
truly does look as though some form of "patch" management tool is also a
necessary thing (evil?), as "rebase" and "cherry-pick" could quickly get
way out of control and be way too much work otherwise.

-- 
						Greg A. Woods
						Planix, Inc.

<woods@planix.com>       +1 416 218 0099        http://www.planix.com/


* Re: multiple working directories for long-running builds (was: "git merge" merges too much!)
From: Dmitry Potapov @ 2009-12-01 21:18 UTC
  To: The Git Mailing List
In-Reply-To: <m1NFXvL-000kn2C@most.weird.com>

On Tue, Dec 01, 2009 at 01:58:05PM -0500, Greg A. Woods wrote:
> 
> > > I just disagreed that "git archive" was a reasonable alternative to
> > > leaving the working directory alone during the entire time of the build.
> > 
> > Using "git archive" allows you to avoid running a long procedure such as
> > a full clean build and testing in the working tree. Also, it guarantees
> > that you test exactly what you put in Git, and that stray files in your
> > working tree do not affect the result.
> 
> Sure, but let's be very clear here:  "git archive" is likely even more
> impossible for some large projects to use than "git clone" would be to
> use to create build directories.

AFAIK, "git archive" is cheaper than "git clone". I am not saying it is fast
for a huge project, but if you want to run a process such as a clean build
and test that takes a long time anyway, it does not add much to the
total time.

> 
> Disk bandwidth is almost always more expensive than disk space.

Disk bandwidth is certainly more expensive than disk space, and the
whole point was to avoid a lot of disk bandwidth by using a hot cache.
If you have two working trees, it is likely that only one will be
in the hot cache; that is why switching branches (and recompiling
a few files) can be faster than going to another working tree. It has never
been about disk space; it is about the disk cache and keeping it hot.

> 
> Multiple working directories are really the only sane solution
> sometimes.

Sure, sometimes... I do not know enough details to say what will be better in
your case, but I just wanted to say that you should weigh that against
switching, because switching in Git is very fast. Much faster than with
any other VCS...

Another thing to consider is that if you put a really huge project in one
Git repo, then Git may not be as fast as you want, because Git tracks
the whole project as a whole. So, you may want to split your project into
a few relatively independent modules (see git submodule).


Dmitry


* Re: "git merge" merges too much!
From: Dmitry Potapov @ 2009-12-01 20:50 UTC
  To: The Git Mailing List
In-Reply-To: <m1NFXpl-000knKC@most.weird.com>

On Tue, Dec 01, 2009 at 01:52:18PM -0500, Greg A. Woods wrote:
> At Mon, 30 Nov 2009 22:22:12 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
> Subject: Re: "git merge" merges too much!
> > 
> > The key difference compared to what you may be used to is that branches
> > are normally based on the oldest branch in which the feature may be
> > included. Thus changes are normally not backported to old branches,
> > because you can merge them directly.
> 
> Hmmm... the idea of creating topic branches based on the oldest branch
> where the feature might be used is indeed neither intuitive, nor is it
> mentioned anywhere I've so far read about using topic branches in Git.

Most things that we consider "intuitive" are those that we are used to.
Git differs in many aspects from other VCSes (such as CVS/SVN), and
a workflow that is good for those VCSes may not be optimal for Git. There
is a good description that provides the basic knowledge of how to use Git:

man gitworkflows

or online:

http://www.kernel.org/pub/software/scm/git/docs/gitworkflows.html

If you do not base your changes on the oldest branch, then you will not
be able to merge them, which implies you will have to cherry-pick
manually without the ability to automatically track which changes were
merged and which were not. This is a recipe for disaster...


> At the moment I'm leaning towards a process where the configuration
> branch is re-created for every build -- i.e. the merges are redone from
> every topic branch to a freshly configured branch forked from the
> locally supported release branch, hopefully making use of git-rerere to
> solve most conflicts in as automated a fashion as is possible.

I am not quite sure that I fully understood your idea of configuration
branches, but I want to warn you about one serious limitation of
git-rerere: it stores conflict resolutions on a per-file basis. This means
that if the resolution of some conflict implies a change to another file,
then git-rerere will not help you there. So it handles maybe 80-90% of
cases, but not all of them.

> 
> Perhaps Stacked-Git really is the best answer.  I will have to
> investigate more.

There is also TopGit. I have never used either of them, but if you are
interested in a patch management system, you probably should look at both
of them. StGit is modelled after quilt, while TopGit aims to be
better integrated with Git and a better fit for work in a distributed
environment. But as I said, I do not have any first-hand experience
with either of them. (Personally, I would look at TopGit first, but maybe
I am biased here).

> > 
> > $ git branch new-foo foo
> > 
> > $ git rebase --onto newbase oldbase new-foo
> 
> Hmmm.... I'll have to think about that.  It makes some sense, but I
> don't intuitively read the command-line parameters well enough to
> predict the outcome in all of the scenarios I'm interested in.
> 
> what is "oldbase" there?  I'm guessing it means "base of foo" (and for
> the moment, "new-foo" too)?

You have:

 o---o---o---o---o  newbase
       \
        o---o---o---o---o  oldbase
                         \
                          o---o---o  foo


and you want this:

 o---o---o---o---o  newbase
     |            \
     |             o´--o´--o´  new-foo
      \
       o---o---o---o---o  oldbase
                         \
                          o---o---o  foo
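
To make the two diagrams concrete, here is a rough toy model in Python
(not how Git is implemented). The commit names n1..n5, b1..b5 and f1..f3
are made up to mirror the picture, and the history is assumed to be
linear with no merges. It shows that "git rebase --onto newbase oldbase
new-foo" copies only the commits that are reachable from new-foo but not
from oldbase, and that foo itself is left alone:

```python
# Toy commit graph mirroring the diagram: each commit maps to its parent.
# All commit names here are hypothetical, chosen only for illustration.
parents = {
    "n1": None, "n2": "n1", "n3": "n2", "n4": "n3", "n5": "n4",  # newbase line
    "b1": "n2", "b2": "b1", "b3": "b2", "b4": "b3", "b5": "b4",  # oldbase line
    "f1": "b5", "f2": "f1", "f3": "f2",                          # foo line
}
branches = {"newbase": "n5", "oldbase": "b5", "foo": "f3"}


def ancestors(commit):
    """Return the commit followed by all of its ancestors, newest first."""
    chain = []
    while commit is not None:
        chain.append(commit)
        commit = parents[commit]
    return chain


def rebase_onto(newbase, upstream, branch):
    """Model `git rebase --onto <newbase> <upstream> <branch>` on linear history."""
    skip = set(ancestors(branches[upstream]))
    # Commits in upstream..branch, oldest first: these are the ones replayed.
    to_replay = [c for c in reversed(ancestors(branches[branch])) if c not in skip]
    tip = branches[newbase]
    for commit in to_replay:
        copy = commit + "'"      # a new commit with the same change
        parents[copy] = tip
        tip = copy
    branches[branch] = tip       # only the rebased branch name moves
    return to_replay


branches["new-foo"] = branches["foo"]   # git branch new-foo foo
replayed = rebase_onto("newbase", "oldbase", "new-foo")

print(replayed)                           # commits that were copied
print(ancestors(branches["new-foo"])[:4]) # new-foo now sits on top of newbase
print(branches["foo"])                    # foo still points at the old commits
```

In this model the old commits f1..f3 stay reachable through foo, which is
why creating new-foo first turns the rebase into a copy.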


Dmitry

^ permalink raw reply

* Re: [RFC PATCH 0/8] Git remote helpers to implement smart transports.
From: Junio C Hamano @ 2009-12-01 20:42 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Ilari Liusvaara, Sverre Rabbelier, git
In-Reply-To: <20091201193009.GM21299@spearce.org>

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
>> 
>> For instance, to support new types of authentication for smart transports
>> without patching client git binaries (SSH has lots of failure modes that
>> are quite nasty to debug) or abusing GIT_PROXY (yuck). 
>
> So the bulk of this series is about making a proxy for git://
> easier to tie into git?
>
> Forgive me if I sound stupid, but for gits:// shouldn't that just
> be a matter of git_connect() forking a git-remote-gits process
> linked against openssl?  Or, maybe it just runs `openssl s_client`?
>
> Why go through all of this effort of making a really generic proxy
> protocol system when the long-term plan is to just ship native
> gits:// support as part of git-core?

I didn't know what the long-term plan was to be honest, but after skimming
the series, I think your response is a good summary.

It is somewhat unfortunate that a few changes I liked (e.g. the "debug"
bit), even though it was somewhat painful to read them due to coding style
differences, were not at the beginning of the series but instead buried
after changes that are much bigger and controversial (e.g. [6/8]).

^ permalink raw reply

* Re: [RFC PATCH 6/8] Remove special casing of http, https and ftp
From: Ilari Liusvaara @ 2009-12-01 19:39 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: git
In-Reply-To: <20091201182414.GK21299@spearce.org>

On Tue, Dec 01, 2009 at 10:24:14AM -0800, Shawn O. Pearce wrote:
> Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
> 
> These should all be hardlinks to a single executable, not duplicate
> relinks of the same object files.

Fixed for the next round (I will probably send that out in a few days).

-Ilari

^ permalink raw reply

* Re: [RFC PATCH 0/8] Git remote helpers to implement smart transports.
From: Shawn O. Pearce @ 2009-12-01 19:30 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: Sverre Rabbelier, git
In-Reply-To: <20091201171908.GA15436@Knoppix>

Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
> On Tue, Dec 01, 2009 at 08:52:45AM -0800, Shawn O. Pearce wrote:
> 
> > Or better, why this is even necessary?
> 
> I have seen requests for gits:// (and in fact, I have plans to
> implement that protocol).
> 
> For instance, to support new types of authentication for smart transports
> without patching client git binaries (SSH has lots of failure modes that
> are quite nasty to debug) or abusing GIT_PROXY (yuck). 

So the bulk of this series is about making a proxy for git://
easier to tie into git?

Forgive me if I sound stupid, but for gits:// shouldn't that just
be a matter of git_connect() forking a git-remote-gits process
linked against openssl?  Or, maybe it just runs `openssl s_client`?

Why go through all of this effort of making a really generic proxy
protocol system when the long-term plan is to just ship native
gits:// support as part of git-core?
 
-- 
Shawn.

^ permalink raw reply

* [PATCH] help: Do not unnecessarily look for a repository
From: David Aguilar @ 2009-12-01 19:27 UTC (permalink / raw)
  To: gitster; +Cc: git, David Aguilar

Although 'git help' actually doesn't need to be run inside a git
repository and uses no repository-specific information, it looks for a git
directory.  Searching for a git directory can be annoying in auto-mount
environments.  With this commit, 'git help' no longer searches for a
repository when run without any options.

7c3baa9 originally modified 'git help -a' to not require a repository.
This applies the same fix for 'git help'.

Signed-off-by: David Aguilar <davvid@gmail.com>
---
 builtin-help.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin-help.c b/builtin-help.c
index ca08519..09ad4b0 100644
--- a/builtin-help.c
+++ b/builtin-help.c
@@ -427,9 +427,6 @@ int cmd_help(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
-	setup_git_directory_gently(&nongit);
-	git_config(git_help_config, NULL);
-
 	if (!argv[0]) {
 		printf("usage: %s\n\n", git_usage_string);
 		list_common_cmds_help();
@@ -437,6 +434,9 @@ int cmd_help(int argc, const char **argv, const char *prefix)
 		return 0;
 	}
 
+	setup_git_directory_gently(&nongit);
+	git_config(git_help_config, NULL);
+
 	alias = alias_lookup(argv[0]);
 	if (alias && !is_git_command(argv[0])) {
 		printf("`git %s' is aliased to `%s'\n", argv[0], alias);
-- 
1.6.5.3

^ permalink raw reply related

* Re: [RFC PATCH 4/8] Support remote helpers implementing smart transports
From: Shawn O. Pearce @ 2009-12-01 19:22 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: git
In-Reply-To: <1259675838-14692-5-git-send-email-ilari.liusvaara@elisanet.fi>

Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
> diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
> index 5cfdc0c..adf815c 100644
> --- a/Documentation/git-remote-helpers.txt
> +++ b/Documentation/git-remote-helpers.txt
> @@ -90,6 +90,28 @@ Supported if the helper has the "push" capability.
>  +
>  Supported if the helper has the "import" capability.
>  
> +'connect-r' <service>::
> +	Connects to given service. Stdin and stdout of helper are
> +	connected to specified service (no git or git- prefixes are used,
> +	so e.g. fetching uses 'upload-pack' as service) on remote side.

This flies in the face of every other convention we have.  git:// uses the
strings 'git-upload-pack' and 'git-receive-pack', and so does the
smart-http code.  We should continue to use the git- prefix here,
to be consistent, even though by context it's clearly implied.

> +	Valid replies to this command are 'OK' (connection established),

Why 'OK'?  Currently remote-helpers return an empty blank line
to any successful command, not 'OK'.

> +	'FALLBACK' (no smart transport support, fall back to dumb
> +	transports) and 'ERROR' (can't connect, don't bother trying to
> +	fall back).

FALLBACK almost makes sense, but ERROR is not something we do in
the existing helper protocol.  Instead the helper simply
prints its error message(s) to stderr and does exit(128),
aka what die() does.

> +Supported if the helper has the "connect-r" capability. Not used if
> +helper has the "invoke-r" capability, as invoke is preferred to connect.
> +
> +'invoke-r' <cmdlength> <cmd>::
> +	Like connect-r command, but instead of service name, command
> +	line is given. The length of command field is given in command
> +	length field.
> ++
> +Supported if the helper has the "invoke-r" capability.

Why both connect-r and invoke-r?  Why isn't connect-r sufficient
here?  Isn't it sufficient for any service that runs over git:// ?

-- 
Shawn.

^ permalink raw reply

* Re: [PATCH] get_ref_states: strdup entries and free util in stale  list
From: Junio C Hamano @ 2009-12-01 19:20 UTC (permalink / raw)
  To: Bert Wesarg; +Cc: Junio C Hamano, Johannes Schindelin, Jay Soffian, git
In-Reply-To: <36ca99e90912011014sd7372d0yc234873a26c2ae43@mail.gmail.com>

Bert Wesarg <bert.wesarg@googlemail.com> writes:

> A quick test with my use case does not show errors in the maint
> branch. So it should not be needed (except the memory leak fix of the
> .util member). And valgrind confirms this.

Thanks.

^ permalink raw reply

* Re: [RFC PATCH 6/8] Remove special casing of http, https and ftp
From: Daniel Barkalow @ 2009-12-01 19:15 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: git
In-Reply-To: <1259675838-14692-7-git-send-email-ilari.liusvaara@elisanet.fi>

On Tue, 1 Dec 2009, Ilari Liusvaara wrote:

> HTTP, HTTPS and FTP are no longer special to transport code. Also
> add support for FTPS (curl supports it so it is easy).

We've been through this extensively, and settled on having a special case 
for URLs that specify a pure location. That is, the distinction between 
http and ftp is at the level of how you get to the content for that 
location, not what you do to interact with it. (Even with webdav or the 
git-specific smart server support, we use the same detection method on all 
locations, and ftp simply never has the possibility of having these 
features detected.)

It would be fine to add "ftps" to the list of URL schemes that indicate a 
pure location, except that it's plausible that ftps supports writing, but 
obviously not by webdav, which is what the push support via curl will 
attempt, so it's more likely to be confusing than helpful.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: multiple working directories for long-running builds (was: "git merge" merges too much!)
From: Greg A. Woods @ 2009-12-01 18:58 UTC (permalink / raw)
  To: The Git Mailing List
In-Reply-To: <20091201185114.GC11235@dpotapov.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 1701 bytes --]

At Tue, 1 Dec 2009 21:51:14 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
Subject: Re: multiple working directories for long-running builds (was:	"git merge" merges too much!)
> 
> Obviously, switching branches while running build may produce very
> confusing results, but it is not any different than editing files by
> hands during built -- any concurrent modification may confuse the build
> system.

That's what I said.  This is why multiple working directories is an
essential feature for any significantly large project.


> > I just disagreed that "git archive" was a reasonable alternative to
> > leaving the working directory alone during the entire time of the build.
> 
> Using "git archive" allows you avoid running long time procedure such as
> full clean build and testing in the working tree. Also, it is guaranteed
> that you test exactly what you put in Git and some other garbage in your
> working tree does not affect the result.

Sure, but let's be very clear here:  "git archive" is likely even more
impossible for some large projects to use than "git clone" would be to
use to create build directories.

Disk bandwidth is almost always more expensive than disk space.

>   But my point was that switching
> between branches and recompile a few changed files may be faster than
> going to another working tree.

That's possibly going to generate even more unnecessary churn in the
working directory, and thus even more unnecessary re-compiles.

Multiple working directories are really the only sane solution
sometimes.

-- 
						Greg A. Woods
						Planix, Inc.

<woods@planix.com>       +1 416 218 0099        http://www.planix.com/

[-- Attachment #2: Type: application/pgp-signature, Size: 186 bytes --]

^ permalink raw reply

* Re: non-US-ASCII file names (e.g. Hiragana) on Windows
From: Thomas Singer @ 2009-12-01 18:55 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Johannes Sixt, git
In-Reply-To: <m3k4x6na81.fsf@localhost.localdomain>

Jakub Narebski wrote:
> If you use Git from Java, why don't you just use JGit (www.jgit.org),
> which is Git implementation in Java?

We are using JGit for the read-only stuff and the Git command line
executable for all writing commands. We very much appreciate Shawn O.
Pearce's (and the other JGit developers') effort, but Git is a fast moving
target and (much) more complex than CVS or SVN. For those we use Java
libraries that communicate with the corresponding server, which adds another
sanity layer to the repository and makes repository corruption less likely
than direct access.

-- 
Best regards,
Thomas Singer
=============
syntevo GmbH
http://www.syntevo.com
http://blog.syntevo.com

^ permalink raw reply

* Re: "git merge" merges too much!
From: Greg A. Woods @ 2009-12-01 18:52 UTC (permalink / raw)
  To: Dmitry Potapov; +Cc: The Git Mailing List
In-Reply-To: <20091130192212.GA23181@dpotapov.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 4904 bytes --]

At Mon, 30 Nov 2009 22:22:12 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
Subject: Re: "git merge" merges too much!
> 
> The key difference comparing to what you may got used is that branches
> are normally based on the oldest branch in what this feature may be
> included. Thus normally changes are not backported to old branches,
> because you can merge them directly.

Hmmm... the idea of creating topic branches based on the oldest branch
where the feature might be used is indeed neither intuitive, nor is it
mentioned anywhere I've so far read about using topic branches in Git.

To use topic branches effectively this way, especially in managing local
and custom changes to a large remote project where separate working
directories are needed for long-running builds, I think some additional
software configuration management tool must be used to create
"configuration" branches where all the desired change sets (topic
branches) are merged.

I spent half my dreaming time early this morning running through
scenarios of how to use topic branches, with true merging (not
re-basing), in a usable work-flow.

At the moment I'm leaning towards a process where the configuration
branch is re-created for every build -- i.e. the merges are redone from
every topic branch to a freshly configured branch forked from the
locally supported release branch, hopefully making use of git-rerere to
solve most conflicts in as automated a fashion as is possible.

This may not be a sane thing to do though -- it may be too much work to
do for every fix.  It somewhat goes against the current natural trend in
many of the projects I work on to develop changes on the trunk and then
back-port (some of) them to release branches.

Perhaps Stacked-Git really is the best answer.  I will have to
investigate more.


> > > Yes, you must cherry-pick or use rebase (which is a more featureful
> > > version of the pipeline you mentioned).
> > 
> > "git rebase" will not work for me unless it grows a "copy" option ,
> > i.e. one which does not delete the original branch (i.e. avoids the
> > "reset" phase of its operation).
> 
> There is no reset phase...

By "reset phase" I meant this part, from git-rebase(1):

       The current branch is reset to <upstream>, or <newbase> if the --onto
       option was supplied. This has the exact same effect as git reset --hard
       <upstream> (or <newbase>).


> It is just reassigning the head of branch to
> point to a different commit-id. If you want to copy a branch instead of
> rebasing the old one, you create a new branch (a new name) that points
> to the same commit as the branch that you want to copy, after that you
> rebase this new branch. You can do that like this:
> 
> $ git branch new-foo foo
> 
> $ git rebase --onto newbase oldbase new-foo

Hmmm.... I'll have to think about that.  It makes some sense, but I
don't intuitively read the command-line parameters well enough to
predict the outcome in all of the scenarios I'm interested in.

what is "oldbase" there?  I'm guessing it means "base of foo" (and for
the moment, "new-foo" too)?

It's confusing because the manual page uses the word "upstream" to
describe this parameter.

From my experiments it looks like what I might want to do to copy a
local branch to port its changes from one release branch to another is
something like this (where local-v2.0 is a branch with local changes
forked from release branch REL-v2.0, and I want to back-port these
changes to a new local branch forked from the release branch REL-v1.0):

	$ git branch local-base-v1.0 REL-v1.0	# mark base of new branch
	$ git branch local-v1.0 local-v2.0	# dup head of src branch
	$ git rebase --onto local-base-v1.0 REL-v2.0 local-v1.0
	$ git branch -d local-base-v1.0

The first and last steps may not be necessary if REL-v1.0 really is a
branch, but in my play project it is just a tag on the trunk.  In the
case that it were really already a branch then hopefully this would do:

	$ git branch local-v1.0 local-v2.0	# dup head of src branch
	$ git rebase --onto REL-v1.0 REL-v2.0 local-v1.0

The trick here seems to be to invent the name of the new branch based on
where it's going to be rebased to.

I think this does suffice very nicely as a "git copy" operation!


> The "copy" does not have the problem of rebase, but it has a different
> problem: You have two series of commits instead of one. If you found
> a bug in one of those commits, you will have to patch each series
> separately. Also, git merge may produce additional conflicts... So,
> copying commits is not something that I would recommend to do often.

Indeed.

-- 
						Greg A. Woods

+1 416 218-0098                VE3TCP          RoboHack <woods@robohack.ca>
Planix, Inc. <woods@planix.com>      Secrets of the Weird <woods@weird.com>

[-- Attachment #2: Type: application/pgp-signature, Size: 186 bytes --]

^ permalink raw reply

* Re: multiple working directories for long-running builds (was: "git merge" merges too much!)
From: Dmitry Potapov @ 2009-12-01 18:51 UTC (permalink / raw)
  To: The Git Mailing List
In-Reply-To: <m1NFX19-000kn4C@most.weird.com>

On Tue, Dec 01, 2009 at 12:59:58PM -0500, Greg A. Woods wrote:
> At Tue, 1 Dec 2009 08:47:34 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
> Subject: Re: "git merge" merges too much!
> > 
> > On Mon, Nov 30, 2009 at 07:24:14PM -0500, Greg A. Woods wrote:
> > > 
> > > Things get even weirder if you happen to be playing with older branches
> > > too -- most build tools don't have ability to follow files that go back
> > > in time as they assume any product files newer than the sources are
> > > already up-to-date, no matter how much older the sources might become on
> > > a second build.
> > 
> > No, files do not go back in time when you switch between branches. The
> > timestamp on files is the time when they are written to your working
> > tree
> 
> Hmmm, I didn't really say anything in particular about file timestamps
> -- I meant the file content may go back in time.  More correctly I
> should have said that the file content may become inconsistent with the
> state of other files that have just been compiled.

It makes no difference whether content goes back in time or forward. If a
file is changed, any decent build system should recompile the corresponding
files. If the build does not handle dependencies properly, you can end
up with an inconsistent state just by editing some files.

> If the timestamps do not get set back to commit time, but rather are
> simply updated to move the last modify time to the time each change is
> made to a working file (which is as you said, to be expected),

More precisely, Git does not do anything about modification time during
checkout. The system automatically updates the modification time when
a file is written, and Git does not mess with it.
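
As a rough illustration of why this works out for timestamp-based build
tools, here is a small Python sketch of a make-style staleness check. The
file names foo.c and foo.o and the helper needs_rebuild() are made up
for the example, and os.utime() stands in for "the file was rewritten at
this time". The point is that a checkout which brings back older content
still gives the file a newer modification time, so the product is seen as
out of date:

```python
import os
import tempfile


def needs_rebuild(source, product):
    """make-style rule: rebuild if the product is missing or older than its source."""
    if not os.path.exists(product):
        return True
    return os.path.getmtime(source) > os.path.getmtime(product)


with tempfile.TemporaryDirectory() as workdir:
    src = os.path.join(workdir, "foo.c")   # hypothetical source file
    obj = os.path.join(workdir, "foo.o")   # hypothetical build product
    for path in (src, obj):
        open(path, "w").close()

    # Source written at t=1000, compiled at t=2000: nothing to do.
    os.utime(src, (1000, 1000))
    os.utime(obj, (2000, 2000))
    fresh = needs_rebuild(src, obj)

    # Switching branches rewrites foo.c at t=3000. Its content may be
    # "older", but its modification time is newer than foo.o.
    os.utime(src, (3000, 3000))
    after_checkout = needs_rebuild(src, obj)

print(fresh, after_checkout)
```

This only covers a build started after the checkout has finished; it says
nothing about a build that is already running while files change.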

> regardless of whether its content goes back in time or not, then this
> may or may not help a currently running build to figure out what really
> needs to be re-compiled.

Obviously, switching branches while running a build may produce very
confusing results, but it is no different from editing files by
hand during a build -- any concurrent modification may confuse the build
system.

> I just disagreed that "git archive" was a reasonable alternative to
> leaving the working directory alone during the entire time of the build.

Using "git archive" allows you to avoid running a long procedure such as
a full clean build and testing in the working tree. Also, it guarantees
that you test exactly what you put in Git and that other garbage in your
working tree does not affect the result. But my point was that switching
between branches and recompiling a few changed files may be faster than
going to another working tree.


Dmitry

^ permalink raw reply

* Re: Umlaut in filename makes troubles
From: Daniel Barkalow @ 2009-12-01 18:48 UTC (permalink / raw)
  To: rick23; +Cc: git
In-Reply-To: <200912010815.14515.rick23@gmx.net>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2066 bytes --]

On Tue, 1 Dec 2009, rick23@gmx.net wrote:

> I have problems with my repository under slackware vs. windows. I 
> created a repo in linux and every time I use it under msysgit, the 
> files containing umlauts in the filename are marked as deleted (and 
> vice versa).
> 
> For instance: the repo perfectly synced under msysgit leads to:
> 
> user@sauron:/media/disk-2$ git status |grep Auszug
> #       deleted:    "trunk/007_Literatur/Auszug aus Ergonomische 
> Untersuchung des Lenkgef\374hles.docx"
> #       "trunk/007_Literatur/Auszug aus Ergonomische Untersuchung des 
> Lenkgef\303\274hles.docx"

So the directory contains the utf-8 name, but the index contains a latin-1 
name, when you wrote it under Windows and are looking at it under Linux. 
You probably want to use utf-8 for your repository, so that it's not 
specific to your locale.

> in linux. But the file exists and is displayed correctly in the shell 
> or in dolphin (my filemanager under X):
> 
> user@sauron:/media/disk-2$ ls trunk/007_Literatur/Auszug*
> trunk/007_Literatur/Auszug aus Ergonomische Untersuchung des 
> Lenkgefühles.docx

You've got a utf-8 filesystem, so the u-with-umlaut is the two-byte 
sequence git is showing in the message as being present, not the single 
byte that it's showing as deleted. It looks like you're actually using 
utf-8 for what's on the usb stick, so you probably want the names listed 
in the repository to match that, which means that the correct one here is 
Linux.
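
The two escaped names in the status output can be reproduced with a short
Python sketch. The quoted() helper below is made up for this example and
only approximates git's quoting by turning every non-ASCII byte into a
backslash plus three octal digits; the real quoting also handles control
characters, quotes and backslashes. It shows that \374 is the latin-1
encoding of u-with-umlaut and \303\274 is its utf-8 encoding:

```python
# "\u00fc" is u-with-umlaut, written as an escape so this file stays ASCII.
name = "Lenkgef\u00fchles.docx"


def quoted(raw):
    """Escape non-ASCII bytes as octal (a rough stand-in for git's path quoting)."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in raw)


latin1 = quoted(name.encode("latin-1"))  # one byte for the umlaut
utf8 = quoted(name.encode("utf-8"))      # two bytes for the umlaut

print(latin1)  # Lenkgef\374hles.docx
print(utf8)    # Lenkgef\303\274hles.docx
```

Since git compares file names as bytes, these two spellings look like two
different files, which matches the "deleted" plus untracked pair above.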

> Can you please give me a hint what to do?

Convince Windows (or msysgit) to report filenames to git in utf-8. (I 
don't know *how*, but that's *what* you probably want to do.)

Once you've got everything agreeing on the character set used for 
filenames, you can disable "core.quotepath" to make the messages appear 
with umlauts; if you turned that off before fixing the inconsistency, it 
would be much trickier to debug, because the "deleted" line would contain 
something that your Linux display won't consider a valid character.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply

* Re: [PATCH/RFC] Allow curl to rewind the RPC read buffer at any time
From: Daniel Stenberg @ 2009-12-01 18:18 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Martin Storsjö, Tay Ray Chuan, git, Nicholas Miell, gsky51,
	Clemens Buchacher, Mark Lodato, Johannes Schindelin
In-Reply-To: <20091201161428.GC21299@spearce.org>

On Tue, 1 Dec 2009, Shawn O. Pearce wrote:

> The #@!*@!* library should be able to generate two requests back-to-back to 
> the same URL without needing to rewind the 2nd request.

If '#@!*@!*' is your pattern for matching libcurl or curl, then sure, libcurl 
certainly has no problem at all sending as many requests as you like 
back-to-back.

The rewinding business is only really necessary for multipass authentication 
when Expect: 100-continue doesn't work (and thus libcurl has started to send 
data that the server will discard and that therefore needs to be sent again). 
And that's not something you can blame "the #@!*@!* library" for, but rather 
your server end and/or how HTTP is defined to work.

-- 

  / daniel.haxx.se

^ permalink raw reply

* Re: [RFC PATCH 6/8] Remove special casing of http, https and ftp
From: Shawn O. Pearce @ 2009-12-01 18:24 UTC (permalink / raw)
  To: Ilari Liusvaara; +Cc: git
In-Reply-To: <1259675838-14692-7-git-send-email-ilari.liusvaara@elisanet.fi>

Ilari Liusvaara <ilari.liusvaara@elisanet.fi> wrote:
> HTTP, HTTPS and FTP are no longer special to transport code. Also
> add support for FTPS (curl supports it so it is easy).
...
> diff --git a/Makefile b/Makefile
> index 42744a4..be0be87 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1676,7 +1676,19 @@ git-http-push$X: revision.o http.o http-push.o $(GITLIBS)
>  	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
>  		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
>  
> -git-remote-curl$X: remote-curl.o http.o http-walker.o $(GITLIBS)
> +git-remote-http$X: remote-curl.o http.o http-walker.o $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> +		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> +
> +git-remote-https$X: remote-curl.o http.o http-walker.o $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> +		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> +
> +git-remote-ftp$X: remote-curl.o http.o http-walker.o $(GITLIBS)
> +	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
> +		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)
> +
> +git-remote-ftps$X: remote-curl.o http.o http-walker.o $(GITLIBS)
>  	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
>  		$(LIBS) $(CURL_LIBCURL) $(EXPAT_LIBEXPAT)

These should all be hardlinks to a single executable, not duplicate
relinks of the same object files.
  
-- 
Shawn.

^ permalink raw reply

* Re: [PATCH/RFC] Add a --bouquet option to git rev-list
From: Junio C Hamano @ 2009-12-01 18:21 UTC (permalink / raw)
  To: Nathan W. Panike; +Cc: Michael J Gruber, git
In-Reply-To: <d77df1110912010931l40472723v80ad675a92d23fa3@mail.gmail.com>

"Nathan W. Panike" <nathan.panike@gmail.com> writes:

>>> include_forks ()
>>> {
>>>     local head="$(git show -s --pretty=format:'%H' HEAD)";
>>>     echo "HEAD $(git for-each-ref --format='%(refname)' \
>>>       refs/heads refs/remotes | while read ref; do \
>>>       if test "$(git merge-base HEAD ${ref}^{commit})" != ""; \
>>>               then echo ${ref}; fi; done)"
>>> }

Because you have to traverse the entire history from tips of refs to know
if the histories to reach them are disjoint, this is fundamentally a very
expensive operation and will not scale to projects with deep histories.

If a low-level support for this kind of thing is necessary, then I do not
think it should just be "give me set of refs that is related to HEAD".  I
suspect that is too inflexible to be useful in other situations.

A command to list refs (i.e. not as rev-list argument that shows list of
commits, but as a new feature of for-each-ref) with new criteria might
have wider use (I am just thinking aloud).  Something like

 - "among these refs (you would specify this with --all, --heads, or prefix
   'refs/heads refs/remotes'), list only the ones related to this and that
   ref (here you would give HEAD or whatever you want to check with as
   argument)"; and

 - its counterpart "list the ones that are _not_ related" with the same
   input.

As to the implementation, instead of running get_merge_bases() a number of
times (a naive implementation would be O(n*m), I guess), I think it may
make sense to run the traversal in parallel, similar to the way it is done
in show-branch (but the termination condition would be different).
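
A rough sketch of what such a parallel traversal could look like, in
Python on a toy history rather than in git's C code. Everything here is
an assumption made for illustration: the commits table, the ref names,
and the related_to() helper. It gives each tip one flag bit and pushes the
bits down to parents in a single newest-first walk, so a ref is related
to the target if some commit ends up carrying both bits. It relies on
every child having a strictly newer date than its parents, and it walks
the whole history once instead of stopping early:

```python
import heapq

# Hypothetical toy history: commit -> (date, parents).
# "r" and "x" are two unrelated root commits.
commits = {
    "r": (1, []), "a": (2, ["r"]), "b": (3, ["a"]), "c": (3, ["a"]),
    "x": (1, []), "y": (2, ["x"]),
}
tips = {"HEAD": "b", "refs/heads/topic": "c", "refs/heads/unrelated": "y"}


def related_to(target, tips):
    """Return the refs in `tips` that share some history with `target`."""
    bit = {name: 1 << i for i, name in enumerate(tips)}
    flags = {}   # commit -> OR of the bits of all tips that reach it
    heap = []    # max-heap on date, so children are handled before parents
    for name, commit in tips.items():
        if commit not in flags:
            heapq.heappush(heap, (-commits[commit][0], commit))
        flags[commit] = flags.get(commit, 0) | bit[name]

    related = 0  # bits seen together with the target's bit on some commit
    while heap:
        _, commit = heapq.heappop(heap)
        mask = flags[commit]
        if mask & bit[target]:
            related |= mask
        for parent in commits[commit][1]:
            if parent not in flags:
                flags[parent] = 0
                heapq.heappush(heap, (-commits[parent][0], parent))
            flags[parent] |= mask
    return sorted(n for n in tips if n != target and related & bit[n])


print(related_to("HEAD", tips))
```

Each commit is visited once regardless of the number of refs, which is
the gain over calling a merge-base computation per ref. A real
implementation would still need a termination condition and would have to
cope with clock skew in commit dates.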

^ permalink raw reply

* Re: [PATCH] get_ref_states: strdup entries and free util in stale  list
From: Bert Wesarg @ 2009-12-01 18:20 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Jay Soffian, git
In-Reply-To: <36ca99e90912010105r428a7bfdw63928e8a5515bd1d@mail.gmail.com>

On Tue, Dec 1, 2009 at 10:05, Bert Wesarg <bert.wesarg@googlemail.com> wrote:
> There are still invalid reads of size 4. I think the problem is the
> flex array member of 'struct ref' and strlen(). If its worth I can
> look into this.
A short heads-up, here is the valgrind error I get for this invalid read:

==27305== Invalid read of size 4
==27305==    at 0x4936AF: copy_ref (remote.c:870)
==27305==    by 0x4942E4: get_fetch_map (remote.c:1271)
==27305==    by 0x44473E: get_remote_ref_states (builtin-remote.c:271)
==27305==    by 0x446DCE: cmd_remote (builtin-remote.c:1022)
==27305==    by 0x4045F0: handle_internal_command (git.c:257)
==27305==    by 0x404B8F: main (git.c:482)
==27305==  Address 0x5b5ba38 is 104 bytes inside a block of size 107 alloc'd
==27305==    at 0x4C24477: calloc (vg_replace_malloc.c:418)
==27305==    by 0x4B09AD: xcalloc (wrapper.c:75)
==27305==    by 0x493924: alloc_ref_with_prefix (remote.c:853)
==27305==    by 0x46653B: get_remote_heads (connect.c:96)
==27305==    by 0x4A9347: get_refs_via_connect (transport.c:453)
==27305==    by 0x4A7F14: transport_get_remote_refs (transport.c:895)
==27305==    by 0x4445B6: get_remote_ref_states (builtin-remote.c:810)
==27305==    by 0x446DCE: cmd_remote (builtin-remote.c:1022)
==27305==    by 0x4045F0: handle_internal_command (git.c:257)
==27305==    by 0x404B8F: main (git.c:482)

^ permalink raw reply

* Re: [PATCH] get_ref_states: strdup entries and free util in stale  list
From: Bert Wesarg @ 2009-12-01 18:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Jay Soffian, git
In-Reply-To: <7v8wdm1ui1.fsf@alter.siamese.dyndns.org>

On Tue, Dec 1, 2009 at 18:20, Junio C Hamano <gitster@pobox.com> wrote:
> Bert Wesarg <bert.wesarg@googlemail.com> writes:
>
>>>  - The ref abbrev_branch() is called and the address of whose substring is
>>>   taken to be used as "name" in handle_one_branch() is refspec.src, but
>>>   what goes to .util is refname that is refspec.dst---they are different
>>>   strings and one is not a substring of the other.
>> I don't see you point here.
>
> Of course you don't ;-) because we were looking at different versions.
>
> I wanted to apply the same fix to both maint and master.  For the code in
> 'master' your observation is 100% correct.
A quick test with my use case does not show errors in the maint
branch. So it should not be needed (except the memory leak fix of the
.util member). And valgrind confirms this.

Bert

>

^ permalink raw reply

* Re: multiple working directories for long-running builds (was: "git merge" merges too much!)
From: Greg A. Woods @ 2009-12-01 17:59 UTC (permalink / raw)
  To: The Git Mailing List; +Cc: Dmitry Potapov
In-Reply-To: <20091201054734.GB11235@dpotapov.dyndns.org>

[-- Attachment #1: Type: text/plain, Size: 2257 bytes --]

At Tue, 1 Dec 2009 08:47:34 +0300, Dmitry Potapov <dpotapov@gmail.com> wrote:
Subject: Re: "git merge" merges too much!
> 
> On Mon, Nov 30, 2009 at 07:24:14PM -0500, Greg A. Woods wrote:
> > 
> > Things get even weirder if you happen to be playing with older branches
> > too -- most build tools don't have ability to follow files that go back
> > in time as they assume any product files newer than the sources are
> > already up-to-date, no matter how much older the sources might become on
> > a second build.
> 
> No, files do not go back in time when you switch between branches. The
> timestamp on files is the time when they are written to your working
> tree

Hmmm, I didn't really say anything in particular about file timestamps
-- I meant the file content may go back in time.  More correctly I
should have said that the file content may become inconsistent with the
state of other files that have just been compiled.

If the timestamps are not set back to commit time, but are simply
updated to the time each change is written to a working file (which,
as you said, is to be expected), regardless of whether the content
goes back in time or not, then this may or may not help a currently
running build figure out what really needs to be re-compiled.  It
likely won't help even a recursive-make style update build, and it
certainly won't help one where all build actions are determined before
any of them are started.

If the content of one or more files goes back in time to an earlier
state while the compile is happening then ultimately the result must be
considered to be undefined.  The best you can hope for is a break in the
compile.
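
To make the timestamp rule concrete, here is a minimal Python sketch,
assuming a make-style comparison of modification times.  The function
names and the timestamps are hypothetical and are not taken from any
real build tool; the point is only that a plan computed before a
checkout cannot see the files the checkout rewrites afterwards.

```python
def needs_rebuild(target_mtime, source_mtimes):
    """Make-style rule: rebuild when the target is missing or any source is newer."""
    if target_mtime is None:
        return True
    return any(m > target_mtime for m in source_mtimes)


def plan_build(targets):
    """Decide every action up front, as a non-recursive build would.

    `targets` maps a target name to (target_mtime, [source_mtimes]).
    """
    return [name for name, (t, srcs) in sorted(targets.items())
            if needs_rebuild(t, srcs)]


# Hypothetical timestamps (seconds); the names are illustrative only.
before = {"a.o": (100, [90]), "b.o": (100, [95])}
plan = plan_build(before)      # nothing looks stale when the build starts

# A checkout rewrites a.c while the build runs: new mtime, older content.
after = {"a.o": (100, [120]), "b.o": (100, [95])}
replan = plan_build(after)     # only a fresh plan would notice a.o is stale
```

The first plan is what a build with pre-determined actions keeps
using, so the rewritten source is never picked up by that run.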

This is why I agreed with you that a build should never be done in a
working directory where any file editing or VCS action is occurring
simultaneously.

I just disagreed that "git archive" was a reasonable alternative to
leaving the working directory alone during the entire time of the build.
It is not really reasonable for large projects any more than stopping
all work on the sources is reasonable.

-- 
						Greg A. Woods
						Planix, Inc.

<woods@planix.com>       +1 416 218 0099        http://www.planix.com/



* Re: [PATCH/RFC] Allow curl to rewind the RPC read buffer
From: Junio C Hamano @ 2009-12-01 17:49 UTC (permalink / raw)
  To: Martin Storsjö
  Cc: Tay Ray Chuan, git, Nicholas Miell, gsky51, Clemens Buchacher,
	Mark Lodato, Johannes Schindelin
In-Reply-To: <alpine.DEB.2.00.0912011232450.5582@cone.home.martin.st>

Martin Storsjö <martin@martin.st> writes:

> As long as the current rpc read buffer is the first one, we're able to
> rewind without need for additional buffering.

... and if the current buffer isn't the first one, what do we do?

> +#ifndef NO_CURL_IOCTL
> +curlioerr rpc_ioctl(CURL *handle, int cmd, void *clientp)
> +{
> +	struct rpc_state *rpc = clientp;
> +
> +	switch (cmd) {
> +	case CURLIOCMD_NOP:
> +		return CURLIOE_OK;
> +
> +	case CURLIOCMD_RESTARTREAD:
> +		if (rpc->initial_buffer) {
> +			rpc->pos = 0;
> +			return CURLIOE_OK;
> +		}
> +		fprintf(stderr, "Unable to rewind rpc post data - try increasing http.postBuffer\n");
> +		return CURLIOE_FAILRESTART;
> +
> +	default:
> +		return CURLIOE_UNKNOWNCMD;
> +	}
> +}
> +#endif

What will this result in?  A failed request, after which the user
increases http.postBuffer and re-runs the entire command?  I am not
suggesting the code should do it differently (e.g. retry with a larger
buffer without requiring the user's help).  At least not yet.  That is
why my first question above was "what do we do?" and not "what should
we do?".

I am primarily interested in _documenting_ the expected user experience in
the failure case, so that people can notice the message, run "git grep" to
find the above line and then run "git blame" to find the commit to read
its log message to understand what is going on.
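
For the record, the two outcomes of the quoted callback can be modelled
in a few lines.  This is only a Python sketch of the control flow in
the patch above, not libcurl code: the string constants stand in for
the CURLIOCMD_* and CURLIOE_* values, and RpcState is a hypothetical
reduction of struct rpc_state to the two fields the callback touches.

```python
from dataclasses import dataclass

# Stand-ins for libcurl's CURLIOCMD_* / CURLIOE_* values; the strings are
# illustrative and are not the real constants.
NOP, RESTART_READ = "nop", "restart_read"
OK, FAIL_RESTART, UNKNOWN_CMD = "ok", "fail_restart", "unknown_cmd"


@dataclass
class RpcState:
    pos: int = 0                  # read offset into the current buffer
    initial_buffer: bool = True   # True while the first buffer is still held


def rpc_ioctl(rpc, cmd):
    """Mirror the control flow of the quoted rpc_ioctl()."""
    if cmd == NOP:
        return OK
    if cmd == RESTART_READ:
        if rpc.initial_buffer:
            rpc.pos = 0           # rewind: resend from the start of the buffer
            return OK
        return FAIL_RESTART       # earlier data is gone; the request fails
    return UNKNOWN_CMD


# One request still on its first buffer, one that has moved past it.
small = RpcState(pos=512, initial_buffer=True)
large = RpcState(pos=512, initial_buffer=False)
```

In this model a restart on `small` succeeds and resets its offset,
while a restart on `large` fails and leaves the state untouched, which
is the case where the user sees the http.postBuffer message.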


* Re: [PATCH/RFC] Add a --bouquet option to git rev-list
From: Nathan W. Panike @ 2009-12-01 17:31 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git
In-Reply-To: <4B14CF47.5020808@drmicha.warpmail.net>

Hello,

On Tue, Dec 1, 2009 at 2:09 AM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Nathan W. Panike venit, vidit, dixit 30.11.2009 21:55:
>> Add a command line option to rev-list so the command 'git rev-list --bouquet'
>> shows all revisions that are ancestors of refs which share history with HEAD.
>>
>> Signed-off-by: Nathan W. Panike <nathan.panike@gmail.com>
>> ---
>> I have a repository with the following structure:
>>
>>       B
>>      /
>> A'--A--C
>>      \
>>       D
>>
>> E'--E
>>
>> Thus the command 'git merge-base E A' returns nothing, as there is no common
>> history.  The E history contains stuff that is derived from the other history
>> (A, B, C, or D).  Often I find myself doing the following:
>
> Either I don't understand the diagram or your term "derived". If
> "derived" means "on some branch of a merge" and E is derived from A, B,
> C, or D, then (since B, C, D is derived from A, and from A') E is
> derived from A', and they will have a merge base.
>

"Derived" in my case means that E is processed from a snapshot of the
tree at, say, A.

> Are these diagrams really disconnected from each other?

Yes.  I started the history of E with plumbing, using git commit-tree
without a -p flag to specify a parent.

>
>> git checkout C
>> gitk $(include_forks) &
>> <View history, make changes, merges, et cetera>
>> git checkout E
>> <go back to gitk, only see history for B, C, etc>
>>
>> Now the 'include_forks' command is a bash function in my .bashrc:
>>
>> include_forks ()
>> {
>>     local head="$(git show -s --pretty=format:'%H' HEAD)";
>>     echo "HEAD $(git for-each-ref --format='%(refname)' \
>>       refs/heads refs/remotes | while read ref; do \
>>       if test "$(git merge-base HEAD ${ref}^{commit})" != ""; \
>>               then echo ${ref}; fi; done)"
>> }
>>
>> The shell thus intercepts my command and I must restart gitk to see the history
>> of E.
>>
>> With this patch, I can issue the command 'gitk --bouquet' and when I checkout
>> E, I can 'reload' in gitk and see the history of E automatically.
>
> What would your patch do in the example you gave above? Which refs would
> it cause gitk (rev-list) to show?
>

I wish to be concrete, so let us suppose you use a default clone of
git.git.  Further, suppose you are on origin/master.
Then, with my patch,

git rev-list --bouquet

should be an---admittedly less efficient---equivalent to

git rev-list --all --not refs/remotes/origin/html \
    refs/remotes/origin/man refs/remotes/origin/todo
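
The selection rule itself can be sketched on a toy graph.  The Python
below is not the patch and does not call git; it assumes that "shares
history with HEAD" means the ancestor sets intersect, which is the same
condition as 'git merge-base' returning something.  The ref names b, c,
d and e are made up for the diagram quoted earlier.

```python
def ancestors(parents, commit):
    """Return `commit` plus everything reachable through parent links."""
    seen, stack = set(), [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def bouquet(parents, refs, head):
    """Names of refs whose history intersects that of `head`."""
    head_history = ancestors(parents, head)
    return sorted(name for name, tip in refs.items()
                  if ancestors(parents, tip) & head_history)


# Toy version of the diagram: A' <- A <- {B, C, D}, and a disconnected E' <- E.
parents = {"A": ["A'"], "B": ["A"], "C": ["A"], "D": ["A"], "E": ["E'"]}
refs = {"b": "B", "c": "C", "d": "D", "e": "E"}

on_c = bouquet(parents, refs, "C")   # refs sharing history with C
on_e = bouquet(parents, refs, "E")   # refs sharing history with E
```

With HEAD at C the sketch selects b, c and d; with HEAD at E it selects
only e, which is why reloading after a checkout changes what is shown.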

> Michael
>

Thanks,

Nathan Panike


