* Re: [PATCH] commit: suggest --amend --reset-author to fix commiter identity
From: Jonathan Nieder @ 2011-01-07 17:13 UTC (permalink / raw)
To: Matthieu Moy; +Cc: git, gitster
In-Reply-To: <1294409671-5479-1-git-send-email-Matthieu.Moy@imag.fr>
Matthieu Moy wrote:
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -47,7 +47,11 @@ static const char implicit_ident_advice[] =
> "\n"
> "If the identity used for this commit is wrong, you can fix it with:\n"
> "\n"
> -" git commit --amend --author='Your Name <you@example.com>'\n";
> +" git commit --amend --author='Your Name <you@example.com>'\n"
> +"\n"
> +"or\n"
> +"\n"
> +" git commit --amend --reset-author\n";
>
This message gets used when the author was set from gecos because it was
not available in the configuration. Do you perhaps mean:
If the identity used for this commit is wrong, you can fix it with:
git commit --amend --author='Your Name <you@example.com>'
or
cat <<EOF >>~/.gitconfig
[user]
name = Your Name
email = you@example.com
EOF
git commit --amend --reset-author
?
* Re: [PATCH] t9010: svnadmin can fail even if available
From: Ramkumar Ramachandra @ 2011-01-07 16:58 UTC (permalink / raw)
To: Jonathan Nieder; +Cc: Junio C Hamano, A Large Angry SCM, Git Mailing List
In-Reply-To: <20110107013159.GA23280@burratino>
Hi,
Jonathan Nieder writes:
> To do the same for t91* would be impossible. If svn is broken or not
> installed, svn-fe will run fine, but "git svn" will not. On the other
> hand, if svnadmin were broken but svn still worked, "git svn" would be
> fine but that would be quite strange and I do not think it is worth
> spending time to prepare for.
I don't think it's worth spending time preparing for every conceivable
breakage. A few more examples of possible breakages I've
encountered:
- APR compiled without threading support, SVN compiled with it, or
vice versa.
- SVN is compiled against GNU iconv, but apr-iconv installed, or
vice versa.
- Two different versions of a dependent library are installed, and SVN
links to a different version in a different location.
One or many components of SVN may fail. So, I'm in favor of the
current approach: if SVN is installed, attempt to run all the t91*
tests. Any failure can either be interpreted as a real test failure or
malformed SVN installation.
-- Ram
* Re: Resumable clone/Gittorrent (again)
From: Luke Kenneth Casson Leighton @ 2011-01-07 15:59 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Nguyen Thai Ngoc Duy, Git Mailing List
In-Reply-To: <alpine.LFD.2.00.1101061956470.22191@xanadu.home>
On Fri, Jan 7, 2011 at 3:21 AM, Nicolas Pitre <nico@fluxnic.net> wrote:
>> The last thing I like about these chains is that the number of chains
>> is reasonable. It won't increase too fast over time (as compared to
>> the number of commits). As such it maps well to BitTorrent's "pieces".
>
> My problem right now is that I don't see how this maps well to Git.
bittorrent as "just another file getting method" maps very well.
only with some modifications to the bittorrent protocol would the
concept map well to bittorrent "pieces" because the pieces are at
present a) fixed size b) defined by a heuristic based on the file size
(logarithmic) so that the number of pieces are kept to within a
reasonable limit.
bottom line: my take on this is (sorry to say, nguyen) that i don't
believe bittorrent "pieces" map well to the chains concept, unless the
chains are perfectly fixed identical sizes [which they could well be?
am i mistaken about this?]
l.
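To make the fixed-size constraint concrete, here is a minimal sketch in Python. The piece length and the chain sizes are made-up values, not taken from any real torrent or repository; the point is only that variable-size chains laid end to end do not line up with fixed-size piece boundaries.

```python
def piece_count(total_size: int, piece_length: int) -> int:
    """Number of fixed-size pieces needed to cover total_size bytes."""
    return -(-total_size // piece_length)  # integer ceiling division


def pieces_touched(offset: int, length: int, piece_length: int) -> range:
    """Indices of the fixed-size pieces that a byte range overlaps."""
    first = offset // piece_length
    last = (offset + length - 1) // piece_length
    return range(first, last + 1)


# Hypothetical chain sizes in bytes, concatenated into one payload.
chain_sizes = [300_000, 50_000, 700_000]
piece_length = 262_144  # 256 KiB, an assumed piece length

offset = 0
for size in chain_sizes:
    # Each chain usually shares a piece with its neighbour.
    print(list(pieces_touched(offset, size, piece_length)))
    offset += size
```

With these numbers every chain shares piece 1 with another chain, so a single piece cannot be verified or requested as "one chain" unless the chains happen to be exactly piece-sized.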
* Re: [PATCH/RFC] Documentation/checkout: explain behavior wrt local changes
From: r.ductor @ 2011-01-07 14:27 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Jonathan Nieder, git, Jeff King
In-Reply-To: <7vtyhl8t5y.fsf@alter.siamese.dyndns.org>
Dear all
I'm a beginner git user trying to defend the beginners' party. Using Jonathan's explanations (and some educated guesses about git behavior) I tried to concoct the man git-checkout documentation I would have liked to read, which is somewhat more explicit than Jonathan's patch. If you want to implement/improve it please double check it (and maybe warn me about my own errors). If you are happy about that I will try to send more suggestions. Thanks for your time and your work.
riccardo
git checkout [<branch>]
git checkout -b|-B <new_branch> [<start point>]
Tries to change the current HEAD to <branch> (or <start point>) and to safely update the working tree and the index.
If there exist files having three distinct contents in the current commit, in the new branch and in the working tree, then the operation is canceled and all state is preserved.
(This behavior may be changed with the option --merge.) If this is not the case, for all other files it acts as follows.
- Files not differing between the current commit and the new branch: their contents in the working tree (and possibly in the index) are left unchanged.
- Files in the current commit, unchanged or absent in the working tree, but not present in the new branch are removed from the working tree and from the index.
- Files in the new branch but not in the current commit are added to the working tree and to the index (if they were already there they must have had the same contents by hypothesis).
- Files in the working tree or in the index that are absent in the current commit and in the new branch are left unchanged.
If -b is given, ...
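The rules above can be sketched as a small decision function. This is a simplified model in Python, not git's actual implementation: it ignores the index, treats each file's content as an opaque string, and uses hypothetical names (`plan_checkout`, the action labels) chosen only for illustration.

```python
def plan_checkout(old, new, work):
    """Decide what to do with each path when switching branches.

    old, new, work map path -> content for the current commit, the new
    branch and the working tree; a missing key means the file is absent.
    Raises RuntimeError if a file has three distinct contents.
    """
    actions = {}
    for path in sorted(set(old) | set(new) | set(work)):
        o, n, w = old.get(path), new.get(path), work.get(path)
        if o == n:
            # The switch does not touch this file; local state is preserved.
            actions[path] = "keep"
        elif w == o:
            # No local change: safe to take the new branch's version.
            actions[path] = "update" if n is not None else "remove"
        elif w == n:
            # Working tree already matches the new branch.
            actions[path] = "keep"
        else:
            raise RuntimeError(f"local changes to {path} would be overwritten")
    return actions


old = {"a": "1", "b": "1", "c": "1"}
new = {"a": "1", "b": "2", "d": "1"}
work = {"a": "local edit", "b": "1", "c": "1", "e": "untracked"}
print(plan_checkout(old, new, work))
```

In this model the whole operation is planned before anything is written, which is what allows it to be cancelled with all state preserved.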
On Friday 07 January 2011 01:16:25 Junio C Hamano wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
>
> > 'git checkout' [<branch>]::
> > 'git checkout' -b|-B <new_branch> [<start point>]::
> >
> > - This form switches branches by updating the index, working
> > - tree, and HEAD to reflect the specified branch.
> > + This form switches branches by changing `HEAD` and updating the
> > + tracked files to the specified branch. 'git checkout' will
> > + stop without doing anything if local changes overlap with
> > + changes to the tracked files. (Any local changes that do not
> > + overlap with changes from `HEAD` to the specified branch will
> > + be preserved.)
>
* Re: concurrent fetches to update same mirror
From: Marc Branchaud @ 2011-01-07 14:51 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, Shawn Pearce, Neal Kreitzinger, git
In-Reply-To: <4D272834.9060001@xiplink.com>
On 11-01-07 09:50 AM, Marc Branchaud wrote:
> On 11-01-06 06:45 PM, Jeff King wrote:
>> On Wed, Jan 05, 2011 at 03:29:47PM -0800, Junio C Hamano wrote:
>>
>>> Jeff King <peff@peff.net> writes:
>>>
>>>> Interestingly, in the case of ref _creation_, not update, like this:
>>>>
>>>> mkdir repo && cd repo && git init
>>>> git remote add origin some-remote-repo-that-takes-a-few-seconds
>>>> xterm -e 'git fetch -v; read' & xterm -e 'git fetch -v; read'
>>>>
>>>> then both will happily update, the second one overwriting the results of
>>>> the first. It seems in the case of locking a ref which previously didn't
>>>> exist, we don't enforce that it still doesn't exist.
>>>
>>> We probably should, especially when there is no --force or +prefix is
>>> involved.
>>
>> Hmph. So I created the test below to try to exercise this, expecting to
>> see at least one failure: according to the above example, we aren't
>> actually checking "null sha1 means ref must not exist", so we should get
>> an erroneous success for that case. And there is the added complication
>> that the null sha1 may also mean "don't care what the old one was". So
>> even if I changed the code, we would get erroneous failures the other
>> way.
>>
>> But much to my surprise, it actually passes with stock git. Which means
>> I need to dig a little further to see exactly what is going on.
>
> I should point out that the repository where I saw this issue is running git
> 1.7.1.
Oops -- sorry! I'm in the wrong concurrency thread...
M.
* Re: [msysGit] [PATCH/RFC] alias: use run_command api to execute aliases
From: Johannes Schindelin @ 2011-01-07 14:51 UTC (permalink / raw)
To: Erik Faye-Lund; +Cc: git, msysgit, j6t
In-Reply-To: <AANLkTi=6wG6khBAqLA8nki5-wbxQB-oYUAgMSqT-egpt@mail.gmail.com>
Hi,
On Fri, 7 Jan 2011, Erik Faye-Lund wrote:
> On Fri, Jan 7, 2011 at 2:17 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> > On Thu, 6 Jan 2011, Erik Faye-Lund wrote:
> >
> >> On Windows, system() executes with cmd.exe instead of /bin/sh. This
> >> means that aliases currently have to be batch-scripts instead of
> >> bourne-scripts. On top of that, cmd.exe does not handle single
> >> quotes, which is what the code-path currently uses to handle
> >> arguments with spaces.
> >>
> >> To solve both problems in one go, use run_command_v_opt() to execute
> >> the alias. It already does the right thing by prepending "sh -c " to the
> >> alias.
> >
> > Would this not break setups where aliases were defined to execute
> > batch scripts?
> >
> > If this is true, I'm of two minds here.
>
> It would indeed, but I wouldn't worry TOO much about it. We've clearly
> told the users that Git for Windows is a tool that you have to be
> willing to work on to use.
>
> But I'm kind of of two minds here myself, but for a slightly different
> reason: I think Git for Windows SHOULD use cmd.exe to execute scripts.
> We should be able to lose the msys-environment and still have the basic
> functionality working. In that sense, this is a step in the wrong
> direction. But I'd rather have all code use the same code-path to
> execute scripts, and make a big switch to cmd.exe together with porting
> all supplied scripts to batch-files some time in the future.
Okay, strike my objections, I agree now. Feel free to apply!
Thanks,
Dscho
* Re: concurrent fetches to update same mirror
From: Marc Branchaud @ 2011-01-07 14:50 UTC (permalink / raw)
To: Jeff King; +Cc: Junio C Hamano, Shawn Pearce, Neal Kreitzinger, git
In-Reply-To: <20110106234512.GA17231@sigill.intra.peff.net>
On 11-01-06 06:45 PM, Jeff King wrote:
> On Wed, Jan 05, 2011 at 03:29:47PM -0800, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>
>>> Interestingly, in the case of ref _creation_, not update, like this:
>>>
>>> mkdir repo && cd repo && git init
>>> git remote add origin some-remote-repo-that-takes-a-few-seconds
>>> xterm -e 'git fetch -v; read' & xterm -e 'git fetch -v; read'
>>>
>>> then both will happily update, the second one overwriting the results of
>>> the first. It seems in the case of locking a ref which previously didn't
>>> exist, we don't enforce that it still doesn't exist.
>>
>> We probably should, especially when there is no --force or +prefix is
>> involved.
>
> Hmph. So I created the test below to try to exercise this, expecting to
> see at least one failure: according to the above example, we aren't
> actually checking "null sha1 means ref must not exist", so we should get
> an erroneous success for that case. And there is the added complication
> that the null sha1 may also mean "don't care what the old one was". So
> even if I changed the code, we would get erroneous failures the other
> way.
>
> But much to my surprise, it actually passes with stock git. Which means
> I need to dig a little further to see exactly what is going on.
I should point out that the repository where I saw this issue is running git
1.7.1.
M.
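The ambiguity Jeff describes (a null sha1 meaning either "ref must not exist" or "don't care") can be illustrated with a toy model. This is a sketch in Python under my own assumptions, not git's refs code: `RefStore` and its `update` method are hypothetical, and the model is single-threaded, so it shows only the check, not the locking.

```python
NULL_SHA1 = "0" * 40


class RefStore:
    """Toy ref store with a compare-and-swap style update."""

    def __init__(self):
        self.refs = {}

    def update(self, name, new, expected_old=None):
        """Set ref `name` to `new`.

        expected_old=None      -> don't care what the old value was
        expected_old=NULL_SHA1 -> the ref must not exist yet
        anything else          -> the ref must currently have that value
        """
        current = self.refs.get(name, NULL_SHA1)
        if expected_old is not None and current != expected_old:
            return False  # someone else got there first
        self.refs[name] = new
        return True


store = RefStore()
# Two fetches both believe they are creating the ref.
print(store.update("refs/remotes/origin/master", "a" * 40, NULL_SHA1))
print(store.update("refs/remotes/origin/master", "b" * 40, NULL_SHA1))
```

Keeping "don't care" distinct from the null sha1 is what lets the second creation fail without also breaking callers that genuinely do not care about the old value.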
* [PATCH] commit: suggest --amend --reset-author to fix commiter identity
From: Matthieu Moy @ 2011-01-07 14:14 UTC (permalink / raw)
To: git, gitster; +Cc: Matthieu Moy
The advantage of this command is that it is cut-and-paste ready, while
using --author='...' requires the user to type his name and email a
second time.
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
---
builtin/commit.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/builtin/commit.c b/builtin/commit.c
index 22ba54f..440223c 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -47,7 +47,11 @@ static const char implicit_ident_advice[] =
"\n"
"If the identity used for this commit is wrong, you can fix it with:\n"
"\n"
-" git commit --amend --author='Your Name <you@example.com>'\n";
+" git commit --amend --author='Your Name <you@example.com>'\n"
+"\n"
+"or\n"
+"\n"
+" git commit --amend --reset-author\n";
static const char empty_amend_advice[] =
"You asked to amend the most recent commit, but doing so would make\n"
--
1.7.4.rc0.1.g6944b.dirty
* Re: [msysGit] [PATCH/RFC] alias: use run_command api to execute aliases
From: Erik Faye-Lund @ 2011-01-07 14:24 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git, msysgit, j6t
In-Reply-To: <alpine.DEB.1.00.1101070216390.1542@bonsai2>
On Fri, Jan 7, 2011 at 2:17 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 6 Jan 2011, Erik Faye-Lund wrote:
>
>> On Windows, system() executes with cmd.exe instead of /bin/sh. This
>> means that aliases currently have to be batch-scripts instead of
>> bourne-scripts. On top of that, cmd.exe does not handle single quotes,
>> which is what the code-path currently uses to handle arguments with
>> spaces.
>>
>> To solve both problems in one go, use run_command_v_opt() to execute
>> the alias. It already does the right thing by prepending "sh -c " to the
>> alias.
>
> Would this not break setups where aliases were defined to execute batch
> scripts?
>
> If this is true, I'm of two minds here.
>
It would indeed, but I wouldn't worry TOO much about it. We've clearly
told the users that Git for Windows is a tool that you have to be
willing to work on to use.
But I'm kind of of two minds here myself, but for a slightly different
reason: I think Git for Windows SHOULD use cmd.exe to execute scripts.
We should be able to lose the msys-environment and still have the
basic functionality working. In that sense, this is a step in the
wrong direction. But I'd rather have all code use the same code-path
to execute scripts, and make a big switch to cmd.exe together with
porting all supplied scripts to batch-files some time in the future.
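For readers unfamiliar with the "sh -c" approach discussed in this thread, here is a rough Python sketch of how an argument vector for a shell alias could be built so that arguments containing spaces survive without any single-quote escaping. The helper name `shell_alias_argv` is hypothetical, and the exact argv layout is my assumption about the general technique, not a copy of what run_command_v_opt() does.

```python
def shell_alias_argv(alias_cmd, args):
    """Build an argv that runs alias_cmd through sh with args passed intact.

    The alias text is followed by "$@", so the shell expands the extra
    arguments itself and no quoting of individual arguments is needed.
    The repeated alias_cmd fills $0 for the shell.
    """
    return ["sh", "-c", alias_cmd + ' "$@"', alias_cmd] + list(args)


# An argument with a space stays a single argv element.
print(shell_alias_argv("echo", ["hello world", "x"]))
```

Because the arguments travel as separate argv elements rather than as one quoted command line, the result does not depend on how cmd.exe or any other command interpreter treats single quotes.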
* Re: Status of the svn remote helper project (Jan 2011, #1)
From: David Michael Barr @ 2011-01-07 14:00 UTC (permalink / raw)
To: Jonathan Nieder
Cc: git, Ramkumar Ramachandra, Sverre Rabbelier, Sam Vilain,
Stephen Bash, Tomas Carnecky
In-Reply-To: <20110105233915.GB22975@burratino>
Hi,
Jonathan wrote:
> Let's get svn-fe3 polished so when the next merge window comes around
> it is ready to be merged quickly.
In short, I have tested the latest rollup and see no regressions when testing
against the ASF and KDE repos.
Thank you Jonathan for the hard work, the outstanding patches are
becoming quite numerous.
--
David Barr.
* Re: Wish: make commiter email address configurable per-repo
From: Thomas Rast @ 2011-01-07 13:43 UTC (permalink / raw)
To: Stephen Kelly; +Cc: git
In-Reply-To: <ig7449$lbg$2@dough.gmane.org>
Stephen Kelly wrote:
> Thomas Rast wrote:
> > See user.email in git-config(1). Most people set it globally, as in
> >
> > git config --global user.email "author@example.com"
> >
> > but there's nothing stopping you from doing
> >
> > git config user.email "alias@example.com"
> >
> > to set it on a per-repo level. (Or just edit .git/config, of course.)
>
> Doesn't this set both the author and the committer?
Stephen Kelly wrote earlier:
> If my email address that I use for committing is not the same as that
> configured in the bugzilla, the automated bug closing does not work.
Oh, I see. Yes, it does.
Probably if KDE has this use-case then that means we need to implement
it as a feature on size alone, but I briefly looked into the code and
it requires a bit more restructuring than I'm willing to do over
coffee.
I think as a stop-gap measure you'll have to use an alias such as
ci = commit --author="your usual <author>"
along with a local setting for user.email to force them to be
different. (Note that this will re-set the author when saying 'git
ci --amend' on other people's commits!)
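The reason a single user.email setting affects both identities can be sketched as a lookup function. This is a simplified Python model: `resolve_email` is a hypothetical helper, the dictionaries stand in for the environment and the two config files, and other fallbacks that git has are left out.

```python
def resolve_email(role, env, local_cfg, global_cfg):
    """Pick the email for role 'AUTHOR' or 'COMMITTER'.

    The role-specific environment variable wins; otherwise the same
    user.email key is used for both roles, repo-level before global.
    """
    var = f"GIT_{role}_EMAIL"
    if var in env:
        return env[var]
    for cfg in (local_cfg, global_cfg):
        if "user.email" in cfg:
            return cfg["user.email"]
    return None


global_cfg = {"user.email": "author@example.com"}
local_cfg = {"user.email": "alias@example.com"}

# Per-repo setting: author and committer both change.
print(resolve_email("AUTHOR", {}, local_cfg, global_cfg))
print(resolve_email("COMMITTER", {}, local_cfg, global_cfg))
```

In this model the only way to make the two differ is a role-specific override, which is what the `--author` alias above approximates from the command line.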
--
Thomas Rast
trast@{inf,student}.ethz.ch
* Re: Wish: make commiter email address configurable per-repo
From: Stephen Kelly @ 2011-01-07 13:23 UTC (permalink / raw)
To: git
In-Reply-To: <201101071420.40570.trast@student.ethz.ch>
Thomas Rast wrote:
> Stephen Kelly wrote:
>> So for some git repos in KDE which I work on on work time, I'd like to
>> set a different committer address. I can't just set GIT_COMMITTER_EMAIL
>> or whatever in my bashrc, because in other repos I want to use a
>> different committer email, and don't want it set globally for all git
>> repos I work on.
>>
>> This doesn't seem to be configurable in git config. Can that be changed?
>
> See user.email in git-config(1). Most people set it globally, as in
>
> git config --global user.email "author@example.com"
>
> but there's nothing stopping you from doing
>
> git config user.email "alias@example.com"
>
> to set it on a per-repo level. (Or just edit .git/config, of course.)
>
Doesn't this set both the author and the committer?
* Re: Wish: make commiter email address configurable per-repo
From: Thomas Rast @ 2011-01-07 13:20 UTC (permalink / raw)
To: Stephen Kelly; +Cc: git
In-Reply-To: <ig73o1$lbg$1@dough.gmane.org>
Stephen Kelly wrote:
> So for some git repos in KDE which I work on on work time, I'd like to set a
> different committer address. I can't just set GIT_COMMITTER_EMAIL or
> whatever in my bashrc, because in other repos I want to use a different
> committer email, and don't want it set globally for all git repos I work on.
>
> This doesn't seem to be configurable in git config. Can that be changed?
See user.email in git-config(1). Most people set it globally, as in
git config --global user.email "author@example.com"
but there's nothing stopping you from doing
git config user.email "alias@example.com"
to set it on a per-repo level. (Or just edit .git/config, of course.)
--
Thomas Rast
trast@{inf,student}.ethz.ch
* Wish: make commiter email address configurable per-repo
From: Stephen Kelly @ 2011-01-07 13:16 UTC (permalink / raw)
To: git
Hi,
In KDE the committer email address is used to be able to use keywords in
commit messages to automatically close bugs.
If my email address that I use for committing is not the same as that
configured in the bugzilla, the automated bug closing does not work.
So for some git repos in KDE which I work on on work time, I'd like to set a
different committer address. I can't just set GIT_COMMITTER_EMAIL or
whatever in my bashrc, because in other repos I want to use a different
committer email, and don't want it set globally for all git repos I work on.
This doesn't seem to be configurable in git config. Can that be changed?
All the best,
Steve.
* bug: git fetch via http breaks repo on Ctrl-C
From: Uwe Kleine-König @ 2011-01-07 11:43 UTC (permalink / raw)
To: git; +Cc: Sascha Hauer
Hello,
It has now happened for the third time that a repository was corrupted
after interrupting a git fetch from an http remote. One of the occurrences
(together with the needed fix) can be found here:
http://article.gmane.org/gmane.linux.kernel.next/14355
It would be nice to get that fixed.
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* bug in gitk: history moves right when scrolling up and down with mouse wheel
From: Uwe Kleine-König @ 2011-01-07 10:55 UTC (permalink / raw)
To: git
Hello,
I don't know yet how to reliably trigger that, but it feels scary.
If that helps, it happens with the view
{karo {} ^linus/master {git for-each-ref --format='%(refname)' refs/remotes/customers/karo refs/heads/karo}}
If I knew how to record a video of my screen, I'd do this. Maybe
someone knows? Maybe this report is already enough?
Happens with Debian's git 1:1.7.2.3-2.2.
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* bug? in checkout with ambiguous refnames
From: Uwe Kleine-König @ 2011-01-07 10:46 UTC (permalink / raw)
To: git
Hello,
everything is clean:
ukl@octopus:~/gsrc/linux-2.6$ git diff; git diff --cached
ukl@octopus:~/gsrc/linux-2.6$ git checkout sgu/mxs-amba-uart
warning: refname 'sgu/mxs-amba-uart' is ambiguous.
Previous HEAD position was c13fb17... Merge commit '65e29a85a419f9a161ab0f09f9d69924e36d940e' into HEAD
Switched to branch 'sgu/mxs-amba-uart'
OK, it might be a bad idea to have this ambiguity, still ...
ukl@octopus:~/gsrc/linux-2.6$ git diff; git diff --cached --stat
arch/arm/mach-mxs/Kconfig | 2 ++
arch/arm/mach-mxs/clock-mx23.c | 2 +-
arch/arm/mach-mxs/clock-mx28.c | 2 +-
arch/arm/mach-mxs/devices-mx23.h | 2 +-
arch/arm/mach-mxs/devices-mx28.h | 2 +-
arch/arm/mach-mxs/devices.c | 17 ++---------------
arch/arm/mach-mxs/devices/Kconfig | 1 -
arch/arm/mach-mxs/devices/amba-duart.c | 15 +++++++--------
arch/arm/mach-mxs/include/mach/devices-common.h | 4 +---
9 files changed, 16 insertions(+), 31 deletions(-)
ukl@octopus:~/gsrc/linux-2.6$ git diff refs/tags/sgu/mxs-amba-uart
ukl@octopus:~/gsrc/linux-2.6$ git diff --cached refs/tags/sgu/mxs-amba-uart
So working copy and cache are at refs/tags/sgu/mxs-amba-uart, HEAD
points to refs/heads/sgu/mxs-amba-uart
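One possible reading of the mismatch, sketched in Python: generic revision parsing tries a fixed list of locations in order (the list below follows the gitrevisions documentation), so a short name that exists both as a tag and as a branch resolves to the tag, while checkout treats the same name as a branch to switch to. The helper `candidates` is hypothetical and the explanation itself is an assumption, not a confirmed diagnosis.

```python
# Lookup order for a short refname, as documented in gitrevisions.
RULES = [
    "{}",
    "refs/{}",
    "refs/tags/{}",
    "refs/heads/{}",
    "refs/remotes/{}",
    "refs/remotes/{}/HEAD",
]


def candidates(name, refs):
    """All full refnames that the short name could mean, in lookup order."""
    return [rule.format(name) for rule in RULES if rule.format(name) in refs]


refs = {
    "refs/tags/sgu/mxs-amba-uart",
    "refs/heads/sgu/mxs-amba-uart",
}
found = candidates("sgu/mxs-amba-uart", refs)
print("ambiguous:", len(found) > 1)
print("generic lookup picks:", found[0])
```

If the tree is populated from the first match while HEAD is pointed at the branch, the result would look like the state shown above.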
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
* Re: weird github capitalization problem
From: Andreas Stricker @ 2011-01-07 9:17 UTC (permalink / raw)
To: bolfo; +Cc: git
In-Reply-To: <1294146242606-5888573.post@n2.nabble.com>
Am 04.01.11 14:04, schrieb bolfo:
> I first installed everything on my laptop, coded some stuff and then pushed
> to github. Apparently something went wrong because there was a new
> directory, while at first the directory was OurProjectsources, there now was
> a new directory called OurProjectSources. Weird since my local directory has
> the s not capitalized.
> I work on a windows PC while the original author works on a Mac, could this
> be the problem?
Yes, the Mac OS X HFS+ filesystem ignores case by default (you'll need
to reformat to change this). So OurProjectSources and OurProjectsources
both refer to the same directory on Mac OS X. On Linux they are two
different directories.
This frequently causes issues here too. An example:
me@mac:t $ git init r
Initialized empty Git repository in /private/tmp/t/r/.git/
me@mac:r (master) $ mkdir OurProjectsources
me@mac:r (master) $ touch OurProjectsources/a
me@mac:r (master) $ git add OurProjectsources/a
me@mac:r (master) $ git commit -m "initial import"
[master (root-commit) c2cb2f3] initial import
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 OurProjectsources/a
me@mac:r (master) $ mv OurProjectsources/ OurProjectSources
me@mac:r (master) $ touch OurProjectSources/b
me@mac:r (master) $ git add OurProjectSources/b
me@mac:r (master) $ git commit -m "added b"
[master 4de780c] added b
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 OurProjectSources/b
me@mac:r (master) $ git stat
# On branch master
nothing to commit (working directory clean)
me@mac:r (master) $ scp -r .git linux:t.git
me@mac:r (master) $ ssh linux
me@linux:~ $ git clone t.git/
Initialized empty Git repository in /home/me/t/.git/
me@linux:~ $ cd t
me@linux:~/t $ ls
OurProjectsources OurProjectSources
me@linux:~/t $ find *
OurProjectsources
OurProjectsources/a
OurProjectSources
OurProjectSources/b
And there it is, our mess. The mac user accidentally created
two different directories but didn't see them.
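A quick way to spot this kind of mess before it reaches a case-sensitive machine is to look for paths that differ only by case. The following Python sketch assumes you already have the list of tracked paths (for example from `git ls-files`); the function name `case_collisions` is made up for illustration.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of path prefixes that differ only by letter case."""
    spellings = defaultdict(set)
    for path in paths:
        parts = path.split("/")
        # Check every directory prefix as well as the full path.
        for i in range(1, len(parts) + 1):
            prefix = "/".join(parts[:i])
            spellings[prefix.casefold()].add(prefix)
    return sorted(sorted(group) for group in spellings.values() if len(group) > 1)


tracked = ["OurProjectsources/a", "OurProjectSources/b"]
print(case_collisions(tracked))
```

An empty result means no two tracked paths would land in the same place on a case-insensitive filesystem.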
~/Andy
* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Zenaan Harkness @ 2011-01-07 10:04 UTC (permalink / raw)
To: Jeff King; +Cc: Nicolas Pitre, git
In-Reply-To: <20110107053119.GA23177@sigill.intra.peff.net>
On Fri, Jan 7, 2011 at 16:31, Jeff King <peff@peff.net> wrote:
> On Fri, Jan 07, 2011 at 12:22:07AM -0500, Jeff King wrote:
> the reason I am
> interested in this expanded definition of mirroring is for a few
> features people have been asking for:
>
> 1. restartable clone; any bundle format is easily restartable using
> standard protocols
This is very important to me. I have failed to establish an initial
repo for a few larger projects, some apache projects and opentaps most
recently. It is getting _really_ frustrating.
> 2. avoid too-big clones; I remember the gentoo folks wanting to
> disallow full clones from their actual dev machines and push people
> off to some more static method of pulling. I think not just because
> of restartability, but because of the load on the dev machines
And of course the lack of restartability causes an ongoing increase in
the load on the machines delivering those large clones.
> 3. people on low-bandwidth servers who fork major projects; if I write
> three kernel patches and host a git server, I would really like
> people to only fetch my patches from me and get the rest of it from
> kernel.org
This is not so much of a problem - can already be handled by cloning
your linux-full.git to a private dir, and only publishing your shallow
"personal patches only" clone, or better still, just a tar-ball of
your 3 patches, or email them, or etc.
So I agree with the big issues being restartable large clones and
lowering server loads.
Zen
* Re: Resumable clone/Gittorrent (again)
From: Nguyen Thai Ngoc Duy @ 2011-01-07 6:34 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Git Mailing List, Luke Kenneth Casson Leighton
In-Reply-To: <alpine.LFD.2.00.1101061956470.22191@xanadu.home>
On Fri, Jan 7, 2011 at 10:21 AM, Nicolas Pitre <nico@fluxnic.net> wrote:
> How do you actually define your chain? Given that Git is conceptually
> snapshot based, there is currently no relationship between two blobs
> forming the content for two different versions of the same file. Even
> delta objects are not really part of the Git data model as they are only
> an encoding variation of a given primary object. In fact, we may and
> actually do have deltas where the base object is not from the same
> worktree file as the delta object itself.
>
> The only thing that
> ties this all together is the commit graph. And that graph might have
> multiple forks and merges so any attempt at a linearity representation
> into a chain is rather futile. Therefore it is not clear to me how you
> can define a chain with a beginning and an end, and how this can be
> resumed midway.
There's no need to be linear. OK it's not a chain, but a DAG of
objects that has the same path, in the same structure of commit DAG.
>> We start by fetching all commit contents reachable from a commit tip.
>
> Sure. This is doable today and is called a shallow clone with depth=1.
I meant only commit objects, no trees nor blobs.
>> This is a chain, therefore resumable.
>
> I don't get that part though. How is this resumable? That's the very
> issue we have with a clone.
I assume that all commits are sent in an order such that parent commits
always come after the commit in question. We can make a pack of undeltified
commit objects in such an order. That would make sure we could recover a
continuous commit DAG from the tip if the pack cannot be sent
completely to the client.
We can traverse the commit graph we have and request a pack of
missing commits to grow the commit DAG until we have all commits.
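The ordering property described here (every commit before its parents, so that any truncated prefix is still connected to the tip) can be sketched as a topological walk. This is an illustrative Python model over a plain dictionary of parent links, with the hypothetical name `children_first_order`; it is not how pack-objects orders anything.

```python
from collections import deque


def children_first_order(tip, parents):
    """Order commits reachable from tip so each commit precedes its parents."""
    # Collect everything reachable from the tip.
    reachable, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit in reachable:
            continue
        reachable.add(commit)
        stack.extend(parents.get(commit, []))

    # Count, for each commit, how many of its children are still unsent.
    pending = {commit: 0 for commit in reachable}
    for commit in reachable:
        for parent in parents.get(commit, []):
            pending[parent] += 1

    # Emit a commit only once all of its children have been emitted.
    order, ready = [], deque([tip])
    while ready:
        commit = ready.popleft()
        order.append(commit)
        for parent in parents.get(commit, []):
            pending[parent] -= 1
            if pending[parent] == 0:
                ready.append(parent)
    return order


# A small history with a fork and a merge: D merges B and C, both based on A.
history = {"D": ["B", "C"], "B": ["A"], "C": ["A"], "A": []}
print(children_first_order("D", history))
```

If the transfer stops after any number of commits, the part already received is a connected piece of the graph starting at the tip, and the client can ask for the commits still missing below it.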
> I proposed a solution to that already, which is to use
> git-upload-archive for one of the tip commit since the data stream
> produced by upload-archive (once decompressed) is actually
> deterministic. Once completed, this can be converted into a shallow
> clone on the client side, and can be deepened in smaller steps
> afterwards.
You see, I don't send trees and blobs in this phase. There are three
phases. Phase 1 fetches all commits. Once we have all commits, we can
use them to request packs of trees of the same path. Those packs are
like the commit pack, but deltified. That's phase 2. When we have
enough trees, we can proceed to phase 3: fetching packs of blobs.
>> From there each commit can be
>> examined. Missing trees and blobs will be fetched as chains. Every time
>> a delta is received, we can recreate the new object and verify it (we
>> should have its SHA-1 from its parent trees/commits).
>
> What if the delta is based on an object from another chain? How do you
> determine which chain to ask for to get that base?
Chains should be independent. If a chain is based on another chain and
we have not got its base yet (because the other chain is not
completed), we can fetch the base separately. In theory we need to
fetch a version of all paths once for them to become bases. So it's
like a broken down version of git-upload-archive.
>> Because these chains are quite independent, in a sense that a blob
>> chain is independent from another blob chain (but requires tree
>> chains, of course). We can fetch as many as we want in parallel, once
>> we're done with the commit chain.
>
> But in practice, most of those chains will end up containing objects
> which are duplicate of objects in another chain. How do you tell the
> remote that you want part of a chain because you've got 96% of it in
> another chain already?
Because all clients should have full commit graph (without trees and
blobs) before doing anything, they should be able to specify a rev
list for the chain they need. So if you only need SHA1~76..SHA1~100 of
a path, say so to remote side. SHA-1 must be one of the refs on remote
side, so it can parse the syntax and verify quickly if "SHA1~76" is
reachable/allowed to transfer.
>> The last thing I like about these chains is that the number of chains
>> is reasonable. It won't increase too fast over time (as compared to
>> the number of commits). As such it maps well to BitTorrent's "pieces".
>
> My problem right now is that I don't see how this maps well to Git.
Git sees a repository as history of snapshots. This way I see it as a
bunch of "git log -- path", not that bad.
--
Duy
* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Jeff King @ 2011-01-07 5:31 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Zenaan Harkness, git
In-Reply-To: <20110107052207.GA23128@sigill.intra.peff.net>
On Fri, Jan 07, 2011 at 12:22:07AM -0500, Jeff King wrote:
> refs/mirrors/bundle/torrent
> refs/mirrors/bundle/http
> refs/mirrors/fetch/git
> refs/mirrors/fetch/http
>
> and the client can decide its preferred way of getting data: a bundle by
> http or by torrent, or connecting directly to some other git repository
> by git protocol or http. It would fetch the appropriate ref, which would
> contain a blob in some method-specific format. For torrent, it would be
> a torrent file. For the others, probably a newline-delimited set of
> URLs. You could also provide a torrent-magnet ref if you didn't even
> want to distribute the torrent file.
>
> And no matter what the method used, at the end you have some set of refs
> and objects, and you can re-try your (now much smaller) fetch.
And I think it is probably obvious to you, Nicolas, since these are
problems you have been thinking about for some time, but the reason I am
interested in this expanded definition of mirroring is for a few
features people have been asking for:
1. restartable clone; any bundle format is easily restartable using
standard protocols
2. avoid too-big clones; I remember the gentoo folks wanting to
disallow full clones from their actual dev machines and push people
off to some more static method of pulling. I think not just because
of restartability, but because of the load on the dev machines
3. people on low-bandwidth servers who fork major projects; if I write
three kernel patches and host a git server, I would really like
people to only fetch my patches from me and get the rest of it from
kernel.org
-Peff
^ permalink raw reply
* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Jeff King @ 2011-01-07 5:22 UTC (permalink / raw)
To: Nicolas Pitre; +Cc: Zenaan Harkness, git
In-Reply-To: <alpine.LFD.2.00.1101062221480.22191@xanadu.home>
On Thu, Jan 06, 2011 at 11:33:51PM -0500, Nicolas Pitre wrote:
> Here's what I suggest:
>
> cd my_project
> BUNDLENAME=my_project_$(date "+%s").gitbundle
> git bundle create $BUNDLENAME --all
> maketorrent-console your_favorite_tracker $BUNDLENAME
>
> Then start seeding that bundle, and upload $BUNDLENAME.torrent as
> bundle.torrent inside my_project.git on your server.
>
> Now... Git clients could be improved to first check for the availability
> of the file "bundle.torrent" on the remote side, either directly in
> my_project.git, or through some Git protocol extension. Or even better,
> the torrent hash could be stored in a Git ref, such as
> refs/bittorrent/bundle and the client could use that to retrieve the
> bundle.torrent file through some other means.
I really like the simplicity of this idea. It could even be generalized
to handle more traditional mirrors, too. Just slice up the refs/mirrors
namespace to provide different methods of fetching some initial set of
objects. For example, upload-pack might advertise (in addition to the
usual refs):
refs/mirrors/bundle/torrent
refs/mirrors/bundle/http
refs/mirrors/fetch/git
refs/mirrors/fetch/http
and the client can decide its preferred way of getting data: a bundle by
http or by torrent, or connecting directly to some other git repository
by git protocol or http. It would fetch the appropriate ref, which would
contain a blob in some method-specific format. For torrent, it would be
a torrent file. For the others, probably a newline-delimited set of
URLs. You could also provide a torrent-magnet ref if you didn't even
want to distribute the torrent file.
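
A minimal sketch of how such a ref could be published and read with stock
git commands. Nothing here is an existing git feature: the refs/mirrors
layout is only the proposal above, and the function names, the example
URLs, and the "one URL per line" blob format are assumptions made for
illustration.

```shell
#!/bin/sh
# Server side: store a newline-delimited URL list as a blob and point a
# ref in the proposed refs/mirrors namespace at it. (Hypothetical helper.)
publish_mirror_list() {
    repo=$1; ref=$2; shift 2
    blob=$(printf '%s\n' "$@" | git -C "$repo" hash-object -w --stdin) || return 1
    git -C "$repo" update-ref "$ref" "$blob"
}

# Client side: fetch only that ref and print the list stored in the blob.
# (Hypothetical helper; FETCH_HEAD names the blob that was just fetched.)
read_mirror_list() {
    repo=$1; url=$2; ref=$3
    git -C "$repo" fetch -q "$url" "$ref" || return 1
    git -C "$repo" cat-file blob FETCH_HEAD
}
```

For example, "publish_mirror_list server.git refs/mirrors/fetch/http URL..."
on the master, then "read_mirror_list . <master-url> refs/mirrors/fetch/http"
in the client repository. A real implementation would presumably live in
upload-pack and the fetch machinery rather than in a wrapper script.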
And no matter what the method used, at the end you have some set of refs
and objects, and you can re-try your (now much smaller) fetch. And there
are a few obvious optimizations:
1. When you get the initial set of refs from the master, remember
them. If the mirror actually satisfies everything you were going to
fetch, then you don't even have to reconnect for the final fetch.
2. You can optionally cache the mirror list, and go straight to a
mirror for future fetches instead of checking the master. This is
only a reasonable thing to do if the mirrors are kept up to date,
and provide good incremental access (i.e., only actual git-protocol
mirrors, not torrent or http file).
-Peff
^ permalink raw reply
* Re: Resumable clone/Gittorrent (again) - stable packs?
From: Nicolas Pitre @ 2011-01-07 4:33 UTC (permalink / raw)
To: Zenaan Harkness; +Cc: git
In-Reply-To: <AANLkTikgzqoG2cymNJ0NN03RsTRJi22R9M+0LFJ8U2yB@mail.gmail.com>
On Fri, 7 Jan 2011, Zenaan Harkness wrote:
> On Fri, Jan 7, 2011 at 08:09, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Thu, 6 Jan 2011, Zenaan Harkness wrote:
> >
> >> Bittorrent requires some stability around torrent files.
> >>
> >> Can packs be generated deterministically?
> >
> > They _could_, but we do _not_ want to do that.
> >
> > The only thing which is stable in Git is the canonical representation of
> > objects, and the objects they depend on, expressed by their SHA1
> > signature. Any BitTorrent-alike design for Git must be based on that
> > property and not the packed representation of those objects which is not
> > meant to be stable.
> >
> > If you don't want to design anything and simply reuse current BitTorrent
> > codebase then simply create a Git bundle from some release version and
> > seed that bundle for a sufficiently long period to be worth it. Then
> > falling back to git fetch in order to bring the repo up to date with the
> > very latest commits should be small and quick. When that clone gets too
> > big then it's time to start seeding another more up-to-date bundle.
>
> Thanks guys for the explanations.
>
> So, we don't _want_ to generate packs deterministically.
> BUT, we _can_ reliably unpack a pack (duh).
Of course.
> So if my configured "canonical upstream" decides on a particular
> compression etc, I (my git client) doesn't care what has been chosen
> by my upstream.
Indeed. This is like saying: I'm sending you the value 52, but I chose
to use the representation "24 + 28", while someone else might decide to
encode that value as "13 * 4" instead. You still are able to decode it
to the same result in both cases.
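
The same point as a small sketch, assuming a SHA-1 repository and the
sha1sum and gzip tools. A blob's name is the SHA-1 of its canonical form
("blob <size>", a NUL byte, then the content); gzip at two compression
levels merely stands in for "two encodings of the same value" and is not
how packs are actually written. The blob_id helper is hypothetical.

```shell
#!/bin/sh
# Compute a blob's object name from its canonical representation.
blob_id() {
    # $1: path to a file holding the blob content
    size=$(wc -c < "$1" | tr -d ' ')
    { printf 'blob %s\000' "$size"; cat "$1"; } | sha1sum | cut -d' ' -f1
}

tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/orig"

# Two different encodings of the same content...
gzip -1 -c "$tmp/orig" > "$tmp/fast.gz"
gzip -9 -c "$tmp/orig" > "$tmp/best.gz"

# ...decode to content with the same object name.
gzip -dc "$tmp/fast.gz" > "$tmp/a"
gzip -dc "$tmp/best.gz" > "$tmp/b"
id_a=$(blob_id "$tmp/a")
id_b=$(blob_id "$tmp/b")
echo "$id_a"
echo "$id_b"
```

Both lines printed are the name "git hash-object" reports for the same
content, whatever happened to the bytes on the wire.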
> What is important for torrent-able packs though is stability over some
> time period, no matter what the format.
Hence my suggestion to simply seed a Git bundle over BitTorrent. Bundles
are files which are designed to be used by completely ad hoc transports
and you can fetch from them just like if they were a remote repository.
> There's been much talk of caching, invalidating of caches, overlapping
> torrent-packs etc.
And in my humble opinion this is just all crap. All those suggestions
are fragile, create administrative issues, eat up server resources, and
they all are suboptimal in the end. No one ever implemented a working
prototype so far either.
We don't want caches. Fundamentally, we do not need any cache. Caches
are a pain to administrate on a busy server anyway as they eat disk
space and they also represent a much bigger security risk compared to a
read-only operation.
Furthermore, a cache is good only for the common case that everyone
wants. But with Git, you cannot presume that everyone is at the same
version locally. So either you do a custom transfer for each client to
minimize transfers, in which case caching the result might not benefit
that many people, or you make the cached data bigger so as to cover more
cases while making the transfer suboptimal.
Finally, we do have a cache already, and that's the existing packs
themselves. During a clone, the vast majority of the transferred data
is streamed without further processing straight off those existing packs
as we try to reuse as much data as possible from those packs so as not to
recompute/recompress that data all the time.
> In every case, for torrents to work, the P2P'd files must have some
> stability over some time period.
> (If this assumption is incorrect, please clarify, not counting
> every-file-is-a-torrent and every-commit-is-a-torrent.)
>
> So, torrentable options:
> - torrent per commit
> - torrent per pack
> - torrent per torrent-archive - new file format
>
> Torrent per commit - too small, too many torrents; we need larger
> p2p-able sizes in general.
>
> Torrent per pack - packs non-deterministically created, both between
> hosts and even intra-host (libz upgrade, nr_threads change, git pack
> algorithm optimization).
>
> A new torrent format, if "close enough" to current git pack
> performance (cpu load, threadability, size) is potential for new
> version of git pack file format - we don't want to store two sets of
> pack files on disk, if sensible to not do so; unlikely to happen - I
> can't conceive that a torrentable format would be anything but worse
> than pack files and therefore would be rejected from git master.
>
> Can we relax the perceived requirement to deterministically create
> pack files?
> Well, over what time period are pack files stable in a particular git?
> Over what time period do we require stable files for torrenting?
>
> Can we simply configure our local git to keep specified pack files for
> specified time period?
> And use those for torrent-packs?
> Perhaps the torrent file could have a UseBy date?
Again, this is just too much complexity for so little gain.
Here's what I suggest:
cd my_project
BUNDLENAME=my_project_$(date "+%s").gitbundle
git bundle create $BUNDLENAME --all
maketorrent-console your_favorite_tracker $BUNDLENAME
Then start seeding that bundle, and upload $BUNDLENAME.torrent as
bundle.torrent inside my_project.git on your server.
Now... Git clients could be improved to first check for the availability
of the file "bundle.torrent" on the remote side, either directly in
my_project.git, or through some Git protocol extension. Or even better,
the torrent hash could be stored in a Git ref, such as
refs/bittorrent/bundle and the client could use that to retrieve the
bundle.torrent file through some other means.
When the bundle.torrent file is retrieved, then just pull the torrent
content (and seed it some more to be nice). Then simply run "git clone"
using the original arguments but with the obtained bundle instead of the
original URL. Then replace the remote URL in .git/config with the
actual remote URL instead of the bundle file path. And finally perform
a "git pull" to bring the new commits that were added to the remote
repository since the bundle was created. That final pull will be small
and quick.
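
The client-side steps above can be sketched with stock git commands,
assuming the bundle has already been downloaded (over BitTorrent or any
other transport). The function name and its arguments are hypothetical,
and "git remote set-url" is used in place of editing .git/config by hand.

```shell
#!/bin/sh
# Hypothetical helper: clone from a local bundle, then switch to the
# real remote and catch up with it.
clone_via_bundle() {
    bundle=$1   # bundle file already fetched
    url=$2      # the actual remote URL
    dir=$3      # target directory

    # 1. Clone from the bundle as if it were the remote repository.
    git clone -q "$bundle" "$dir" || return 1
    # 2. Point "origin" at the actual remote instead of the bundle path.
    git -C "$dir" remote set-url origin "$url" || return 1
    # 3. Bring in the commits added since the bundle was created.
    git -C "$dir" pull -q --ff-only
}
```

Usage would look like "clone_via_bundle my_project.gitbundle
git://example.com/my_project.git my_project", where the URL is a
placeholder.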
After a while, that final pull will get bigger as the difference between
the bundled version and the current tip in the remote repository will
grow. So every so often, say 3 months, it might be a good idea to
create a new bundle so that the latest commits are included into it in
order to make that final pull small and quick again.
Isn't that sufficient?
Nicolas
^ permalink raw reply
* Re: Resumable clone/Gittorrent (again)
From: Nicolas Pitre @ 2011-01-07 3:21 UTC (permalink / raw)
To: Nguyen Thai Ngoc Duy; +Cc: Git Mailing List, Luke Kenneth Casson Leighton
In-Reply-To: <AANLkTinUV9Z_w85Gz13J+bm8xqnxJ9jBJXJm9bn5Y2ec@mail.gmail.com>
On Wed, 5 Jan 2011, Nguyen Thai Ngoc Duy wrote:
> Hi,
>
> I've been analyzing bittorrent protocol and come up with this. The
> last idea about a similar thing [1], gittorrent, was given by Nicolas.
> This keeps close to that idea (i.e. the transfer protocol must be around git
> objects, not file chunks) with a bit of a difference.
>
> The idea is to transfer a chain of objects (trees or blobs), including
> base object and delta chain. Objects are chained according to
> worktree layout, e.g. all objects of path/to/any/blob will form a
> chain, from a commit tip down to the root commits. Chains can have
> gaps, and don't need to start from commit tip. The transfer is
> resumable because if a delta chain is corrupt at some point, we can
> just request another chain from where it stops. Base object is
> obviously resumable.
How do you actually define your chain? Given that Git is conceptually
snapshot based, there is currently no relationship between two blobs
forming the content for two different versions of the same file. Even
delta objects are not really part of the Git data model as they are only
an encoding variation of a given primary object. In fact, we may and
actually do have deltas where the base object is not from the same
worktree file as the delta object itself.
The only thing that ties this all together is the commit graph. And
that graph might have multiple forks and merges so any attempt at a
linear representation into a chain is rather futile. Therefore it is
not clear to me how you can define a chain with a beginning and an end,
and how this can be resumed midway.
> We start by fetching all commit contents reachable from a commit tip.
Sure. This is doable today and is called a shallow clone with depth=1.
> This is a chain, therefore resumable.
I don't get that part though. How is this resumable? That's the very
issue we have with a clone.
I proposed a solution to that already, which is to use
git-upload-archive for one of the tip commits since the data stream
produced by upload-archive (once decompressed) is actually
deterministic. Once completed, this can be converted into a shallow
clone on the client side, and can be deepened in smaller steps
afterwards.
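
Only the "deepen in smaller steps" half of this can be shown with git as
it stands; the upload-archive conversion is a proposal, so a plain
"git clone --depth=1" stands in for it below (and is itself not
resumable). The function name and the depth values are arbitrary choices
for illustration.

```shell
#!/bin/sh
# Hypothetical helper: start from the tip commit only, then deepen the
# history in steps. Each step is a separate, smaller transfer, so a
# failure costs only the step in progress.
shallow_then_deepen() {
    url=$1; dir=$2
    git clone -q --depth=1 "$url" "$dir" || return 1
    for depth in 10 100; do
        git -C "$dir" fetch -q --depth="$depth" || return 1
    done
    # Remove the shallow boundary if one is still left.
    if [ -f "$dir/.git/shallow" ]; then
        git -C "$dir" fetch -q --unshallow || return 1
    fi
}
```

Note that --depth is ignored for plain local paths, so trying this
against a local repository needs a file:// URL.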
> From there each commit can be
> examined. Missing trees and blobs will be fetched as chains. Everytime
> a delta is received, we can recreate the new object and verify it (we
> should have its SHA-1 from its parent trees/commits).
What if the delta is based on an object from another chain? How do you
determine which chain to ask for to get that base?
> Because these chains are quite independent, in a sense that a blob
> chain is independent from another blob chain (but requires tree
> chains, of course). We can fetch as many as we want in parallel, once
> we're done with the commit chain.
But in practice, most of those chains will end up containing objects
which are duplicates of objects in another chain. How do you tell the
remote that you want part of a chain because you've got 96% of it in
another chain already?
> The last thing I like about these chains is that the number of chains
> is reasonable. It won't increase too fast over time (as compared to
> the number of commits). As such it maps well to BitTorrent's "pieces".
My problem right now is that I don't see how this maps well to Git.
Nicolas
^ permalink raw reply
* Re: [PATCH] Mark gitk script executable
From: Jonathan Nieder @ 2011-01-07 3:01 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Anders Kaseorg, git, Paul Mackerras
In-Reply-To: <7vlj2x8mr4.fsf@alter.siamese.dyndns.org>
Junio C Hamano wrote:
> Anders Kaseorg <andersk@MIT.EDU> writes:
>> The executable bit on gitk-git/gitk was lost (accidentally it seems) by
>> commit 62ba5143ec2ab9d4083669b1b1679355e7639cd5. Put it back, so that
>> gitk can be run directly from a git.git checkout.
>>
>> Note that the script is already executable in gitk.git, just not in
>> git.git.
>
> It did not lose the bit by accident but 62ba5143 pretty much was a
> deliberate fix. "gitk" is a source file, and its build product,
> gitk-wish, is what is eventually installed with executable bit on.
How does this case differ from other executable source files like
git-am.sh?
^ permalink raw reply