Git development
* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Sean @ 2005-05-02 19:02 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Matt Mackall, Morten Welinder, Linus Torvalds, linux-kernel, git
In-Reply-To: <427650E7.2000802@tmr.com>

On Mon, May 2, 2005 12:10 pm, Bill Davidsen said:

> Now look at pulling 41MB over a T1 link. All of a sudden I care bigtime!
> I want very much to use my bandwidth for other things, I don't want 41MB
> added to my backup, etc. Disk space is cheap, but unless you ignore
> backups and have an OC3 or so, these numbers are large enough to be
> irritating. Not a huge issue, just one of those "piss me off every time
> I do it" things.

That 41MB, or let's say 200MB, is spread over several months between
releases.  Pulling once a day from the git public repository makes this
barely noticeable.  In the future there may be optimized protocols to
handle this more efficiently.

You bring up a good point about backups though.  Eventually it might be
nice to have a utility that exports/imports a git repository in a flat
file using deltas rather than snapshots.   Such an export format would
make backups and tarballs cheaper.

Sean




* Re: [PATCH] add git.spec and adapt Makefile for RPM build
From: Paul Jakma @ 2005-05-02 19:06 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Chris Wright, Kay Sievers, git, Linus Torvalds
In-Reply-To: <200505021858.j42Iw4M1029427@laptop11.inf.utfsm.cl>

On Mon, 2 May 2005, Horst von Brand wrote:

> And yes, I've seen quite a few packages autogenerating the spec 
> file. As a result, you /can't/ build the package from pristine 
> sources, you have to unpack and configure to get enough for 
> building. For me that just isn't acceptable, as it completely 
> misses the point of RPM.

I think maybe you're missing the point of what is sometimes known as 
a 'make dist' target. (eg in autoconf type build systems).

> (You can go "rpmbuild -ta whatever-2.3.1.tar.bz2" if the tarball is set up
> correctly, your idea prevents that).

Then the tarball wasn't of distributable (ie end-user buildable) 
source.

regards,
-- 
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
"Now this is a totally brain damaged algorithm.  Gag me with a smurfette."
 		-- P. Buhr, Computer Science 354


* Re: [PATCH] add git.spec and adapt Makefile for RPM build
From: Chris Wright @ 2005-05-02 19:08 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Chris Wright, Kay Sievers, git, Linus Torvalds
In-Reply-To: <200505021858.j42Iw4M1029427@laptop11.inf.utfsm.cl>

* Horst von Brand (vonbrand@inf.utfsm.cl) wrote:
> Chris Wright <chrisw@osdl.org> said:
> >                   It simply means a structured release process.  IOW,
> > the git.spec would be generated for a release tarball.
> 
> Come on, you have to fix the spec file for the changelog and version by
> hand anyway, autoconfiscating it doesn't help one iota there.

That's the point, you don't _have_ to do that.

> And yes, I've seen quite a few packages autogenerating the spec file. As a
> result, you /can't/ build the package from pristine sources, you have to
> unpack and configure to get enough for building. For me that just isn't
> acceptable, as it completely misses the point of RPM.
> 
> (You can go "rpmbuild -ta whatever-2.3.1.tar.bz2" if the tarball is set up
> correctly, your idea prevents that).

You just place the generated spec file in a release tarball.  IOW, your
'release' Makefile target depends on foo.spec, and creates a clean release
tarball with all you need to do an -ta build.
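
As a rough illustration of that release step, here is a minimal Python
sketch (not the Makefile rule under discussion). It renders a spec file
from a template and packs it into the release tarball next to the
sources, so the tarball carries its own spec. The template fields and the
helper names render_spec, make_release_tarball and list_tarball are
assumptions made up for this example.

```python
import io
import tarfile

# Hypothetical, heavily trimmed spec template; a real git.spec has more fields.
SPEC_TEMPLATE = """\
Name: {name}
Version: {version}
Release: 1
Summary: placeholder summary
License: GPL
Source: {name}-{version}.tar.gz

%description
Placeholder description.
"""


def render_spec(name, version):
    """Fill in the template, so nobody edits the version by hand."""
    return SPEC_TEMPLATE.format(name=name, version=version)


def make_release_tarball(name, version, sources):
    """Return the bytes of a .tar.gz holding the sources plus the generated spec.

    `sources` maps relative paths to file contents (bytes).
    """
    prefix = f"{name}-{version}"
    files = dict(sources)
    files[f"{name}.spec"] = render_spec(name, version).encode()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for path, data in sorted(files.items()):
            info = tarfile.TarInfo(f"{prefix}/{path}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def list_tarball(data):
    """List member names of a .tar.gz given as bytes."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return tar.getnames()
```

The point of the design is that the generated spec is an ordinary member
of the tarball, so nothing has to be unpacked and configured before
building.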

thanks,
-chris


* Re: [PATCH] add git.spec and adapt Makefile for RPM build
From: Paul Jakma @ 2005-05-02 19:13 UTC (permalink / raw)
  To: Horst von Brand; +Cc: git
In-Reply-To: <Pine.LNX.4.62.0505022005200.14200@sheen.jakma.org>

On Mon, 2 May 2005, Paul Jakma wrote:

> I think maybe you're missing the point of what is sometimes known as a 'make 
> dist' target. (eg in autoconf type build systems).

Apologies: /Or/ the project which provided such a tarball missed the 
point.

regards,
-- 
Paul Jakma	paul@clubi.ie	paul@jakma.org	Key ID: 64A2FF6A
Fortune:
"MacDonald has the gift of compressing the largest amount of words into
the smallest amount of thoughts."
 		-- Winston Churchill


* Re: on when to checksum
From: Tom Lord @ 2005-05-02 19:21 UTC (permalink / raw)
  To: torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.58.0504201601130.6467@ppc970.osdl.org>


  The thing is, I don't "trickle" things in. That would be horribly 
  inefficient for me. So I go over the patches, make a mbox, and do them all 
  in one go. And then they need to happen _fast_. If it takes 20 minutes, I 
  go away for coffee or something, and then if something didn't apply 
  half-way through, I will have lost my "context".

  That's why I want things instant. Not because I have huge daily throughput 
  issues, but I have huge _latency_ issues. 

I'm curious about what is the value of the "batch" nature of that
process?

Presumably most patches apply cleanly and most are orthogonal (order
independent).   I'm sure that there are frequently interesting exceptions
but am I generally right about "most" here?

So, if I understand, you review each change before stuffing it in a
mailbox, then you apply all the patches in that mailbox in batch.
In the majority of cases, the buffering of changes in the mailbox
adds nothing.

Why isn't that more automated: when you approve a change, it could be
applied at once, in the background.  If conflictless, it can be committed,
tested, whatever.  If conflicting, *then* the change can be buffered
up for you to look at.   Explicit declarations from programmers or 
text-based computations about dependencies among the patches can help
improve the queue management in more complicated cases.

In other words, a more asynchronous process might save you time *and*
pay off by reserving more of your attention for areas where it's 
really needed.

-t



* Re: More problems...
From: Petr Baudis @ 2005-05-02 19:33 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Russell King, Junio C Hamano, Linus Torvalds, Ryan Anderson, git
In-Reply-To: <Pine.LNX.4.60.0504292254430.25700@hermes-1.csi.cam.ac.uk>

Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter
where Anton Altaparmakov <aia21@cam.ac.uk> told me that...
> There should definitely be an option to either enable or disable this as 
> there are legitimate cases for not wanting hard links or indeed using 
> file systems which do not support them.

Are there legitimate cases for not wanting hard links when you are able
to create them? (Same filesystem, filesystem supports them...)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: More problems...
From: Dave Kleikamp @ 2005-05-02 19:44 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Anton Altaparmakov, Russell King, Junio C Hamano, Linus Torvalds,
	Ryan Anderson, git
In-Reply-To: <20050502193327.GB20818@pasky.ji.cz>

On Mon, 2005-05-02 at 21:33 +0200, Petr Baudis wrote:
> Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter
> where Anton Altaparmakov <aia21@cam.ac.uk> told me that...
> > There should definitely be an option to either enable or disable this as 
> > there are legitimate cases for not wanting hard links or indeed using 
> > file systems which do not support them.
> 
> Are there legitimate cases for not wanting hard links when you are able
> to create them? (Same filesystem, filesystem supports them...)

Cloning a different user's repo?
-- 
David Kleikamp
IBM Linux Technology Center



* Re: More problems...
From: Thomas Glanzmann @ 2005-05-02 19:51 UTC (permalink / raw)
  To: git
In-Reply-To: <1115063079.8041.5.camel@kleikamp>

Hello,

> Cloning a different user's repo?

it isn't my quota. :-) So that's a feature. :-)

	Thomas


* Re: on when to checksum
From: Linus Torvalds @ 2005-05-02 19:57 UTC (permalink / raw)
  To: Tom Lord; +Cc: git
In-Reply-To: <200505021921.MAA26977@emf.net>



On Mon, 2 May 2005, Tom Lord wrote:
> 
> I'm curious about what is the value of the "batch" nature of that
> process?

My time.

I don't know about other people, but I don't multitask. I do one thing, 
and that's it. I don't move my mouse around. I sit in my mail reader, and 
I read email. I don't read one email, switch to another window, apply it, 
switch back, read the next email etc etc.

In fact, I claim that anybody who works that way is going to have an IQ of 
about 15 points lower than somebody who batches things up. Just because 
you end up losing your context, and that effectively makes you stupid. 

Concentration is a wonderful thing, but it _requires_ that you do things 
in a concentrated manner.

> So, if I understand, you review each change before stuffing it in a
> mailbox, then you apply all the patches in that mailbox in batch.
> In the majority of cases, the buffering of changes in the mailbox
> adds nothing.

I read email, and while reading email I save the interesting ones off to
another mbox (I call mine "doit"). They get saved off for "later perusal".

I do a first-order review at that stage, and in fact, 95% of the time, 
what goes into the "doit" folder _will_ get applied. Not 100%, though, 
exactly because at this stage I just read email and work in a mail-reader: 
I don't usually even look at the actual kernel sources that a patch 
involves. In particular, sometimes it turns out that the patch wasn't 
against my version at all, but against a -mm tree, and I just don't even 
worry about technical details at that stage.

Stage #2 is going through the "doit" folder at some later date (maybe a 
couple of times a day), and going through it one more time. Maybe not that 
much more "carefully", but with a different intent - now I actually check 
sign-offs, add my own, and check out the actual problems in the source 
tree if needed.

Stage #3 is actually applying it.

_Each_ stage culls out bad things.

And I _really_ don't bounce between stages.

> In other words, a more asynchronous process might save you time *and*
> pay off by reserving more of your attention for areas where it's 
> really needed.

It's not asynchronous. It's batched in different stages so that I can 
work better. And latency matters.

		Linus


* Re: questions about cg-update, cg-pull, and cg-clone.
From: Petr Baudis @ 2005-05-02 19:58 UTC (permalink / raw)
  To: Zack Brown; +Cc: Git Mailing List
In-Reply-To: <20050430005322.GA5408@tumblerings.org>

Dear diary, on Sat, Apr 30, 2005 at 02:53:22AM CEST, I got a letter
where Zack Brown <zbrown@tumblerings.org> told me that...
> 'cg-update branch-name' grabs any new changes from the upstream repository and
> merges them into my local repository. If I've been editing files in my local
> repository, the update attempts to merge the changes cleanly.

Yes.

> Now, if the update is clean, a cg-commit is invoked automatically, and if the
> update is not clean, I then have to resolve any conflicts and give the cg-commit
> command by hand. But: what is the significance of either of these cg-commit
> commands? Why should I have to write a changelog entry recording this merge? All

You might want to write some special notes regarding the merge, e.g.
when you want to describe some non-trivial conflict resolution, or even
give a short blurb of the changes you are merging.

If you don't know what to say, just press Ctrl-D. The first line of the
commit always says "Merge with what_you_are_merging_with".

> I'm doing is updating my tree to be current. Why should I have to 'commit' that
> update?

If you are only updating your tree to be current, you don't have to
commit, and in fact you don't commit (you do so-called "fast-forward
merge", which will just update your HEAD pointer to point at the newer
commit). You commit only when you were merging stuff (so-called "tree
merge"; well, that's at least what I call it to differentiate it from the
fast-forward merge). That means you have some local commits over there -
I can't just update your tree to be current, sorry. That would lose your
commit. I have to merge the changes into your tree through a merge
commit.

> Now I look at 'cg-pull'. What does this do? The readme says something about
> printing two ids, and being useful for diffs. But can't I do a diff after a
> cg-update and get the same result? I'm very confused about cg-pull right now.

cg-pull does the first part of cg-update. It is concerned with fetching
the stuff from the remote repository to the local one. cg-merge then
does the second part, merging the stuff to your local tree (doing either
fast-forward or tree merge).
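
The choice between the two kinds of merge can be sketched with a toy
commit graph. This is an illustrative Python model only, not Cogito's
actual code; the `parents` mapping and the function names are
assumptions made up for the example.

```python
def ancestors(parents, commit):
    """Return `commit` plus every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def plan_update(parents, local_head, remote_head):
    """Decide what merging `remote_head` into `local_head` requires."""
    if remote_head in ancestors(parents, local_head):
        # We already contain everything the remote has.
        return "up-to-date"
    if local_head in ancestors(parents, remote_head):
        # No local commits of our own: just move HEAD forward.
        return "fast-forward"
    # Both sides have commits the other lacks: a merge commit is needed.
    return "tree-merge"
```

With history a <- b <- c upstream, a local HEAD at b gives
"fast-forward"; a local commit d made on top of a gives "tree-merge",
because moving HEAD straight to c would lose d.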

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Sam Ravnborg @ 2005-05-02 20:54 UTC (permalink / raw)
  To: Linus Torvalds, Bill Davidsen, Andrea Arcangeli, Matt Mackall,
	linux-kernel, git
In-Reply-To: <20050502171802.GA28045@nevyn.them.org>

On Mon, May 02, 2005 at 01:18:02PM -0400, Daniel Jacobowitz wrote:
> > 	#!/bin/sh
> > 	exec perl perlscript.pl "$@"
> > 
> > instead.
> 
> Do you know any vaguely Unix-like system where #!/usr/bin/env does not
> work?  I don't; I've used it on Solaris, HP-UX, OSF/1...

I had to pull out a call to env from kbuild due to strange errors in
some mandrake? based system.
I never tracked it down fully at that time, I just realised that two
different programs named env were present, and the less common one made
the linux kernel build fail. env was not called with any path in that
example so that may have cured it.

	Sam


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Tom Lord @ 2005-05-02 21:06 UTC (permalink / raw)
  To: vonbrand; +Cc: seanlkml, git
In-Reply-To: <200504292145.j3TLjoTC014157@laptop11.inf.utfsm.cl>


   From: Horst von Brand <vonbrand@inf.utfsm.cl>

   Now pray tell how Joe signing one, two, three, or none of the things he is
   juggling makes any difference here.

We are talking about these signable things:

	1) Joe's assertions about the ancestry of his 
	   change.

	2) A "patch" -- a statement of the well-defined 
	   change Joe is making.

	3) A full tree that Joe believes contains exactly
	   his change, compared to the ancestry, in some
	   well-defined way.


Signing (1) is mandatory if history-sensitive merges are to be
possible.

If everything works perfectly, then signing (1) and (2) is
mathematically equivalent to signing (1) and (3) and both are
equivalent to signing (1), (2), and (3).

Things don't work perfectly.

A document containing (1) and (2) is, almost by definition, a "human
scale" document.   It represents a real-world unit of human labor.
It summarizes the product of that labor in a human-readable, compact
form.   In most cases, a person could study a (1),(2) document in
great detail, byte for byte, relying on very few software tools.

By contrast, a document containing (1) and (3), for a project as large
as the kernel, can not be described as a "human scale" document: it
represents the product of vast amounts of human labor -- exceeding a 
single human's capacity to fully comprehend.   There are just too many
bits there for a full tree to specifically represent a single human's 
detailed *intentions* except indirectly.

You are countering, essentially, that programmers are afforded plenty
of tools for comparing two trees.  Therefore, as I understand you, any
programmer with working tree-comparison tools can robustly commit a
(1),(3) pair accurately.  Similarly, any programmer can robustly
receive a (1),(3) pair and study it as if it were a (1),(2) pair -- so
where's the problem?

The problem is in lots of places but perhaps the clearest summary can
be presented as a communications problem.  Supposing that work is done
entirely with (1),(3) pairs:


	Alice and Bob both have a copy of the tree ORIG.

	Alice makes changes and now also has tree MOD_alice.

        Alice examines her changes locally.  Her
	version of the changes, summarized as a patch (aka changeset)
        is: CHANGES_alice.

	Alice signs the pair <ORIG, MOD_alice> (a (1),(3) pair) and
	sends it to Bob.

	Bob faithfully retrieves MOD_alice.

	Bob compares MOD_alice to ORIG, using robust tools 
	he has at hand.  The patch (aka changeset) which 
	summarizes the differences in his view is: CHANGES_bob.


Nothing in this scenario gives Bob a way to prove that CHANGES_bob ==
CHANGES_alice.  Bob can be as certain as we are content with that he
wound up with the same _tree_ that Alice did, but he and Alice will
have to go out of their way if they want to check their communication
in such a way that Bob can be confident Alice has checked that she
said what she meant to say.

More bluntly, given just a (1),(3) pair, Bob is extending his vulnerability
to include a reliance on Alice's patch-computing tools.   If Alice were
known to be signing a (1),(2) pair which she had reviewed in detail,
then Bob's vulnerability stays at just his local patch-handling tools
and his general trust of Alice.

In general, it's the potential specificity of a (1),(2) signature, rather
than a (1),(3), that makes (1),(2) signing the more robust idea (from
the robustness perspective).
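
A small Python sketch of the Alice and Bob scenario may help. Both sides
hold byte-identical trees, yet their diff tools are configured
differently and so produce different patch texts. The file contents and
the context settings are assumptions chosen for illustration, and
difflib merely stands in for whatever patch-computing tools each side
really uses.

```python
import difflib
import hashlib


def make_patch(orig, mod, context):
    """Compute a unified diff with the given number of context lines."""
    return "".join(difflib.unified_diff(orig, mod, "ORIG", "MOD", n=context))


def digest(text):
    """SHA-1 of a text, standing in for 'the thing that gets signed'."""
    return hashlib.sha1(text.encode()).hexdigest()


ORIG = ["a\n", "b\n", "c\n", "d\n", "e\n"]
MOD_alice = ["a\n", "b\n", "C\n", "d\n", "e\n"]

# Alice's tool uses three lines of context, Bob's uses none.
CHANGES_alice = make_patch(ORIG, MOD_alice, context=3)
CHANGES_bob = make_patch(ORIG, MOD_alice, context=0)

# Bob can verify that he holds the same tree as Alice ...
same_tree = digest("".join(MOD_alice)) == digest("".join(list(MOD_alice)))
# ... but a signature over the tree says nothing about the patch text.
same_patch = digest(CHANGES_alice) == digest(CHANGES_bob)
```

Here same_tree is true while same_patch is false: a signature over the
tree pair cannot confirm that Bob is reading the change as Alice
reviewed it.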


-t



* Re: More problems...
From: Petr Baudis @ 2005-05-02 21:13 UTC (permalink / raw)
  To: Ryan Anderson; +Cc: Russell King, git
In-Reply-To: <20050429195055.GE1233@mythryan2.michonline.com>

Dear diary, on Fri, Apr 29, 2005 at 09:50:55PM CEST, I got a letter
where Ryan Anderson <ryan@michonline.com> told me that...
> Why not just use "rsync" for both remote and local synchronization, and
> provide a "relink" command to scan two .git/objects/ repositories and
> hardlink matching files together?

No. This completely misses the point, which is to avoid useless I/O when
doing this local stuff; also, it saves disk space to a degree, but it is
wildly fluctuating.

I like Junio's local-pull solution much more (from the conceptual
standpoint; I didn't look at the code yet).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Kyle Moffett @ 2005-05-02 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Bill Davidsen, Andrea Arcangeli, Matt Mackall, linux-kernel, git
In-Reply-To: <Pine.LNX.4.58.0505020921080.3594@ppc970.osdl.org>

On May 2, 2005, at 12:31:06, Linus Torvalds wrote:
> That said, I think the /usr/bin/env trick is stupid too. It may be  
> more
> portable for various Linux distributions, but if you want _true_
> portability, you use /bin/sh, and you do something like
>
>     #!/bin/sh
>     exec perl perlscript.pl "$@"

Oooh, I can one-up that hack with this evil from perlrun(1):

#!/bin/sh -- # -*- perl -*- -W -T
eval 'exec perl -wS $0 ${1+"$@"}'
     if 0;
# PERL SCRIPT HERE

Description:
Perl ignores the eval($string) because of the "if 0" in the  
statement. The
shell sees the statement end at the newline, and executes it faithfully.
The end result is that the preferred Perl gets the script.  I don't know
Python, so I don't know if such a trick exists there.

Cheers,
Kyle Moffett

-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCM/CS/IT/U d- s++: a18 C++++>$ UB/L/X/*++++(+)>$ P+++(++++)>$
L++++(+++) E W++(+) N+++(++) o? K? w--- O? M++ V? PS+() PE+(-) Y+
PGP+++ t+(+++) 5 X R? tv-(--) b++++(++) DI+ D+ G e->++++$ h!*()>++$  
r  !y?(-)
------END GEEK CODE BLOCK------





* Re: More problems...
From: Linus Torvalds @ 2005-05-02 22:19 UTC (permalink / raw)
  To: Anton Altaparmakov
  Cc: Petr Baudis, Russell King, Junio C Hamano, Ryan Anderson, git
In-Reply-To: <Pine.LNX.4.60.0505022258150.27741@hermes-1.csi.cam.ac.uk>



On Mon, 2 May 2005, Anton Altaparmakov wrote:
> 
> Yes, yes, I know all tools are perfect and never have bugs but I am
> paranoid.  (-;

I do agree.

I think hardlinks are wonderful for

 - "git farms" (ie something like what kernel.org does, but in a more 
   controlled manner - right now kernel.org is really just a standard
   location for different people putting their own files in).

   In this environment, doing hard-linking should also imply 

	- mounting the filesystem "noatime"
	- using a different UID for the hardlinked objects

   ie the "farm administrator" does the hardlinking automatically, and 
   chown()'s them to himself, so that different git trees cannot screw 
   each other up. The "noatime" thing is there because having different 
   users means that git's internal "O_NOATIME" optimization no longer 
   works, and you really want to avoid getting lots of write-backs just 
   for "atime".

 - people who have lots of trees. I think Jeff Garzik has something like
   20+ BK trees. At that point, hardlinking just makes sense, and your 
   work patterns are likely to be aware of the different trees anyway.

But for "normal" situations, where you have a tree or two, the hardlinking 
win might not be big enough to warrant the maintenance headache. With 
hardlinking, you _do_ need to "trust" the other trees to some degree.

		Linus


* Re: Trying to use AUTHOR_DATE
From: Krzysztof Halasa @ 2005-05-02 22:10 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Edgar Toernig, Linus Torvalds, H. Peter Anvin, Luck, Tony, git
In-Reply-To: <1114865964.24014.77.camel@localhost.localdomain>

David Woodhouse <dwmw2@infradead.org> writes:

> During a leap second, won't tm_sec be 60?

You could rather have two 59th seconds. Or the "seconds" could be, say,
0.1% longer for 1000 s. Depends on synchronization mechanism.

I think 60th second could only be possible with leap-seconds aware
things (NTP, GPS, reference radio clocks etc.).

> And in fact you don't seem to
> handle leap seconds at all, so isn't my_mktime going to be out by one
> second for every leap second which has occurred since 1970?

No, actually the system time (i.e., the number of seconds since 1970)
is already corrected (minutes are seconds/60, hrs = minutes/60 etc.)
You are off calculating time deltas, but I guess if you need such
accuracy your software already knows about leap seconds.
-- 
Krzysztof Halasa


* Re: How to get bash to shut up about SIGPIPE?
From: Paul Jackson @ 2005-05-02 22:17 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: torvalds, git, pasky
In-Reply-To: <20050430110410.GA25322@lsrfire.ath.cx>

Rene wrote:
> Are you sure it's SMP dependent?

No - I'm not sure.  It just happened to be that way on the couple of
systems I looked at (and I figured that in any case, it was a good bet
that Linus had multiple processors ;).

> Here's a patch for cg-log

Looks plausible to me.

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <pj@engr.sgi.com> 1.650.933.1373, 1.925.600.0401


* Re: More problems...
From: Anton Altaparmakov @ 2005-05-02 22:01 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Russell King, Junio C Hamano, Linus Torvalds, Ryan Anderson, git
In-Reply-To: <20050502193327.GB20818@pasky.ji.cz>

On Mon, 2 May 2005, Petr Baudis wrote:
> Dear diary, on Fri, Apr 29, 2005 at 11:57:53PM CEST, I got a letter
> where Anton Altaparmakov <aia21@cam.ac.uk> told me that...
> > There should definitely be an option to either enable or disable this as 
> > there are legitimate cases for not wanting hard links or indeed using 
> > file systems which do not support them.
> 
> Are there legitimate cases for not wanting hard links when you are able
> to create them? (Same filesystem, filesystem supports them...)

I would say yes.  For example, I want to update my git tools to the latest 
and greatest development version.  Do I really want to let it loose on all 
the repositories?  Probably not.  So I would want to make a clone of the 
repository that is not connected in any way with the old one and then 
try the new tools.  If there were hard links involved working on the 
cloned repository could potentially damage the original one.

Yes, yes, I know all tools are perfect and never have bugs but I am 
paranoid.  (-;

Best regards,

	Anton
-- 
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Linus Torvalds @ 2005-05-02 22:02 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Matt Mackall, Morten Welinder, Sean, linux-kernel, git
In-Reply-To: <427650E7.2000802@tmr.com>



On Mon, 2 May 2005, Bill Davidsen wrote:
> 
> If there is a functional reason to use git, something Mercurial doesn't 
> do, then developers will and should use git. But the associated hassles 
> with large change size, rather than the absolute size, are worth 
> considering.

Note that we discussed this early on, and the issues with full-file 
handling haven't changed. It does actually have real functional 
advantages:

 - you can share the objects freely between different trees, never 
   worrying about one tree corrupting another tree's object by mistake.
 - you can drop old objects.

delta models very fundamentally don't support this. 

For example, a simple tree re-linker will work on any mirror site, and
work reliably, even if I end up uploading new objects with some tool that
doesn't know to break hardlinks etc. That can easily be much more than a
10x win for a git repository site (imagine something like bkbits.net, but
for git).

Whether it is a huge deal or not, I don't know. I do know that the big 
deal to me is just the simplicity of the git object models. It makes me 
trust it, even in the presence of inevitable bugs. It's a very safe model, 
and right now safe is good.

		Linus


* Re: Trying to use AUTHOR_DATE
From: H. Peter Anvin @ 2005-05-02 22:26 UTC (permalink / raw)
  To: Krzysztof Halasa
  Cc: David Woodhouse, Edgar Toernig, Linus Torvalds, Luck, Tony, git
In-Reply-To: <m3wtqhe0t6.fsf@defiant.localdomain>

Krzysztof Halasa wrote:
> David Woodhouse <dwmw2@infradead.org> writes:
> 
>>During a leap second, won't tm_sec be 60?
> 
> You could rather have two 59th seconds. Or the "seconds" could be, say,
> 0.1% longer for 1000 s. Depends on synchronization mechanism.
>  
> I think 60th second could only be possible with leap-seconds aware
> things (NTP, GPS, reference radio clocks etc.).
> 

It is, but you can't assume you don't have that.  Either way, you just 
treat it the same as the following second.
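
For illustration, a conversion that ignores leap seconds gets this
behaviour for free. The Python sketch below is not the my_mktime
function being reviewed; epoch_seconds is a made-up name, and it leans
on calendar.timegm, which does plain arithmetic on the fields without
range-checking the seconds value.

```python
import calendar


def epoch_seconds(year, month, day, hour, minute, second):
    """Seconds since 1970-01-01 00:00:00 UTC, counting every day as 86400 s.

    No leap-second table is consulted, so a tm_sec of 60 simply lands on
    the same value as second 0 of the following minute.
    """
    return calendar.timegm((year, month, day, hour, minute, second, 0, 0, 0))


# 1998-12-31 ended with a leap second, 23:59:60 UTC.
leap = epoch_seconds(1998, 12, 31, 23, 59, 60)
following = epoch_seconds(1999, 1, 1, 0, 0, 0)

# The day that contained the leap second still measures 86400 in this
# count, which is why time deltas across it are off by one second.
day_length = following - epoch_seconds(1998, 12, 31, 0, 0, 0)
```

In this sketch leap equals following and day_length is 86400, even
though 86401 real seconds elapsed on that day.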

	-hpa


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Matt Mackall @ 2005-05-02 22:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Bill Davidsen, Morten Welinder, Sean, linux-kernel, git
In-Reply-To: <Pine.LNX.4.58.0505021457060.3594@ppc970.osdl.org>

On Mon, May 02, 2005 at 03:02:16PM -0700, Linus Torvalds wrote:
> 
> 
> On Mon, 2 May 2005, Bill Davidsen wrote:
> > 
> > If there is a functional reason to use git, something Mercurial doesn't 
> > do, then developers will and should use git. But the associated hassles 
> > with large change size, rather than the absolute size, are worth 
> > considering.
> 
> Note that we discussed this early on, and the issues with full-file 
> handling haven't changed. It does actually have real functional 
> advantages:
> 
>  - you can share the objects freely between different trees, never 
>    worrying about one tree corrupting another tree's object by mistake.

Not sure if this is terribly useful. It just makes it harder to pull
the subset you're interested in.

>  - you can drop old objects.

You can't drop old objects without dropping all the changesets that
refer to them or otherwise being prepared to deal with the broken
links.

> delta models very fundamentally don't support this. 

The latter can be done in a pretty straightforward manner in mercurial
with one pass over the data. But I have a goal to make keeping the
whole history cheap enough that no one balks at it.

> For example, a simple tree re-linker will work on any mirror site, and
> work reliably, even if I end up uploading new objects with some tool that
> doesn't know to break hardlinks etc. That can easily be much more than a
> 10x win for a git repository site (imagine something like bkbits.net, but
> for git).

What is a tree re-linker? Finds duplicate files and hard-links them?
Ok, that makes some sense. But it's a win on one machine and a lose
everywhere else.

> Whether it is a huge deal or not, I don't know. I do know that the big 
> deal to me is just the simplicity of the git object models. It makes me 
> trust it, even in the presence of inevitable bugs. It's a very safe model, 
> and right now safe is good.

I've added an "hg verify" command to Mercurial. It doesn't attempt to
fix anything up yet, but it can catch a couple things that git
probably can't (like file revisions that aren't owned by any
changeset), namely because there's more metadata around to look at.

I'll probably post an updated version tomorrow, I'm beginning to work
on a git2hg script.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: Next problem: cg-commit
From: Petr Baudis @ 2005-05-02 22:30 UTC (permalink / raw)
  To: Russell King; +Cc: git
In-Reply-To: <20050429215118.D30010@flint.arm.linux.org.uk>

Dear diary, on Fri, Apr 29, 2005 at 10:51:18PM CEST, I got a letter
where Russell King <rmk@arm.linux.org.uk> told me that...
> Unfortunately, cg-commit seems to return wrong exit status, returning
> 1 on success.  Eg:
> 
> $ cg-commit
> arch/arm/mach-ixp2000/pci.c
> include/asm-arm/arch-ixp2000/platform.h
> Enter commit message, terminated by ctrl-D on a separate line:
> blah blah blah
> Committed as fafb525292acc9c0818b91b1d8e58cf770616542.
> $ echo $?
> 1
> 
> It appears that [ "$merging" ] towards the end of cg-commit is the
> cause of this odd behaviour.  Force zero exit status, since we
> successfully completed.

Nice find, thanks. I added it to few other files too.
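
The effect is easy to reproduce outside cg-commit. The Python sketch
below runs two tiny shell fragments through sh; the fragments are
hypothetical stand-ins for the end of the script, not the real cg-commit
source. A script's exit status is that of its last command, so a
trailing test on an empty variable makes the whole script report
failure.

```python
import subprocess


def run_sh(script):
    """Run a shell fragment with sh and return its exit status."""
    result = subprocess.run(["sh", "-c", script], capture_output=True)
    return result.returncode


# Hypothetical tail of a script: the work succeeded, then an optional
# cleanup step is guarded by a test on $merging, which is empty here.
BROKEN = 'merging=""; echo "Committed."; [ "$merging" ] && echo "merge cleanup"'

# Same fragment with an explicit zero exit status forced at the end.
FIXED = BROKEN + "; exit 0"

broken_status = run_sh(BROKEN)
fixed_status = run_sh(FIXED)
```

The first fragment exits with status 1 although nothing went wrong; the
second exits with 0.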

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor


* Re: Mercurial 0.4b vs git patchbomb benchmark
From: Linus Torvalds @ 2005-05-02 22:49 UTC (permalink / raw)
  To: Matt Mackall; +Cc: Bill Davidsen, Morten Welinder, Sean, linux-kernel, git
In-Reply-To: <20050502223002.GP21897@waste.org>



On Mon, 2 May 2005, Matt Mackall wrote:
> > 
> >  - you can share the objects freely between different trees, never 
> >    worrying about one tree corrupting another tree's object by mistake.
> 
> Not sure if this is terribly useful. It just makes it harder to pull
> the subset you're interested in.

You don't have to share things in a single subdirectory. Symlinks and 
hardlinks work fine, as do actual filesystem tricks ;)

> >  - you can drop old objects.
> 
> You can't drop old objects without dropping all the changesets that
> refer to them or otherwise being prepared to deal with the broken
> links.

Absolutely. This needs support from fsck to allow us to say "commit xxxx 
is no longer in the tree, because we pruned it".

Alternatively (and that's the much less intrusive one), you keep all the
commit objects, but drop the tree and blob objects. Again, all you need 
for this to work is just feed a list of commits to fsck, and tell it 
"we've pruned those from the tree", which tells fsck not to start looking 
for the contents of those commits.

So for example, you can trivially have something that automates this: take 
each commit that is older than <x> days, add it to the "prune list", and 
run fsck, and delete all objects that now show up as being unreachable 
(since fsck won't be looking at what those commits reference).

I could write this up in ten minutes. It's really simple.

And it's simple _exactly_ because we don't do deltas.
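
The age-based selection step could be sketched roughly as below. The input format (one "<sha1> <commit time in seconds>" pair per line) and the name build_prune_list are assumptions made for illustration. The fsck prune-list support is only proposed in this mail, so it appears here as a comment rather than a command.

```shell
#!/bin/sh
# build_prune_list DAYS NOW
# Reads "<sha1> <commit-time-in-seconds>" lines on stdin and prints the
# sha1 of every commit older than DAYS days relative to NOW (seconds).
build_prune_list() {
    awk -v days="$1" -v now="$2" '
        NF >= 2 && $2 < now - days * 86400 { print $1 }
    '
}

# Example with made-up commit ids and timestamps.
now=1115074140
printf '%s\n' \
    "aaaa111 $((now - 100 * 86400))" \
    "bbbb222 $((now - 10 * 86400))" \
    "cccc333 $((now - 40 * 86400))" |
    build_prune_list 30 "$now"
# The resulting list would then be handed to an fsck that accepts a
# prune list (hypothetical), and whatever it reports as unreachable
# could be deleted.
```

With a 30-day cutoff, the example prints aaaa111 and cccc333 and keeps bbbb222.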

> > delta models very fundamentally don't support this. 
> 
> The latter can be done in a pretty straightforward manner in mercurial
> with one pass over the data. But I have a goal to make keeping the
> whole history cheap enough that no one balks at it.

With deltas, you have two choices:

 - change all the sha1 names (ie a pruned tree would no longer be 
   compatible with a non-pruned one)
 - make the delta part not show up as part of the sha1 name (which means 
   that it's unprotected).

which one would you have?

> What is a tree re-linker? Finds duplicate files and hard-links them?
> Ok, that makes some sense. But it's a win on one machine and a lose
> everywhere else.

Where would it be a loss? Especially since with git, it's cheap (you don't 
need to compare content to find objects to link - you can just compare 
filename listings).
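
A name-only re-linker might look something like this. It is a minimal sketch, assuming two object directories that use the same relative paths and live on the same filesystem; the name relink_objects and the object name in the demo are made up.

```shell
#!/bin/sh
# relink_objects SRC DST
# For every file in DST that also exists under the same relative path in
# SRC, replace the DST copy with a hard link to the SRC file. No content
# comparison is done: equal names are taken to mean equal objects.
relink_objects() {
    src=$1
    dst=$2
    (cd "$dst" && find . -type f) | while IFS= read -r name; do
        # Skip names missing from SRC and files that are already linked.
        if [ -f "$src/$name" ] && ! [ "$src/$name" -ef "$dst/$name" ]; then
            ln -f "$src/$name" "$dst/$name"
        fi
    done
}

# Demo on throwaway directories.
tmp=$(mktemp -d)
mkdir -p "$tmp/a/fa" "$tmp/b/fa"
echo data > "$tmp/a/fa/fb5252"
echo data > "$tmp/b/fa/fb5252"
relink_objects "$tmp/a" "$tmp/b"
ls -li "$tmp/a/fa/fb5252" "$tmp/b/fa/fb5252"   # both lines show the same inode
rm -rf "$tmp"
```

The only per-file work is an existence check and a link call, which is why the cost stays low compared with comparing contents.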

> I've added an "hg verify" command to Mercurial. It doesn't attempt to
> fix anything up yet, but it can catch a couple things that git
> probably can't (like file revisions that aren't owned by any
> changeset), namely because there's more metadata around to look at.

git-fsck-cache catches exactly those kinds of things. And since it checks
pretty much every _single_ assumption in git (which is not a lot, since
git doesn't have a lot of assumptions), I guarantee you that you can't
find any more than it does (the filename ordering is the big missing
piece: I _still_ don't verify that trees are ordered. I've been mentioning
it since the beginning, but I'm lazy).

In other words, your verifier can't verify anything more. It's entirely 
possible that more things can go _wrong_, since you have more indexes, so 
your verifier will have more to check, but that's not an advantage, that's 
a downside.

		Linus

^ permalink raw reply

* Re: cogito: linux-2.6 merge fails due to cg-rm
From: Petr Baudis @ 2005-05-02 23:10 UTC (permalink / raw)
  To: Matt Porter; +Cc: git
In-Reply-To: <20050502102034.B21716@cox.net>

Dear diary, on Mon, May 02, 2005 at 07:20:34PM CEST, I got a letter
where Matt Porter <mporter@kernel.crashing.org> told me that...
> On Mon, May 02, 2005 at 10:12:50AM -0700, Matt Porter wrote:
> > These kept showing up as "needs merged" even though I explicitly
> > tried to cg-rm them or "update-cache --remove" them. It turns out
> > that cg-rm is 'rm -f'ing the files before calling update-cache.
> > By touching each file, and then modifying cg-rm as follows, I
> > was able to complete the merge. I'm not sure yet if this is the
> > proper fix to the cogito script. It at least made update-cache
> > happy for this remove case.
> 
> Looking a bit further, I see the cg-Xmergefile also removes the
> file before update-cache --remove which doesn't seem to work. This
> seems to be the actual culprit during the merge, but cg-rm needed
> fixed to manually fix without calling git commands directly.

Oops, this was de facto fixed a long time ago (by Junio) but for some
reason was not physically merged to my tree. (He already pointed it out,
but I re-forgot to fix it back then.) update-cache --remove is supposed
to remove missing files, so the cg- code is correct.

Should be fixed now. Thanks for pointing it out.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

* Re: How to get bash to shut up about SIGPIPE?
From: Petr Baudis @ 2005-05-02 23:17 UTC (permalink / raw)
  To: Rene Scharfe; +Cc: Paul Jackson, Linus Torvalds, git
In-Reply-To: <20050430110410.GA25322@lsrfire.ath.cx>

Dear diary, on Sat, Apr 30, 2005 at 01:04:10PM CEST, I got a letter
where Rene Scharfe <rene.scharfe@lsrfire.ath.cx> told me that...
> On Fri, Apr 29, 2005 at 11:29:22PM -0700, Paul Jackson wrote:
> > Linus replied to pj:
> > > > Code Sample 2:
> > > > ...
> > > Didn't change anything for me. Same thing.
> > 
> > I don't believe you did what I did.
> > 
> > The source code for bash, both 2.x and 3.x versions, clearly displays a
> > simpler error message (no line number or redisplay of your script
> > commands) in the case that you set a trap.  And I tested both shells on
> > a multiprocessor, to verify that they behaved as I expected, running
> > these silly little scripts.
> 
> I don't have a multiprocessor and I see the same.  Are you sure it's SMP
> dependent?
> 
> Your solution (trapping _inside_ the job, too) works for me, btw.  Here's
> a patch for cg-log that reduces the clutter to two "Broken pipe" lines
> (pun not intended).

Could you elaborate on how exactly it is supposed to help? I see
identical behaviour with the traps and without them (UP, bash-2.05b).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
C++: an octopus made by nailing extra legs onto a dog. -- Steve Taylor

^ permalink raw reply

