Using git to track my PhD thesis, couple of questions

* Using git to track my PhD thesis, couple of questions
@ 2009-08-27 20:34 seanh
  2009-08-27 20:41 ` Sverre Rabbelier
                   ` (3 more replies)
  0 siblings, 4 replies; 23+ messages in thread
From: seanh @ 2009-08-27 20:34 UTC (permalink / raw)
  To: git

I'm planning to use git to track my PhD thesis as I work on it and to
let my supervisors track it. I've set up a git repository and a gitweb
instance showing it. There are a couple of specific requirements.

1. My supervisors don't want to see all the little commits that I make 
day by day. So I'll commit to a dev branch, then whenever I've made 
significant progress I'll merge it into a trunk branch. I want the trunk
branch to get all the changes but as one big commit, not inherit all the 
little commits like a normal merge would do. I think this is a `git 
merge --squash`. Btw the help for that command ends quite brilliantly: 
"(or more in case of an octopus)".

2. They don't want to look at the latex source but the PDFs built from 
it, which they're going to annotate with their comments. So I need an 
easy way for them to get the PDF of each commit from gitweb without 
having to check out the repo and build it themselves. Normally I
wouldn't commit the PDF files into the repo because they're compiled 
files not source files, but it seems that just building a PDF and 
committing it along with each commit to trunk would be by far the 
easiest way to achieve this. But will git store the PDFs efficiently, or 
will the repo start to get really big?

Thanks

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-27 20:34 Using git to track my PhD thesis, couple of questions seanh
@ 2009-08-27 20:41 ` Sverre Rabbelier
  2009-08-27 20:55 ` Matthieu Moy
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 23+ messages in thread
From: Sverre Rabbelier @ 2009-08-27 20:41 UTC (permalink / raw)
  To: seanh; +Cc: git

Heya,

On Thu, Aug 27, 2009 at 13:34, seanh<seanh.nospam@gmail.com> wrote:
> 2. They don't want to look at the latex source but the PDFs built from
> it, which they're going to annotate with their comments. So I need an
> easy way for them to get the PDF of each commit from gitweb without
> having to checkout the repo and build it themselves. Normally I
> wouldn't commit the PDF files into the repo because they're compiled
> files not source files, but it seems that just building a PDF and
> committing it along with each commit to trunk would be by far the
> easiest way to achieve this. But will git store the PDFs efficiently, or
> will the repo start to get really big?

If they only care about the pdf anyway, why not have a separate branch
to which you commit the pdf's instead?

-- 
Cheers,

Sverre Rabbelier

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-27 20:34 Using git to track my PhD thesis, couple of questions seanh
  2009-08-27 20:41 ` Sverre Rabbelier
@ 2009-08-27 20:55 ` Matthieu Moy
  2009-08-28  8:34   ` Paolo Bonzini
  2009-08-27 21:38 ` Junio C Hamano
  2009-08-27 22:21 ` demerphq
  3 siblings, 1 reply; 23+ messages in thread
From: Matthieu Moy @ 2009-08-27 20:55 UTC (permalink / raw)
  To: seanh; +Cc: git

seanh <seanh.nospam@gmail.com> writes:

> I'm planning to use git to track my PhD thesis as I work on it and to 
> let my supervisors track it. I've setup a git repository and a gitweb 
> instance showing it. There are a couple of specific requirements.
>
> 1. My supervisors don't want to see all the little commits that I make 
> day by day.

I'm not sure I understand why you want that. From what you say, your
supervisors won't be looking at the LaTeX source, so they won't read
the diffs for the commits. Instead, they will be looking at regular
snapshots in PDF. So why would keeping the intermediate commits be a
problem?

> So I'll commit to a dev branch, then whenever I've made 
> significant progress will merge it into a trunk branch. I want the trunk 
> branch to get all the changes but as one big commit, not inherit all the 
> little commits like a normal merge would do. I think this is a `git 
> merge --squash`.

It is, but this also means _you_ will in a sense lose your intermediate
commits. Well, you may not really lose them, but after a merge
--squash, you have two options to continue working: work on top of the
squashed commit (and then your ancestry doesn't contain the small
ones), or work on top of your previous branch (and then you don't
have proper merge tracking, and you'll get spurious conflicts if you
try another merge --squash).
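
The loss of merge tracking can be seen in a throwaway repository. This is
a minimal sketch, assuming the branch names `master` and `dev` and a file
called `thesis.tex` (all chosen for illustration only):

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "intro" > thesis.tex
git add thesis.tex
git commit -q -m "initial"
git branch -M master            # fixed branch name for the demo

# Fine-grained work on a development branch.
git checkout -q -b dev
echo "chapter 1" >> thesis.tex
git commit -q -a -m "small commit 1"
echo "chapter 2" >> thesis.tex
git commit -q -a -m "small commit 2"

# Squash-merge: stages the combined changes but does not commit them.
git checkout -q master
git merge -q --squash dev
git commit -q -m "significant progress"

# Count the parents of the squashed commit.
parents=$(git log -1 --format=%P master | wc -w | tr -d ' ')
echo "parents of squashed commit: $parents"

# Count the commits on dev that git still considers unmerged into master.
unmerged=$(git rev-list --count master..dev)
echo "commits on dev not reachable from master: $unmerged"
```

The squashed commit records a single parent, so as far as the history
graph is concerned the small commits on `dev` were never merged.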

> 2. They don't want to look at the latex source but the PDFs built from 
> it, which they're going to annotate with their comments. So I need an 
> easy way for them to get the PDF of each commit from gitweb without 
> having to checkout the repo and build it themselves.

Well, they never need a PDF other than the latest version, will they?
Then, you don't need Git to send them your PDFs, just upload the PDFs
somewhere where your supervisors can grab them periodically, and
you're done.

The issue is when they start modifying the LaTeX files: then you have
to think of merging, and you'd better do that with a revision control
system.

I also used a revision control system to write my Ph.D. thesis (Git was
born after I started writing, so it wasn't Git yet), and my reviewing
system was simpler still: when a chapter is done, send an email with
the PDF attached, and "Hi, chapter $n is done, can you have a look?".
That just works.

> Normally I wouldn't commit the PDF files into the repo because
> they're compiled files not source files, but it seems that just
> building a PDF and committing it along with each commit to trunk
> would be by far the easiest way to achieve this. But will git store
> the PDFs efficiently, or will the repo start to get really big?

Git will do delta-compression as best it can, but I don't think PDFs will
delta-compress very well, so your repository may grow rather quickly,
yes. If possible, commit the PDFs on a separate branch so that you can
easily keep your clean history small in disk space, and discard the
PDFs if needed.
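
A minimal sketch of the separate-branch idea, using a throwaway
repository. The branch name `pdfs`, the file names, and the placeholder
PDF content are assumptions made for illustration; a real setup would
build the PDF from the LaTeX source instead.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "chapter one" > thesis.tex
git add thesis.tex
git commit -q -m "source only"
git branch -M master                 # fixed branch name for the demo

# Placeholder standing in for the real build output.
pdf=$(mktemp)
printf 'placeholder PDF bytes\n' > "$pdf"

git checkout -q --orphan pdfs        # new branch sharing no history with master
git rm -q -rf .                      # start the branch from an empty tree
cp "$pdf" thesis.pdf
git add thesis.pdf
git commit -q -m "PDF snapshot 1"
git checkout -q master

# List what each branch tracks.
echo "files on master: $(git ls-tree -r --name-only master)"
echo "files on pdfs:   $(git ls-tree -r --name-only pdfs)"
```

Because the two branches have no commits in common, the source history
never references a PDF, and the `pdfs` branch can be dropped later
without rewriting anything on `master`.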

-- 
Matthieu

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-27 20:34 Using git to track my PhD thesis, couple of questions seanh
  2009-08-27 20:41 ` Sverre Rabbelier
  2009-08-27 20:55 ` Matthieu Moy
@ 2009-08-27 21:38 ` Junio C Hamano
  2009-08-27 22:21 ` demerphq
  3 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2009-08-27 21:38 UTC (permalink / raw)
  To: seanh; +Cc: git

seanh <seanh.nospam@gmail.com> writes:

> I'm planning to use git to track my PhD thesis as I work on it and to 
> let my supervisors track it. I've setup a git repository and a gitweb 
> instance showing it. There are a couple of specific requirements.
>
> 1. My supervisors don't want to see all the little commits that I make 
> day by day. So I'll commit to a dev branch, then whenever I've made 
> significant progress will merge it into a trunk branch. I want the trunk 
> branch to get all the changes but as one big commit, not inherit all the 
> little commits like a normal merge would do. I think this is a `git 
> merge --squash`. Btw the help for that command ends quite brilliantly: 
> "(or more in case of an octopus)".
>
> 2. They don't want to look at the latex source but the PDFs built from 
> it, which they're going to annotate with their comments. So I need an 
> easy way for them to get the PDF of each commit from gitweb without 
> having to checkout the repo and build it themselves. Normally I 
> wouldn't commit the PDF files into the repo because they're compiled 
> files not source files, but it seems that just building a PDF and 
> committing it along with each commit to trunk would be by far the 
> easiest way to achieve this. But will git store the PDFs efficiently, or 
> will the repo start to get really big?

What I would do if I were you (and I did something similar recently while
working on my book) is something like this:

 * Keep your source in git.  Do not worry about the commit granularity.
   Commit as often as you think makes sense.

 * Have a Makefile to build pdf if you have not done so.

 * Dedicate a separate directory for review purposes.  Have a separate git
   repository there.  If you choose to use an untracked subdirectory
   'publish' of your source work tree (you do not have to), you would do
   something like this:

        $ mkdir publish
        $ (cd publish && git init)

   Arrange things so that "git push" in that repository will propagate its
   contents to the public repository your advisors will look at.

 * Have a 'publish' target in your Makefile, which would roughly do:

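        #!/bin/sh
        # Build the PDF, then commit it to the separate 'publish' repository
        # with a log of the source commits made since the last publication.

        make pdf &&
        cp paper.pdf publish/. &&

        # Source commit being published now.
        this=$(git rev-parse HEAD) &&
        # Source commit recorded by the previous publication
        # (empty on the first run, when 'publish' has no commits yet).
        prev=$(cd publish &&
               git show -s 2>/dev/null |
               sed -ne 's/^ *Changes up to: \(.*\)$/\1/p'
        ) &&
        {
                echo "Changes up to: $this"
                echo
                case "$prev" in
                '') # initial round; name HEAD so shortlog never reads stdin
                        git shortlog HEAD ;;
                ?*)
                        git shortlog "$prev.." ;;
                esac
        } >publish/log &&
        cd publish &&
        git add paper.pdf &&
        git commit -F log &&
        git push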

 * Then when you want to submit the current status for review (perhaps you
   would want this to happen at the end of each day, or every other day,
   or whatever), type

    $ make publish

The idea is:

 (1) If your source material is not interesting to your advisors at all,
     there is no point showing it to them, let alone the commit
     granularity of your work; and

 (2) If your advisors want to see PDF and PDF only, then give them that,
     but as you correctly said, that is cruft from your source's point
     of view, so do not mix them together.

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-27 20:34 Using git to track my PhD thesis, couple of questions seanh
                   ` (2 preceding siblings ...)
  2009-08-27 21:38 ` Junio C Hamano
@ 2009-08-27 22:21 ` demerphq
  3 siblings, 0 replies; 23+ messages in thread
From: demerphq @ 2009-08-27 22:21 UTC (permalink / raw)
  To: seanh; +Cc: git

2009/8/27 seanh <seanh.nospam@gmail.com>:
> 2. They don't want to look at the latex source but the PDFs built from
> it, which they're going to annotate with their comments. So I need an
> easy way for them to get the PDF of each commit from gitweb without
> having to checkout the repo and build it themselves. Normally I
> wouldn't commit the PDF files into the repo because they're compiled
> files not source files, but it seems that just building a PDF and
> committing it along with each commit to trunk would be by far the
> easiest way to achieve this. But will git store the PDFs efficiently, or
> will the repo start to get really big?

As you can generate the PDF's from the latex then just hack gitweb to
let them download it from there.

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-27 20:55 ` Matthieu Moy
@ 2009-08-28  8:34   ` Paolo Bonzini
  2009-08-28  8:46     ` Matthieu Moy
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2009-08-28  8:34 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: seanh, git

On 08/27/2009 10:55 PM, Matthieu Moy wrote:
> seanh<seanh.nospam@gmail.com>  writes:
>
>> I'm planning to use git to track my PhD thesis as I work on it and to
>> let my supervisors track it. I've setup a git repository and a gitweb
>> instance showing it. There are a couple of specific requirements.
>>
>> 1. My supervisors don't want to see all the little commits that I make
>> day by day.
>
> I'm not sure I understand why you want that. From what you say, your
> supervisors won't be looking at the LaTeX source, so they won't read
> the diffs for the commits. Instead, they will be looking at regular
> snapshots in PDF. So, how is that disturbing to keep the intermediate
> commits ?
>
>> So I'll commit to a dev branch, then whenever I've made
>> significant progress will merge it into a trunk branch. I want the trunk
>> branch to get all the changes but as one big commit, not inherit all the
>> little commits like a normal merge would do. I think this is a `git
>> merge --squash`.
>
> It is, but this also means _you_ will somehow lose your intermediate
> commits. Well, you may not really lose them, but after a merge
> --squash, you have two options to continue working: work on top of the
> squashed commit (and then your ancestry doesn't contain the small
> ones), or work on top of your previous branches (and then, you don't
> have a proper merge tracking, and you'll get spurious conflicts if you
> try another merge --squash).

You can also merge from the master to your working branch after every 
merge --squash.

    ... work on local ...
    git commit
    ... work on local ...
    git commit

    git checkout master
    git merge --squash local; git commit -m'day 1'
    git checkout local
    git merge master

> I also used a revision control system to write my Ph.D (Git was born
> after I started writting, so it wasn't Git yet), and my reviewing
> system has been all the more simple: when a chapter is done, send an
> email with the PDF attached, and "Hi, chapter $n is done, can you have
> a look?". That just works.

That's what I did too.  I used git, but only locally.  I never published
the repository for my supervisor; she didn't care.

>> Normally I wouldn't commit the PDF files into the repo because
>> they're compiled files not source files, but it seems that just
>> building a PDF and committing it along with each commit to trunk
>> would be by far the easiest way to achieve this. But will git store
>> the PDFs efficiently, or will the repo start to get really big?
>
> Git will do delta-compression as it can, but I don't think PDFs will
> delta-compress very well, so your repository may grow rather quickly,
> yes. If possible, commit the PDFs on a separate branch so that you can
> easily keep your clean history small in disk space, and discard the
> PDFs if needed.

That's good advice.  Remember to delete the branch reflog too if you
want to clean the history.
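
A sketch of what discarding such a branch could look like, assuming a
branch called `pdfs` holding a single placeholder file (names invented
for the demo). Expiring every reflog entry and pruning immediately is
deliberately aggressive, and is probably only appropriate in a
repository where nothing else needs recovering.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "chapter one" > thesis.tex
git add thesis.tex
git commit -q -m "source"
git branch -M master                     # fixed branch name for the demo

# A branch holding one binary stand-in.
git checkout -q -b pdfs
printf 'placeholder PDF bytes\n' > thesis.pdf
git add thesis.pdf
git commit -q -m "PDF snapshot"
blob=$(git rev-parse HEAD:thesis.pdf)    # remember the PDF's object id
git checkout -q master

git branch -D pdfs >/dev/null            # drop the branch and its own reflog
git reflog expire --expire=now --all     # drop remaining reflog entries (e.g. HEAD's)
git gc --quiet --prune=now               # delete objects that are now unreachable

if git cat-file -e "$blob" 2>/dev/null; then
    echo "PDF blob still stored"
else
    echo "PDF blob is gone"
fi
```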

You can also try \pdfcompresslevel=0, which would probably make 
delta-compression behave better at the expense of distributing bigger 
files to your supervisor.  If you use hyperref, see this:

    http://www.tug.org/pipermail/pdftex/2003-August/004402.html

Best of all would be to have filters doing/undoing the PDF compression, 
but I know of no free program doing this.

Paolo

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28  8:34   ` Paolo Bonzini
@ 2009-08-28  8:46     ` Matthieu Moy
  2009-08-28 13:37       ` seanh
  0 siblings, 1 reply; 23+ messages in thread
From: Matthieu Moy @ 2009-08-28  8:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: seanh, git

Paolo Bonzini <bonzini@gnu.org> writes:

> You can also merge from the master to your working branch after every
> merge --squash.

Yes, good point. I didn't think of this, but it works because ...

>    ... work on local ...
>    git commit
>    ... work on local ...
>    git commit
>
>    git checkout master
>    git merge --squash local; git commit -m'day 1'

... this should fast-forward, so get the same tree as in branch
'local' and ...

>    git checkout local
>    git merge master

... then this is a merge of two identical trees, so it's trivial.
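
The identical-trees argument can be checked directly by comparing tree
ids. A minimal sketch in a scratch repository; the file name and commit
messages are made up, and `--no-edit` is only there so the merge does
not open an editor:

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "intro" > thesis.tex
git add thesis.tex
git commit -q -m "initial"
git branch -M master                 # fixed branch name for the demo

git checkout -q -b local
echo "para 1" >> thesis.tex
git commit -q -a -m "wip 1"
echo "para 2" >> thesis.tex
git commit -q -a -m "wip 2"

git checkout -q master
git merge -q --squash local
git commit -q -m "day 1"

# Both branch tips should now point at the same tree object.
master_tree=$(git rev-parse 'master^{tree}')
local_tree=$(git rev-parse 'local^{tree}')
[ "$master_tree" = "$local_tree" ] && echo "trees are identical"

# Merging master back records the relationship in the history graph.
git checkout -q local
git merge -q --no-edit master
git merge-base --is-ancestor master local && echo "master is an ancestor of local"
```

After the merge back, the next `git merge --squash local` on master
starts from the squashed commit as its merge base rather than from the
original branch point.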

-- 
Matthieu

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28  8:46     ` Matthieu Moy
@ 2009-08-28 13:37       ` seanh
  2009-08-28 13:51         ` Matthieu Moy
                           ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: seanh @ 2009-08-28 13:37 UTC (permalink / raw)
  To: git

Wow, really helpful responses, thanks a lot.

I think having read all this that I'll do it manually. I'll still use 
git to track my latex source and will commit to it as often as I like 
and not worry about commit granularity. Whenever I've finished a 
significant chunk I'll add a PDF of it to a manually edited web page 
along with a description of what changed since the last time I added a 
PDF. I can use git log etc to help write the manual changelog. My 
supervisors can just look at this manually constructed page and if it 
gets too big I'll just archive the oldest PDFs. I can tag the git repo 
at the points where I add a PDF to the web page. I guess this is pretty 
close to what software projects do with version releases and their 
public website.
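
A rough sketch of the tag-and-changelog part of this plan. The tag names
`draft-1` and `draft-2`, the file name, and the commit messages are
placeholders invented for the example:

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "intro" > thesis.tex
git add thesis.tex
git commit -q -m "Write introduction"
git tag -a draft-1 -m "First PDF put on the web page"

echo "methods" >> thesis.tex
git commit -q -a -m "Draft methods chapter"
echo "results" >> thesis.tex
git commit -q -a -m "Add first results"
git tag -a draft-2 -m "Second PDF put on the web page"

# Commit subjects between the two tags, oldest first, as raw material
# for a hand-edited changelog.
changelog=$(git log --reverse --format='- %s' draft-1..draft-2)
printf 'Changes since draft-1:\n%s\n' "$changelog"
```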

On Thu, Aug 27, 2009 at 01:41:04PM -0700, Sverre Rabbelier wrote:
> If they only care about the pdf anyway, why not have a separate branch
> to which you commit the pdf's instead?

Well I was thinking they'd look at the changelogs with the diffs showing 
exactly what changed in the latex source files, which should be pretty 
self-explanatory, but then when they wanted to read a whole chapter and 
add comments to it they'd want the PDF not the latex.

I don't really understand the script Junio posted (not literate in sh) 
but I think it might have something to do with copying changelogs over 
from the source repo to a PDFs repo.

On Fri, Aug 28, 2009 at 12:21:42AM +0200, demerphq wrote:
> As you can generate the PDF's from the latex then just hack gitweb to
> let them download it from there.

Unfortunately gitweb is written in Perl. But I know what you mean, it 
should in theory be possible for them to click on a 'Get PDF' link for a 
particular revision that causes the PDF to be built and returned to 
their browser.
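
Whatever sits behind such a link would first need the sources of the
requested revision. One way to get them without touching the working
tree is `git archive`; the sketch below assumes a helper named
`export_revision` (hypothetical, not part of git or gitweb) and leaves
the actual LaTeX build out.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "first draft" > thesis.tex
git add thesis.tex
git commit -q -m "first draft"
echo "second draft" > thesis.tex
git commit -q -a -m "second draft"

# Hypothetical helper: unpack the tree of revision $1 into directory $2.
# A build step (for example "make pdf") could then run inside $2.
export_revision() {
    rev=$1
    dest=$2
    mkdir -p "$dest"
    git archive "$rev" | tar -xf - -C "$dest"
}

scratch=$(mktemp -d)
export_revision HEAD~1 "$scratch"
echo "exported content: $(cat "$scratch/thesis.tex")"
```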

In response to Matthieu and Paolo, I'm not sure I understand the git
internals involved in the discussion around merge --squash. I had a
feeling this would produce a 'merge' that git in some sense would 'not
know about'. Since it sounds complex and I don't understand it, I don't
think I want to go there.

Thanks all

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 13:37       ` seanh
@ 2009-08-28 13:51         ` Matthieu Moy
  2009-08-28 13:54           ` Matthias Andree
  2009-08-28 21:42           ` Using git to track my PhD thesis, couple of questions david
  2009-08-28 15:50         ` Paolo Bonzini
  2009-08-28 16:12         ` demerphq
  2 siblings, 2 replies; 23+ messages in thread
From: Matthieu Moy @ 2009-08-28 13:51 UTC (permalink / raw)
  To: seanh; +Cc: git

seanh <seanh.nospam@gmail.com> writes:

> In response to Matthieu and Paolo, I'm not sure I understand the git 
> internals involved in the discussion around merge --squash, I had a 
> feeling this would produce a 'merge' that git in some sense would 'not 
> know about',

Yes, that's it. Git does a merge, and immediately forgets it was a
merge. The consequence is when you merge again later, Git will not be
able to use the merge information to be clever about merging. Somehow,
Git will be as bad as SVN for merging if you don't know what you're
doing ;-).

> since it sounds complex and I don't understand it I don't think I
> want to go there.

Well, it's also fun to learn Git concepts in more detail ;-).

-- 
Matthieu

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 13:51         ` Matthieu Moy
@ 2009-08-28 13:54           ` Matthias Andree
  2009-08-28 15:12             ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Jakub Narebski
  2009-08-28 21:42           ` Using git to track my PhD thesis, couple of questions david
  1 sibling, 1 reply; 23+ messages in thread
From: Matthias Andree @ 2009-08-28 13:54 UTC (permalink / raw)
  To: git

Matthieu Moy schrieb:
> seanh <seanh.nospam@gmail.com> writes:
> 
>> In response to Matthieu and Paolo, I'm not sure I understand the git 
>> internals involved in the discussion around merge --squash, I had a 
>> feeling this would produce a 'merge' that git in some sense would 'not 
>> know about',
> 
> Yes, that's it. Git does a merge, and immediately forgets it was a
> merge. The consequence is when you merge again later, Git will not be
> able to use the merge information to be clever about merging. Somehow,
> Git will be as bad as SVN for merging if you don't know what you're
> doing ;-).

To be fair, SVN versions 1.5 and newer can track merges. If the repository
predates 1.5, it has to be updated on the server side (see the release notes for
details). It just tracks which revisions have been merged and which have not. For
further details, see the svn book (http://svnbook.red-bean.com/ IIRC).

* Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions)
  2009-08-28 13:54           ` Matthias Andree
@ 2009-08-28 15:12             ` Jakub Narebski
  2009-08-28 15:29               ` Avery Pennarun
  2009-08-30 19:41               ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Sam Vilain
  0 siblings, 2 replies; 23+ messages in thread
From: Jakub Narebski @ 2009-08-28 15:12 UTC (permalink / raw)
  To: Matthias Andree; +Cc: git, Matthieu Moy

Matthias Andree <matthias.andree@gmx.de> writes:
> Matthieu Moy schrieb:
>> seanh <seanh.nospam@gmail.com> writes:
>> 
>>> In response to Matthieu and Paolo, I'm not sure I understand the git 
>>> internals involved in the discussion around merge --squash, I had a 
>>> feeling this would produce a 'merge' that git in some sense would 'not 
>>> know about',
>> 
>> Yes, that's it. Git does a merge, and immediately forgets it was a
>> merge. The consequence is when you merge again later, Git will not be
>> able to use the merge information to be clever about merging. Somehow,
>> Git will be as bad as SVN for merging if you don't know what you're
>> doing ;-).
> 
> To be fair, SVN versions 1.5 and newer can track merges. If the
> repository predates 1.5, it has to be updated on the server side
> (see the release notes for details). It just tracks which revisions
> have been merged and which not, for further details, see the svn
> book. (http://svnbook.red-bean.com/ IIRC)

From what I understand (from what I have read, and browsed, and
lurked, and noticed), Subversion 1.5+ does merge tracking, but
in a very different way than Git:

 * the svn:mergeinfo is a client-side property; if I understand
   correctly this would help you in repeated merges, but not anyone
   else

 * svn:mergeinfo contains _per-file_ merge info, so it is much, much
   more "chatty" than Git's multiple parents.  This might be a more
   powerful approach, in the same sense that merge strategies more
   advanced than 3-way merge were more powerful -- but 3-way merge
   is best because it is simple (either the case is simple enough that
   3-way merge suffices, or it is so complicated that manual
   intervention is required anyway).

 * You have to explicitly enable using svn:mergeinfo in log and blame

 * The command to merge trunk into branch is different from command to
   merge branch into trunk.

Also IIRC there is a warning (well, at least there was one in the
Subversion 1.5 release notes) that merge tracking doesn't work entirely
correctly in the face of criss-cross merges (multiple merge bases) and
renaming (although I do hope that they fixed the problem with silent
corruption if there is a rename during a merge).

-- 
Jakub Narebski

Git User's Survey 2009: http://tinyurl.com/GitSurvey2009

* Re: Merging in Subversion 1.5 (was: Re: Using git to track my PhD  thesis, couple of questions)
  2009-08-28 15:12             ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Jakub Narebski
@ 2009-08-28 15:29               ` Avery Pennarun
  2009-08-28 15:44                 ` Matthias Andree
  2009-08-28 16:19                 ` Merging in Subversion 1.5 Jakub Narebski
  2009-08-30 19:41               ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Sam Vilain
  1 sibling, 2 replies; 23+ messages in thread
From: Avery Pennarun @ 2009-08-28 15:29 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Matthias Andree, git, Matthieu Moy

On Fri, Aug 28, 2009 at 3:12 PM, Jakub Narebski<jnareb@gmail.com> wrote:
> From what I understand (from what I have read, and browsed, and
> lurged, and noticed) is that Subversion 1.5+ does merge tracking, but
> in very different way that in Git:
>
>  * the svn:mergeinfo is client-side property; if I understand
>   correctly this would help you in repeated merges, but not anyone
>   other

I don't believe there is such a thing as a "client-side property" in
svn.  I see someone said this on stackoverflow
(http://stackoverflow.com/questions/1156698/are-svn-merges-idempotent)
but I'm pretty sure they were either mistaken or using a different
definition of "client-side."

I think they probably meant that it's the client's responsibility to
set the property correctly, not the server's, and if your client is
too old and you do a merge, it'll forget to set svn:mergeinfo, causing
confusion for everyone.  There's discussion in the svn book
(http://svnbook.red-bean.com/en/1.5/svn.branchmerge.advanced.html) but
nothing implies that it's a non-replicated property.  Indeed, I can
see no particular reason that anyone would want it to be, for the
reasons you specify.

>  * svn:mergeinfo contains _per-file_ merge info, so it is much, much
>   more "chatty" than Git multiple parents.  This might be more
>   powerfull approach, in the same sense that more advanced merge
>   strategies that 3-way merge were more powerfull -- but 3-way merge
>   is best because it is simple (and either it is simple that 3-way
>   merge is enough, or complicated so manual intervention is required).

svn people really love their cherry-picks and want to keep track of
which things get cherry picked from one branch to another.  This is
nice (at least for informational purposes) although they go through
some probably-unnecessary contortions *after* doing this, including
splitting a merge from "maint" into "master" into two sequential
merges, if you've previously cherry-picked a commit from master into
maint.  The above svn book link describes this in a bit more detail.

I don't think that behaviour would be much help in any situation I've
ever experienced, so I agree with your comment that 3-way merge is
generally better.

Tracking cherry picks in git would be really nice *sometimes*, but it
creates a tradeoff where you then have to slurp in huge amounts of
history that you might not want.  In svn, this tradeoff doesn't exist,
since anything you cherry pick must have already existed on the server
anyway, and can never go away.

>  * You have to explicitely enable using svn:mergeinfo in log and blame

Conversely, in git you can basically disable it using --first-parent,
which is sometimes handy.  (It's handiest if your team has a policy of
always using --no-ff when merging into trunk, which makes git act a
bit more like svn's merge tracking.  I realize this is a bit heretical
to suggest on the git list, but I appreciate that the option exists
despite its heresy :))
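
For illustration, here is roughly how the two options interact in a
scratch repository. The branch name `topic`, the file, and the commit
messages are assumptions made for the demo:

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "initial"
git branch -M master                     # fixed branch name for the demo

git checkout -q -b topic
echo "one" >> notes.txt
git commit -q -a -m "topic: step 1"
echo "two" >> notes.txt
git commit -q -a -m "topic: step 2"

# --no-ff forces a merge commit even though a fast-forward is possible.
git checkout -q master
git merge -q --no-ff -m "Merge topic" topic

# Walk only first parents: the trunk's own view of its history.
first_parent=$(git rev-list --count --first-parent master)
# Walk everything reachable, including the topic branch's commits.
all=$(git rev-list --count master)
echo "first-parent commits: $first_parent, all commits: $all"
git log --first-parent --format='%s' master
```

With `--first-parent`, the log on master lists the merge commit and the
initial commit but none of the individual `topic` commits.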

Have fun,

Avery

* Re: Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions)
  2009-08-28 15:29               ` Avery Pennarun
@ 2009-08-28 15:44                 ` Matthias Andree
  2009-08-28 16:19                 ` Merging in Subversion 1.5 Jakub Narebski
  1 sibling, 0 replies; 23+ messages in thread
From: Matthias Andree @ 2009-08-28 15:44 UTC (permalink / raw)
  To: Avery Pennarun, Jakub Narebski; +Cc: git, Matthieu Moy

Am 28.08.2009, 17:29 Uhr, schrieb Avery Pennarun <apenwarr@gmail.com>:

> I think they probably meant that it's the client's responsibility to
> set the property correctly, not the server's, and if your client is
> too old any you do a merge, it'll forget to set svn:mergeinfo, causing
> confusion for everyone.  There's discussion in the svn book
> (http://svnbook.red-bean.com/en/1.5/svn.branchmerge.advanced.html) but
> nothing implies that it's a non-replicated property.  Indeed, I can
> see no particular reason that anyone would want it to be, for the
> reasons you specify.

It is replicated, and the common remedy against older clients is to refuse  
commits from those clients that do not support mergeinfo. This is done by  
defining a repository hook on the server side that validates this. AFAIR  
such a hook example ships with SVN.

-- 
Matthias Andree

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 13:37       ` seanh
  2009-08-28 13:51         ` Matthieu Moy
@ 2009-08-28 15:50         ` Paolo Bonzini
  2009-08-28 16:12         ` demerphq
  2 siblings, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2009-08-28 15:50 UTC (permalink / raw)
  To: seanh; +Cc: git

On 08/28/2009 03:37 PM, seanh wrote:
> In response to Matthieu and Paolo, I'm not sure I understand the git
> internals involved in the discussion around merge --squash, I had a
> feeling this would produce a 'merge' that git in some sense would 'not
> know about', since it sounds complex and I don't understand it I don't
> think I want to go there.

Yes, the problem is that git does not track what happens when you do 
"git merge --squash", which makes it harder to do merges after some time 
(because of conflicts).

The solution I gave (and Matthieu explained how it works, even though
it's very technical) is a way to "explain" to git what you did.  If you try
it on a fake example with gitk, you should understand it better.

    mkdir test
    cd test

    # import
    git init
    echo a > test
    git add test
    git commit -m1

    # some changes happen in your local "fine grained" branch
    git checkout -b local
    echo b > test
    git commit -a -m2
    echo c >> test
    git commit -a -m3                                  ##<<<

    # the magic incantation brings those commits to master
    # (first two commands) and teaches git what happened (last two)
    git checkout master
    git merge --squash local; git commit -m'merge 1'   ##<<<
    git checkout local
    git merge master                                   ##<<<

    # more local changes
    sed -i s/b/d/ test
    git commit -a -m4
    echo z >> test
    git commit -a -m5                                  ##<<<

    # the magic incantation, again
    git checkout master
    git merge --squash local; git commit -m'merge 2'   ##<<<
    git checkout local
    git merge master                                   ##<<<

Use gitk at the points indicated with ##<<<

It is actually very similar to what you chose to do.  My commits to
master, in practice, are your tags.  You may want to see how gitk's
graphs look in both scenarios, and choose the one that you prefer.

Hope this helps!

Paolo

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 13:37       ` seanh
  2009-08-28 13:51         ` Matthieu Moy
  2009-08-28 15:50         ` Paolo Bonzini
@ 2009-08-28 16:12         ` demerphq
  2009-08-28 21:44           ` david
  2 siblings, 1 reply; 23+ messages in thread
From: demerphq @ 2009-08-28 16:12 UTC (permalink / raw)
  To: seanh; +Cc: git

2009/8/28 seanh <seanh.nospam@gmail.com>:
> On Fri, Aug 28, 2009 at 12:21:42AM +0200, demerphq wrote:
>> As you can generate the PDF's from the latex then just hack gitweb to
>> let them download it from there.
>
> Unfortunately gitweb is written in Perl. But I know what you mean, it
> should in theory be possible for them to click on a 'Get PDF' link for a
> particular revision that causes the PDF to be built and returned to
> their browser.

What is unfortunate about that? Perl is a duct tape/swiss-army-knife
of the internet.  Hacking gitweb to generate PDF's on the fly from
latex documents should be a fairly trivial hack, even if you aren't a
Perl hacker.

See:

http://search.cpan.org/~andrewf/LaTeX-Driver-0.08/lib/LaTeX/Driver.pm

for just one of many Perl modules to interface with LaTeX.

Good luck.

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

* Re: Merging in Subversion 1.5
  2009-08-28 15:29               ` Avery Pennarun
  2009-08-28 15:44                 ` Matthias Andree
@ 2009-08-28 16:19                 ` Jakub Narebski
  2009-08-28 16:28                   ` Matthias Andree
  2009-08-28 16:34                   ` Avery Pennarun
  1 sibling, 2 replies; 23+ messages in thread
From: Jakub Narebski @ 2009-08-28 16:19 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Matthias Andree, git, Matthieu Moy

On Fri, 28 Aug 2009, Avery Pennarun wrote:
> On Fri, Aug 28, 2009 at 3:12 PM, Jakub Narebski<jnareb@gmail.com> wrote:

> > What I understand (from what I have read, and browsed, and
> > lurked, and noticed) is that Subversion 1.5+ does merge tracking, but
> > in a very different way than in Git:
> >
> >  * the svn:mergeinfo is a client-side property; if I understand
> >   correctly this would help you in repeated merges, but not anyone
> >   else
> 
> I don't believe there is such a thing as a "client-side property" in
> svn.

What about the svn:ignore or svn:mime-type (IIRC) properties?

> I see someone said this on stackoverflow 
> (http://stackoverflow.com/questions/1156698/are-svn-merges-idempotent)
> but I'm pretty sure they were either mistaken or using a different
> definition of "client-side."

I think I got this (wrong?) impression from there.

> >  * svn:mergeinfo contains _per-file_ merge info, so it is much, much
> >   more "chatty" than Git multiple parents.  This might be a more
> >   powerful approach, in the same sense that more advanced merge
> >   strategies than 3-way merge were more powerful -- but 3-way merge
> >   is best because it is simple (either the case is simple enough that
> >   3-way merge suffices, or so complicated that manual intervention is
> >   required anyway).
> 
> svn people really love their cherry-picks and want to keep track of
> which things get cherry picked from one branch to another.  This is
> nice (at least for informational purposes) although they go through
> some probably-unnecessary contortions *after* doing this, including
> splitting a merge from "maint" into "master" into two sequential
> merges, if you've previously cherry-picked a commit from master into
> maint.  The above svn book link describes this in a bit more detail.
> 
> I don't think that behaviour would be much help in any situation I've
> ever experienced, so I agree with your comment that 3-way merge is
> generally better.

Errr... what I meant here is that I have read (on some blog, but either
I didn't bookmark it, or I can't find the bookmark) that svn:mergeinfo
is not as simple as listing the _revisions_ which are merged (i.e. either
all parents, or the additional parent), but lists per-file merge 
information, and can be quite large.
 
> >  * You have to explicitly enable using svn:mergeinfo in log and blame
> 
> Conversely, in git you can basically disable it using --first-parent,
> which is sometimes handy. [...]

In git-log.  But in git-blame?

-- 
Jakub Narebski
Poland

* Re: Merging in Subversion 1.5
  2009-08-28 16:19                 ` Merging in Subversion 1.5 Jakub Narebski
@ 2009-08-28 16:28                   ` Matthias Andree
  2009-08-28 16:34                   ` Avery Pennarun
  1 sibling, 0 replies; 23+ messages in thread
From: Matthias Andree @ 2009-08-28 16:28 UTC (permalink / raw)
  To: git@vger.kernel.org; +Cc: Jakub Narebski

[culling most of Cc: list]

Am 28.08.2009, 18:19 Uhr, schrieb Jakub Narebski <jnareb@gmail.com>:

> On Fri, 28 Aug 2009, Avery Pennarun wrote:
>> On Fri, Aug 28, 2009 at 3:12 PM, Jakub Narebski<jnareb@gmail.com> wrote:
>
>> > What I understand (from what I have read, and browsed, and
>> > lurked, and noticed) is that Subversion 1.5+ does merge tracking, but
>> > in a very different way than in Git:
>> >
>> >  * the svn:mergeinfo is a client-side property; if I understand
>> >   correctly this would help you in repeated merges, but not anyone
>> >   else
>>
>> I don't believe there is such a thing as a "client-side property" in
>> svn.
>
> What about the svn:ignore or svn:mime-type (IIRC) properties?

All of this is committed to the repository, so none of it is client-side
in the sense of "local to the client/checkout". Some properties (such as
svn:mergeinfo) require a bit of additional server-side support, but
that's about it.

Oh, and to complicate matters, let me mention revprops (such as  
svn:log).   SCNR :^)

-- 
Matthias Andree

* Re: Merging in Subversion 1.5
  2009-08-28 16:19                 ` Merging in Subversion 1.5 Jakub Narebski
  2009-08-28 16:28                   ` Matthias Andree
@ 2009-08-28 16:34                   ` Avery Pennarun
  1 sibling, 0 replies; 23+ messages in thread
From: Avery Pennarun @ 2009-08-28 16:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Matthias Andree, git, Matthieu Moy

On Fri, Aug 28, 2009 at 4:19 PM, Jakub Narebski<jnareb@gmail.com> wrote:
> On Fri, 28 Aug 2009, Avery Pennarun wrote:
>> On Fri, Aug 28, 2009 at 3:12 PM, Jakub Narebski<jnareb@gmail.com> wrote:
>> >  * You have to explicitly enable using svn:mergeinfo in log and blame
>>
>> Conversely, in git you can basically disable it using --first-parent,
>> which is sometimes handy. [...]
>
> In git-log.  But in git-blame?

I don't know about git-blame, as I rarely use it.  If it doesn't
support --first-parent, I imagine it would be easy to add, if it were
important to someone.
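
For reference, a minimal sketch of what --first-parent does to the log,
using a throwaway repository (the file name "f" and the branch name
"topic" are made up for the example).  It counts every commit reachable
from HEAD, then only those reached by following first parents, which
leaves out the commits that came in through the merge.

```shell
#!/bin/sh
# Sketch: compare the full history with the --first-parent view.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

echo a > f && git add f && git commit -q -m base

# two commits on a side branch
git checkout -q -b topic
echo b >> f && git commit -q -a -m "topic 1"
echo c >> f && git commit -q -a -m "topic 2"

# a real merge (two parents) back into the initial branch
git checkout -q -
git merge -q --no-ff -m "merge topic" topic

all=$(git rev-list --count HEAD)
mainline=$(git rev-list --count --first-parent HEAD)
echo "all commits: $all, first-parent only: $mainline"
git log --first-parent --oneline
```

Here the full history has four commits (base, the two topic commits and
the merge), while the first-parent view shows only the merge and base.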

Avery

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 13:51         ` Matthieu Moy
  2009-08-28 13:54           ` Matthias Andree
@ 2009-08-28 21:42           ` david
  1 sibling, 0 replies; 23+ messages in thread
From: david @ 2009-08-28 21:42 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: seanh, git

On Fri, 28 Aug 2009, Matthieu Moy wrote:

> seanh <seanh.nospam@gmail.com> writes:
>
>> In response to Matthieu and Paolo, I'm not sure I understand the git
>> internals involved in the discussion around merge --squash, I had a
>> feeling this would produce a 'merge' that git in some sense would 'not
>> know about',
>
> Yes, that's it. Git does a merge, and immediately forgets it was a
> merge. The consequence is when you merge again later, Git will not be
> able to use the merge information to be clever about merging. Somehow,
> Git will be as bad as SVN for merging if you don't know what you're
> doing ;-).

I thought that was what rere did?

David Lang

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 16:12         ` demerphq
@ 2009-08-28 21:44           ` david
  2009-08-28 22:16             ` demerphq
  0 siblings, 1 reply; 23+ messages in thread
From: david @ 2009-08-28 21:44 UTC (permalink / raw)
  To: demerphq; +Cc: seanh, git

On Fri, 28 Aug 2009, demerphq wrote:

> 2009/8/28 seanh <seanh.nospam@gmail.com>:
>> On Fri, Aug 28, 2009 at 12:21:42AM +0200, demerphq wrote:
>>> As you can generate the PDF's from the latex then just hack gitweb to
>>> let them download it from there.
>>
>> Unfortunately gitweb is written in Perl. But I know what you mean, it
>> should in theory be possible for them to click on a 'Get PDF' link for a
>> particular revision that causes the PDF to be built and returned to
>> their browser.
>
> What is unfortunate about that? Perl is a duct tape/swiss-army-knife
> of the internet.  Hacking gitweb to generate PDF's on the fly from
> latex documents should be a fairly trivial hack, even if you aren't a
> Perl hacker.

I have a situation where I need to generate PDFs from files that are under 
git. I have a git repository on my webserver that I push to and have a 
trigger that regenerates the PDFs any time there is a push.
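
The trigger itself is not shown here, so the following is only a sketch
of how such a setup could look as a post-receive hook, not the actual
script.  The names LATEX_CMD, MAIN_TEX and OUT_DIR, the output path, and
the choice to build only pushes to master are all assumptions made for
the example.

```shell
#!/bin/sh
# Hypothetical post-receive hook: build a PDF for each push to master.
# LATEX_CMD, MAIN_TEX and OUT_DIR are assumed names, not from the thread.
LATEX_CMD=${LATEX_CMD:-pdflatex -interaction=nonstopmode}
MAIN_TEX=${MAIN_TEX:-thesis.tex}
OUT_DIR=${OUT_DIR:-/var/www/thesis}

build_pdf() {
    # $1 = commit to build; the result is stored as $OUT_DIR/<commit>.pdf
    rev=$1
    work=$(mktemp -d) || return 1
    # export a clean snapshot of the commit, then run LaTeX on it
    if git archive "$rev" | tar -xf - -C "$work" &&
       ( cd "$work" && $LATEX_CMD "$MAIN_TEX" >/dev/null ) &&
       mkdir -p "$OUT_DIR" &&
       cp "$work/${MAIN_TEX%.tex}.pdf" "$OUT_DIR/$rev.pdf"
    then status=0
    else status=1
    fi
    rm -rf "$work"
    return "$status"
}

# git feeds the hook one "<old> <new> <ref>" line per updated ref on stdin;
# only read it when actually installed as hooks/post-receive
if [ "${0##*/}" = post-receive ]; then
    while read -r old new ref; do
        if [ "$ref" = refs/heads/master ]; then
            build_pdf "$new"
        fi
    done
fi
```

A single LaTeX pass is usually not enough for cross-references or a
bibliography, so a real hook would likely run the build more than once
or call a wrapper such as latexmk via LATEX_CMD.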

David Lang

> See:
>
> http://search.cpan.org/~andrewf/LaTeX-Driver-0.08/lib/LaTeX/Driver.pm
>
> for just one of many Perl modules to interface with LaTeX.
>
> Good luck.
>
> Yves
>
>

* Re: Using git to track my PhD thesis, couple of questions
  2009-08-28 21:44           ` david
@ 2009-08-28 22:16             ` demerphq
  0 siblings, 0 replies; 23+ messages in thread
From: demerphq @ 2009-08-28 22:16 UTC (permalink / raw)
  To: david; +Cc: seanh, git

2009/8/28  <david@lang.hm>:
> On Fri, 28 Aug 2009, demerphq wrote:
>
>> 2009/8/28 seanh <seanh.nospam@gmail.com>:
>>>
>>> On Fri, Aug 28, 2009 at 12:21:42AM +0200, demerphq wrote:
>>>>
>>>> As you can generate the PDF's from the latex then just hack gitweb to
>>>> let them download it from there.
>>>
>>> Unfortunately gitweb is written in Perl. But I know what you mean, it
>>> should in theory be possible for them to click on a 'Get PDF' link for a
>>> particular revision that causes the PDF to be built and returned to
>>> their browser.
>>
>> What is unfortunate about that? Perl is a duct tape/swiss-army-knife
>> of the internet.  Hacking gitweb to generate PDF's on the fly from
>> latex documents should be a fairly trivial hack, even if you aren't a
>> Perl hacker.
>
> I have a situation where I need to generate PDFs from files that are under
> git. I have a git repository on my webserver that I push to and have a
> trigger that regenerates the PDFs any time there is a push.

Actually this discussion makes me think that there is room for a hack
to gitweb to provide extensible and pluggable renderers of the files
in a repository. Such a framework would for instance provide for
syntax highlighting, PDF generation from latex files, etc.

Hypothetically it wouldn't be too hard to do. A Win32 (dare I say)
registry of file extensions/shebang lines would be linked to a set
of renderer plugins, which in turn would automatically add the
required links to render the file as needed. Quite doable actually.
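
To make the idea a bit more concrete, here is a rough sketch of the
lookup step only.  It is written in shell for illustration, it is not
gitweb code, and the function name renderer_for and the renderer names it
prints are placeholders invented for the example.

```shell
#!/bin/sh
# Hypothetical renderer lookup; the renderer names are placeholders.
renderer_for() {
    file=$1
    # first try the file extension
    case "$file" in
        *.tex)                echo latex-to-pdf;     return ;;
        *.pl|*.pm|*.py|*.sh)  echo syntax-highlight; return ;;
    esac
    # then fall back to the shebang line, if the file can be read
    if [ -f "$file" ] && [ -r "$file" ] && IFS= read -r first < "$file"; then
        case "$first" in
            '#!'*) echo syntax-highlight; return ;;
        esac
    fi
    # nothing matched: show the file as plain text
    echo plain
}

renderer_for thesis.tex
```

A real plugin mechanism would map each name to code that produces the
output and adds the link; the table lookup is the easy part.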

Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

* Re: Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions)
  2009-08-28 15:12             ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Jakub Narebski
  2009-08-28 15:29               ` Avery Pennarun
@ 2009-08-30 19:41               ` Sam Vilain
  2009-08-31  5:47                 ` Dmitry Potapov
  1 sibling, 1 reply; 23+ messages in thread
From: Sam Vilain @ 2009-08-30 19:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Matthias Andree, git, Matthieu Moy

On Fri, 2009-08-28 at 08:12 -0700, Jakub Narebski wrote:
>  * svn:mergeinfo contains _per-file_ merge info, so it is much, much
>    more "chatty" than Git multiple parents.

It can.  But more often, if you're merging complete paths, you will get
complete revision ranges.

See eg
https://trac.parrot.org/parrot/browser/trunk

Note how trac is also hiding the branches that were subsequently deleted
from the mergeinfo ticket.

>  * The command to merge trunk into branch is different from command to
>    merge branch into trunk.

This is a caveat of url-based branches.

> Also IIRC there is warning (well, at least there was in Subversion 1.5
> release notes) that merge tracking doesn't work entirely correctly in
> the face of criss-cross merges (multiple merge bases) and renaming
> (although I do hope that they fixed problem with silent corruption if
> there is rename during merge).

Not sure about that one.  I also heard - unconfirmed - that things start
to go awry if you start branching off branches and merging around the
place.  But if that happens it's likely a bug rather than a design flaw
(I think).

Sam

* Re: Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions)
  2009-08-30 19:41               ` Merging in Subversion 1.5 (was: Re: Using git to track my PhD thesis, couple of questions) Sam Vilain
@ 2009-08-31  5:47                 ` Dmitry Potapov
  0 siblings, 0 replies; 23+ messages in thread
From: Dmitry Potapov @ 2009-08-31  5:47 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Jakub Narebski, Matthias Andree, git, Matthieu Moy

On Mon, Aug 31, 2009 at 07:41:56AM +1200, Sam Vilain wrote:
> On Fri, 2009-08-28 at 08:12 -0700, Jakub Narebski wrote:
> 
> > Also IIRC there is warning (well, at least there was in Subversion 1.5
> > release notes) that merge tracking doesn't work entirely correctly in
> > the face of criss-cross merges (multiple merge bases) and renaming
> > (although I do hope that they fixed problem with silent corruption if
> > there is rename during merge).
> 
> Not sure about that one.  I also heard - unconfirmed - that things start
> to go awry if you start branching off branches and merging around the
> place.  But if that happens it's likely a bug rather than a design flaw
> (I think).

Some of the initial issues that existed in SVN 1.5.0 have been resolved,
but some others remain. Here is one bug report related to merge:
http://subversion.tigris.org/issues/show_bug.cgi?id=2897
It was reported two years ago, but the problem is still not fixed.
And there are a few others (some of them even older, and with even less
prospect of being fixed any time soon):
http://subversion.tigris.org/issues/show_bug.cgi?id=2837
http://subversion.tigris.org/issues/show_bug.cgi?id=2898
http://subversion.tigris.org/issues/show_bug.cgi?id=3056
http://subversion.tigris.org/issues/show_bug.cgi?id=3157

I don't think they would have existed for so long if they were easy to
fix.  Merge in Subversion is in essence automatic cherry-picking, and it
is not easy to implement that in a way that is reasonably fast and works
correctly in the general case.

Darcs is probably the best when it comes to cherry-picking, but clearly
it is no speed demon. In the case of Subversion the problem is worse,
because it has to make decisions on a per-file basis rather than operate
on each patch as a unit. So it is even more difficult to implement that
correctly and efficiently.

What you can do relatively simply is handle the case of a
one-directional merge, and that was the primary design goal of the
Subversion merge tracking feature.

Here is what Daniel Berlin wrote about it:
<<<
The initial merge tracking implementation was not meant to handle
repeated bidirectional merging, at least, as designed.

It was designed to allow cherry picks, and mainly for maintaining
feature branches that were mostly one way merges, with the very
occasional merge in the other direction and then branch death :).

For these cases, it works out fine.

For more complex cyclical merge patterns, you really can't use what
we've got. Trying to work around these cases, or build algorithms
that handle them, is just going to lead you into 20 years of edge
cases that made people come up with changeset dags in the first place.
>>>
Source: http://subversion.tigris.org/ds/viewMessage.do?dsForumId=462&dsMessageId=892215

So, I do not think that SVN merge will ever work correctly for those
edge cases.

But even if Subversion learns how to handle all those complex cases
correctly, it will still come with some surprises. One of the main
advantages of the simple 3-way merge is that it is easy to understand
and it does the right thing most of the time. Linus provided a really
good explanation of it here:
http://thread.gmane.org/gmane.comp.version-control.git/60457/focus=60644

Dmitry
