From: "Catalin Marinas" <catalin.marinas@gmail.com>
To: "Karl Hasselström" <kha@treskal.com>
Cc: git@vger.kernel.org
Subject: Re: [StGit PATCH 03/14] Write to a stack log when stack is modified
Date: Thu, 19 Jun 2008 10:24:28 +0100 [thread overview]
Message-ID: <b0943d9e0806190224v1b6434fesd3a54443422edaeb@mail.gmail.com> (raw)
In-Reply-To: <20080618173246.GA1155@diana.vm.bytemark.co.uk>
2008/6/18 Karl Hasselström <kha@treskal.com>:
> On 2008-06-18 17:16:10 +0100, Catalin Marinas wrote:
>> We could also easily use the commit message directly for metadata
>> instead of .git/patches/<branch>/.
>
> Yes. This is true for any log scheme, though. (But I agree -- longer
> term, it would be very nice to have _only_ the log.)
It would make collaboration using stgit stacks easier. I can start
looking at this once we agree on the log structure.
>> The time to create that tree and blobs worries me a bit, plus the
>> (in my view) complicated structure.
>
> You're right to worry. My log format makes things feel slightly slower
> when logging a 20-30 deep stack. If I can't make it faster, it's not
> viable. But I'm pretty sure I can -- it should be simple to reuse all
> the trees and blobs for the patches that weren't touched. (And
> operations that touch lots of patches, like rebase, have to write a
> lot of git objects anyway as part of their operation, so the relative
> slowdown should not be large.)
>
> Anyway, benchmarking is the way to go here. Talk will only get us that
> far.
In the past, I ran some tests because people complained that stgit was
slow compared to quilt:
http://article.gmane.org/gmane.comp.version-control.git/9670
After profiling (the stg-prof file), as expected, most of the time was
spent in external Git calls which I tried to keep to a minimum. With
your approach, you probably add at least 4-5 calls to Git (just a
guess, I haven't counted) and, with a big repository, it will be
visible (I have about 15 branches on my Linux kernel clone going back
to the 2.6.12 kernel and the .git size is over 1/2GB after git-gc).
>> Making the first log entry special gets difficult with log pruning
>> (unless you prune the whole log rather than entries older than a
>> chosen time or number) since you might have to re-create all the
>> chained log entries as the first log's sha1 will probably change.
>
> You have to re-create all the commits anyway, since they all are
> immutable, and all have a back pointer.
Ah, OK. So, at least initially, we should only support the full log pruning.
>> The applied patches are chained automatically via HEAD. For
>> unapplied patches, we could add the correponding commits as parents
>> of the logging commit (starting with the third parent as the first
>> two are used for log chaining and applied patches). Do we hit any OS
>> limit with the number of arguments?
>
> Not until long after we hit git limits to the number of parents of a
> commit. I believe the octopus merge refuses to create merges with more
> than about 25 parents, and we probably shouldn't do more than that
> either. We'll have to do a tree of octopuses.
For the first log only, we could chain the unapplied patches using
commits with 2 parents. We just need to warn people not to stare at
the <branch>.stgit directly :-)
>> Since we only need the unapplied commits tracking for gc and pull
>> reasons, we could only create that commit that references them when
>> we prune the stack log and link it from the top one (but it won't be
>> used by stgit).
>
> Yes, we need to create an "unapplied octopus" if and only if we have
> unapplied patches that we can't prove are reachable from the stack top
> or the branch head (we have to save both, in case the user has done
> something such as git-committing on topp of the stack and caused head
> != top). Which is for the first log entry, and in situations such as
> "stg pick --unapplied", but not for "stg pop" and the like.
Yes, but just chain unapplied commits rather than using octopus (you
could try to the -mm series with 500+ patches and see what happens).
> I do agree that we shouldn't try to use the octopuses to get hold of
> the commits, though -- just to keep them reachable. We save the sha1
> along with the patch name elsewhere in a more convenient form. (My
> proposed format does precisely this.)
Yes, but you still have to refer the tree objects corresponding to a
patch and extra work to generate the simplified log.
> So. If I got it right, your proposal is:
>
> * Tree: just take the HEAD tree.
Yes.
> * Commit message: list the applied and unapplied patches with their
> commit sha1s.
We don't even need to differentiate between applied and unapplied in
the commit message as long as they are listed in order since one of
the parents would represent the boundary between them. Since lib.stack
is pretty well structured, we could later modify PatchOrder to use the
log.
> * Parents: the previous log entry; branch head; something that
> (recusively) points to all unapplied commits, if necessary.
As you pointed below, "branch head" should probably be the "stack
top". We don't need to track the "branch head" if different, just need
to fix up the error and add the patches to the stack. And, anyway, if
one modifies the HEAD using git directly, the log will still point to
the top of the stack.
The third head would only be needed for the first log entry or when we
use pick --unapplied (in the latter, it only points to the unapplied
commit). The third head shouldn't be used at all in the log, just
created when needed.
> I'd add to that:
>
> * The stack top must be a parent too if head != top.
See above, branch head would actually be the stack top.
> * The commit message must include a version number, and the branch
> head sha1.
OK with the version number but the branch head (or stack top) is one
of the parents already.
> * I'm pretty sure we want the kind of "simplified" log I have in my
> proposal. The full log in your proposal is going to look every bit
> as ugly as the one in mine.
I agree it will look ugly but the simplified log adds an extra
overhead on any stgit action. If we don't use stg log -g, a text only
log command could show the diff. We can add it afterwards though if it
is fast enough.
--
Catalin
next prev parent reply other threads:[~2008-06-19 9:25 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-06-12 5:34 [StGit PATCH 00/14] Undo series Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 01/14] Fix typo Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 02/14] Library functions for tree and blob manipulation Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 03/14] Write to a stack log when stack is modified Karl Hasselström
2008-06-17 10:24 ` Catalin Marinas
2008-06-17 12:31 ` Karl Hasselström
2008-06-17 12:55 ` Karl Hasselström
2008-06-17 14:11 ` Catalin Marinas
2008-06-17 15:32 ` Karl Hasselström
2008-06-18 13:03 ` Catalin Marinas
2008-06-18 14:36 ` Karl Hasselström
2008-06-18 16:16 ` Catalin Marinas
2008-06-18 17:32 ` Karl Hasselström
2008-06-19 9:24 ` Catalin Marinas [this message]
2008-06-19 10:07 ` Karl Hasselström
2008-06-20 9:14 ` Catalin Marinas
2008-06-23 12:36 ` Karl Hasselström
2008-07-12 10:09 ` Catalin Marinas
2008-07-14 6:32 ` Karl Hasselström
2008-07-01 20:13 ` Karl Hasselström
2008-07-03 22:05 ` Catalin Marinas
2008-06-12 5:34 ` [StGit PATCH 04/14] Add utility function for reordering patches Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 05/14] New command: stg reset Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 06/14] Log conflicts separately Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 07/14] Log conflicts separately for all commands Karl Hasselström
2008-06-12 5:34 ` [StGit PATCH 08/14] Add a --hard flag to stg reset Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 09/14] Don't write a log entry if there were no changes Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 10/14] Move stack reset function to a shared location Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 11/14] New command: stg undo Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 12/14] New command: stg redo Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 13/14] Log and undo external modifications Karl Hasselström
2008-06-12 5:35 ` [StGit PATCH 14/14] Make "stg log" show stack log instead of patch log Karl Hasselström
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b0943d9e0806190224v1b6434fesd3a54443422edaeb@mail.gmail.com \
--to=catalin.marinas@gmail.com \
--cc=git@vger.kernel.org \
--cc=kha@treskal.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).