Git development
 help / color / mirror / Atom feed
* Re: [PATCH 1/6 (v4)] man page and technical discussion for rev-cache
From: Nick Edelen @ 2009-10-19 20:26 UTC (permalink / raw)
  To: Junio C Hamano, Nicolas Pitre, Johannes Schindelin, Sam Vilain,
	Michael J Gruber
In-Reply-To: <op.uys3qgmitdk399@sirnot.private>

Before any code is introduced the full documentation is put forth.  This 
provides a man page for the porcelain, and a technical doc in technical/.  The 
latter describes the API, and discusses rev-cache's design, file format and 
mechanics.

Signed-off-by: Nick Edelen <sirnot@gmail.com>

---
sorry about the long interval; I haven't got internet in my flat yet and uni
tends to take you over.  anyway, clean resend of patches (attempt 2).  this set
fixes a small bug in graft handling (ie. test t6001), and tweaks test for
compatability.

 Documentation/git-rev-cache.txt       |  190 ++++++++++
 Documentation/technical/rev-cache.txt |  634 +++++++++++++++++++++++++++++++++
 2 files changed, 824 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-rev-cache.txt b/Documentation/git-rev-cache.txt
new file mode 100644
index 0000000..5a713ad
--- /dev/null
+++ b/Documentation/git-rev-cache.txt
@@ -0,0 +1,190 @@
+git-rev-cache(1)
+================
+
+NAME
+----
+git-rev-cache - Add, walk and maintain revision cache slices
+
+SYNOPSIS
+--------
+'git-rev-cache' COMMAND [options] [<commit>...]
+
+DESCRIPTION
+-----------
+The revision cache ('rev-cache') provides a mechanism for significantly
+speeding up revision traversals.  It does this by creating an efficient
+database (cache) of commits, their related objects and topological relations.
+Independant of packs and the object store, this database is composed of
+rev-cache "slices" -- each a different file storing a given segment of commit
+history.  To map commits to their respective slices, a single index file is
+kept for the rev-cache.
+
+'git-rev-cache' provides a front-end for the rev-cache mechanism, intended for
+updating and maintaining rev-cache slices in the current repository.  New cache
+slice files can be 'add'ed, to keep the cache up-to-date; individual slices can
+be traversed; smaller slices can be 'fuse'd into a larger slice; and the
+rev-cache index can be regenerated.
+
+COMMANDS
+--------
+
+add
+~~~
+Add revisions to the cache by creating a new cache slice.  Reads a revision
+list from the command line, formatted as: `START START ... \--not END END ...`
+
+Options
+^^^^^^^
+
+\--all::
+	Include all refs in the new cache slice, like the \--all option in
+	'rev-list'.
+
+\--fresh/\--incremental::
+	Exclude everything already in the revision cache, analogous to
+	\--incremental in 'pack-objects'.
+
+\--stdin::
+	Read newline-seperated revisions from the standard input.  Use \--not
+	to exclude commits, as on the command line.
+
+\--legs::
+	Ensure newly-generated cache slice has no partial ends.  This means that
+	no commit has partially cached parents, in that all its parents are
+	cached or none of them are.  99.9% of users can ignore this command.
++
+\--legs will cause 'rev-cache' to expand potential slice end-points (creating
+"legs") until this condition is met, simplifying the cache slice structure.
+'rev-cache' itself does not care if a slice has legs or not, but the condition
+may reduce the required complexity of other applications that might use the
+revision cache.
+
+\--no-objects::
+	Non-commit objects are normally included along with the commit with
+	which they were introduced.  This is obviously very benificial, but can
+	take longer in cache slice generation.  Using this option will disable
+	non-commit object caching.
++
+\--no-objects is mainly intended for debugging or development purposes, but may
+find use in special situations (e.g. common traversal of only commits).
+
+Output
+^^^^^^
+
+On `stderr` 'add' outputs general information about the generated slice,
+including the number of objects and paths, and the start/end commits (prefix S
+indicates start, E an end).  Through `stdout` it emits only the SHA-1 of the
+slice.
+
+walk
+~~~~
+Analogous to a slice-oriented 'rev-list', 'walk' will traverse a region in a
+particular cache slice.  Interesting and uninteresting (delimited, as with
+'rev-list', with \--not) are specified on the command line, and output is the
+same as vanilla 'rev-list'.
+
+Options
+^^^^^^^
+
+\--objects::
+	Like 'rev-list', 'walk' will normally only list commits.  Use this
+	option to list non-commit objects as well, if they are present in the
+	cache slice.
+
+Output
+^^^^^^
+
+'walk' will simply dump the contents of the output commit list, work list, and
+pending object array.  The headers are outputed on `stderr`, the object hashes
+and names on `stdout`.
+
+fuse
+~~~~
+Merge several cache slices into a single large slice, like 'repack' for
+'rev-cache'.  On each invocation of 'add' a new file ("slice") is added to the
+revision cache directory, and after several additions the directory may become
+populated with many, relatively small slices.  Numerous smaller slices will
+yield poorer performance than a one or two large ones, because of the overhead
+of loading new slices into memory.
+
+Running 'fuse' every once in a while will solve this problem by coalescing all
+the cache slices into one larger slice.  For very large projects, using
+\--ignore-size is advisable to prevent overly large cache slices.  This can be
+set to run on garbage collection; see 'Automation' for more info.
+
+Note that 'fuse' uses the internal revision walker, so the options used in
+fusion override those of the cache slices upon which it operates.  For example,
+if some slices were generated with \--no-objects, yet 'fuse' was performed with
+non-commit objects, the resulting slice would still contain objects but would
+take longer to generate.
+
+Options
+^^^^^^^
+
+\--all::
+	Normally fuse will only include everything that's already in the
+	revision cache.  \--all tells it to start walking from the branch
+	heads, effectively a `add --all --fresh; fuse`
+	(pseudo-revcache-command).
+
+\--no-objects::
+	As in 'add', this option disables inclusion of non-commit objects.  If
+	some cache slices do contain such objects, the information will be lost.
+
+\--ignore-size[=N]::
+	Do not merge cache slices of size >=N (be aware that slices must be
+	mapped to memory).  N can have a suffix of "k" or "m", denoting N as
+	kilobytes and megabytes, respectively.  If N is not provided 'fuse'
+	will default to a size specified in `revcache.ignoresize`, or ~25MB if
+	the config var is not set.
+
+Output
+^^^^^^
+
+This command prints the SHA-1 of the new slice on `stdout`, and information
+about its work on `stderr` -- specifically which files it's removing.
+
+Automation
+^^^^^^^^^^
+
+Set the git configuration variable `gc.revcache` to run 'fuse' on garbage
+collection.  The arguments passed are `fuse \--all \--ignore-size`; i.e. 'gc'
+will keep everything cached into size-regulated slices.
+
+index
+~~~~~
+Regenerate the revision cache index.  If the rev-cache index file associating
+objects with cache slices gets corrupted, lost, or otherwise becomes unusable,
+'index' will quickly regenerate the file.  It's most likely that this won't be
+needed in every day use, as it is targeted towards debugging and development.
+
+alt
+~~~
+Create a cache slice pointer to another slice, identified by its full path:
+`fuse path/to/other/slice`
+
+This command is useful if you have several repositories sharing a common
+history.  Although space requirements for rev-cache are slim anyway, you can in
+this situation reduce it further by using slice pointers, pointing to relavant
+slices in other repositories.  Note that only one level of redirection is
+allowed, and the slice pointer will break if the original slice is removed.
+'fuse' will not touch slice pointers.
+
+NOTES
+-----
+In certain circumstances there may be some inconsistencies with object names
+between cached and non-cached walks.  Specifically, if two objects in a commit
+tree have the same content (= same SHA-1); or if objects of the same SHA-1 are
+introduced independantly in parallel branches.
+
+In the first case rev-cache will use the name of the youngest file, while
+vanilla rev-list will return the name of the entry first encountered in walking
+the tree.  The latter case is a result of rev-cache's internal topological
+ordering: the difference is the same between sorted and unsorted revision walks.
+
+See 'Discussion' for the underlying reasons for the discrepencies.
+
+DISCUSSION
+----------
+For an explanation of the API and its inner workings, see
+link:technical/rev-cache.txt[technical info on rev-cache].
diff --git a/Documentation/technical/rev-cache.txt b/Documentation/technical/rev-cache.txt
new file mode 100644
index 0000000..91fce8b
--- /dev/null
+++ b/Documentation/technical/rev-cache.txt
@@ -0,0 +1,634 @@
+rev-cache
+=========
+
+The revision cache API ('rev-cache') provides a method for efficiently storing
+and accessing commit branch sections.  Such branch slices are defined by a
+series of start/top (interesting) and end/bottom (uninteresting) commits.  Each
+slice contains information on commits in topological order.  Recorded with each
+commit is:
+
+* All intra-slice topological relations, encoded into path "channels" (see
+  'Mechanics' for full explanation).
+* Object meta-data: type, SHA-1, size, date (for commits).
+* Objects introduced by that commit, not present in the its cached parents.
+
+In addition to the API, basic structures are exported for the possibility of
+direct access.
+
+The API
+-------
+You can find the function prototypes in `revision.h`.
+
+Data Structures
+~~~~~~~~~~~~~~~
+The `rev_cache_info` struct holds all the options and flags for the API.
+
+----
+struct rev_cache_info {
+	/* generation flags */
+	unsigned objects : 1,
+		legs : 1,
+		make_index : 1,
+		fuse_me : 1;
+
+	/* index inclusion */
+	unsigned overwrite_all : 1;
+
+	/* traversal flags */
+	unsigned add_to_pending : 1;
+
+	/* fuse options */
+	unsigned int ignore_size;
+
+	/* reserved */
+	struct rev_cache_slice_map *maps,
+		*last_map;
+};
+----
+
+The fields:
+
+`objects`::
+	Add non-commit objects to slice.
+
+`legs`::
+	Ensure end/bottom commits have no children.
+
+`make_index`::
+	Integrate newly-made slice into index.
+
+`fuse_me`::
+	This is specified if a fuse is occuring, and slices are to be reused.
+	This option requires `maps` and `last_maps` to be initialized.
+
+`overwrite_all`::
+	When a cache slice is added to the index, sometimes overlap occures
+	between it and other slices.  Normally, original index entries are kept
+	unless the new entry represents a start commit (older entries are more
+	likely to lead to greater in-slice traversals).  This options overrides
+	that, and updates all entries of the new slice.
+
+`add_to_pending`::
+	Append unique non-commit objects to the `pending` object list in the
+	passed `rev_info` instance.
+
+`add_names`::
+	Include non-commit object names in the pending object entries if
+	`add_to_pending` is set.
+
+`ignore_size`::
+	If non-zero, ignore slices with size greater or equal to this during
+fusion.
+
+`maps`/`last_map`::
+	An array of slice mappings, indexed by their id in the slice index
+	header, to be re-used with `fuse_me`.  `last_map` points to the last
+	mapping used, and should be initialized to 0.
+
+Functions
+~~~~~~~~~
+
+init_rev_cache
+^^^^^^^^^^^^^^
+----
+void init_rev_cache_info(
+	struct rev_cache_info *rci OUT
+)
+----
+
+Initialize `rci` to default options.
+
+make_cache_slice
+^^^^^^^^^^^^^^^^
+----
+int make_cache_slice(
+	struct rev_cache_info *rci IN,
+	struct rev_info *revs IN,
+	struct commit_list **starts IN/OUT,
+	struct commit_list **ends IN/OUT,
+	unsigned char *cache_sha1 OUT
+)
+----
+
+Create a cache slice based on either `revs` (if non-NULL) *or* the `starts` and
+`ends` lists.  The actual list of start and end commits of the slice may be
+different from the parameters, based on what defines the branch segment, and
+this actual list is passed back through `starts` and `ends`.
+
+The cache slice is identified via a SHA-1 generated from the actual start/end
+commit lists.  `cache_sha1`, if non-NULL, can recieve the cache slice name.
+`rci` is used to specify generation options, but can be NULL if you want
+`make_cache_slice` to fall back on defaults.  Returns 0 on success, non-zero on
+failure.
+
+make_cache_index
+^^^^^^^^^^^^^^^^
+----
+int make_cache_index(
+	struct rev_cache_info *rci IN,
+	unsigned char *cache_sha1 IN,
+	int fd IN,
+	unsigned int size IN
+)
+----
+
+Add a slice to the rev-cache index.  `cache_sha1` is the identity hash of the
+cache slice; `fd` is a file descriptor of the cache slice opened with
+read/write privileges (the slice is not actually modified); `size` is the size
+of the cache slice.  Although there are currently no options for index
+updating, `rci` is a placeholder in case of future options.  Note that this
+function is normally called by `make_cache_slice`.  Returns 0 on success,
+non-zero on failure.
+
+open_cache_slice
+^^^^^^^^^^^^^^^^
+----
+int open_cache_slice(
+	unsigned char *sha1 IN,
+	int flags IN
+)
+----
+
+Returns a file descriptor to a cache slice described by `sha1` hash, using
+`flags` as the access mode.  This will follow cache slice pointers to one level
+of indirection.
+
+get_cache_slice
+^^^^^^^^^^^^^^^
+----
+unsigned char *get_cache_slice(
+	struct commit *commit IN
+)
+----
+
+Given a commit object `get_cache_slice` will search the revision cache index
+and return, if found, the cache slice SHA-1.
+
+traverse_cache_slice
+^^^^^^^^^^^^^^^^^^^^
+----
+int traverse_cache_slice(
+	struct rev_info *revs IN/OUT,
+	unsigned char *cache_sha1 IN,
+	struct commit *commit IN,
+	unsigned long *date_so_far IN/OUT,
+	int *slop_so_far IN/OUT,
+	struct commit_list ***queue OUT,
+	struct commit_list **work IN/OUT
+)
+----
+
+Traverse a specified cache slice.  An explanation of the each field:
+
+`revs`::
+	The revision walk instance.  `traverse_cache_slice` uses this for
+	general options (e.g. which objects are included) and slice traversal
+	options (in the `rev_cache_info` field).  If the `add_to_pending`
+	option is specified, non-commit objects are appended to the `pending`
+	object list field.
+
+`cache_sha1`::
+	SHA-1 identifying the cache slice to use.  This can be taken directly
+	from `get_cache_slice`.
+
+`commit`::
+	The current commit object in the revision walk, i.e. the commit which
+	inspired this slice traversal.  Although theoretically redundant in
+	view of the `work` list, this simplifies interaction with normal
+	revision walks, which pop commits from `work` before analyzing them.
+
+`date_so_far`::
+	The date of the oldest encountered interesting commit.  Passing NULL
+	will let `traverse_cache_slice` use defaults.
+
+`slop_so_far`::
+	The `slop` value, a la revision.c.  This is a counter used to determine
+	when to stop traversing, based on how many extra uninteresting commits
+	should be encountered.  NULL will enable defaults, as above.
+
+`queue`::
+	Refers to a pointer to the head of a FIFO commit list, recieving the
+	commits we've seen and added.
+
+`work`::
+	A date-ordered list of commits that have yet to be processed (i.e. seen
+	but not added).  Commits from here present in the slice are removed
+	(and, obviously, used as starting places for traversal), and any end
+	commits encountered are inserted.
+
+starts_from_slices
+^^^^^^^^^^^^^^^^^^
+----
+void starts_from_slices(
+	struct rev_info *revs OUT,
+	unsigned int flags IN,
+	unsigned char *which IN,
+	int n IN
+)
+----
+
+Will mark start-commits in certain rev-cache slices with `flag`, and added them
+to the pending list of `revs`.  If `n` is zero, `starts_from_slices` will use
+all slices.  Otherwise `which` will specify an *unseperated* list of cache
+SHA-1s to use (20 bytes each), and `n` will contain the number of slices (i.e.
+20 * `n` = size of `which`).
+
+fuse_cache_slices
+^^^^^^^^^^^^^^^^^
+----
+int fuse_cache_slices(
+	struct rev_cache_info *rci IN,
+	struct rev_info *revs IN
+)
+----
+
+Generate a slice based on `revs`, replacing all encountered slices with one
+(larger) slice.  The `ignore_size` field in `rci`, if non-zero, will dictate
+which cache slice sizes to ignore in both traversal and replacement.
+
+regenerate_cache_index
+^^^^^^^^^^^^^^^^^^^^^^
+----
+int regenerate_cache_index(
+	struct rev_cache_info *rci IN
+)
+----
+
+Remake the revision cache index, including all the slices.  Currently no
+options in `rci` exist for index (re)generation, but some may develop in the
+future.
+
+to/from_disked_rc_object/index_entry
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+----
+struct rc_object/index_entry *from_disked_rc_object/index_entry(
+	struct rc_object/index_entry_ondisk *src IN,
+	struct rc_object/index_entry *dst OUT
+)
+
+struct rc_object/index_entry_ondisk *to_disked_rc_object/index_entry(
+	struct rc_object/index_entry *src IN,
+	struct rc_object/index_entry_ondisk *dst OUT
+)
+----
+
+Functions to convert between the internal and storage (`_ondisk`) versions of
+object and index entry structures.  These are necessary for direct access to
+the cache slices.  If NULL is provided for `dst` a statically allocated
+structure is used, and a pointer to the struct is returned.  Otherwise the
+functions return `dst`.
+
+Example Usage
+-------------
+
+A few examples to demonstrate usage:
+
+.Creating a slice
+----
+/* pretend you're a porcelain for rev-cache reading from the command line */
+struct rev_info revs;
+struct rev_cache_info rci;
+
+init_revisions(&revs, 0);
+init_rci(&rci);
+
+flags = 0;
+for (i = 1; i < argc; i++) {
+        if (!strcmp(argv[i], "--not"))
+                flags ^= UNINTERESTING;
+        else if(!strcmp(argv[i], "--fresh"))
+                starts_from_slices(&revs, UNINTERESTING, 0, 0);
+        else
+                handle_revision_arg(argv[i], &revs, flags, 1);
+}
+
+/* we want to explicitly set certain options */
+rci.objects = 0;
+
+if (!make_cache_slice(&rci, &revs, 0, 0, cache_sha1))
+        printf("made slice!  it's called %s\n", sha1_to_hex(cache_sha1));
+----
+
+.Traversing a slice
+----
+/* let's say you're walking the tree with a 'work' list of current heads and a
+ * FILO output list 'out' */
+out = 0;
+outp = &out;
+
+while (work) {
+        struct commit *commit = pop_commit(&work);
+        struct object *object = &commit->object;
+        unsigned char *cache_sha1;
+
+        if (cache_sha1 = get_cache_slice(object->sha1)) {
+                /* note that this will instatiate any topo-relations
+                 * as it goes */
+                if (traverse_cache_slice(&revs, cache_sha1,
+                        commit, 0, 0, /* use defaults */
+                        &outp, &work) < 0)
+                        die("I'm overreacting to a non-fatal cache error");
+        } else {
+                struct commit_list *parents = commit->parents;
+
+                while (parents) {
+                        struct commit *p = parents->item;
+                        struct object *po = &p->object;
+
+                        parents = parents->next;
+                        if (po->flags & UNINTERESTING)
+                                continue;
+
+                        if (object->flags & UNINTERESTING)
+                                po->flags |= UNINTERESTING;
+                        else if (po->flags & SEEN)
+                                continue;
+
+                        if (!po->parsed)
+                                parse_commit(p);
+                        insert_by_date(p, &work);
+                }
+
+                if (object->flags & (SEEN | UNINTERESTING) == 0)
+                        outp = &commit_list_insert(commit, outp)->next;
+                object->flags |= SEEN;
+        }
+}
+----
+
+Some Internals
+--------------
+For more advanced usage, the slice and index file(s) may be accessed directly.
+Relavant structures are availabe in `rev-cache.h`.
+
+File Formats
+~~~~~~~~~~~~
+
+Cache Slices
+^^^^^^^^^^^^
+A slice has a basic fixed-size header, followed by a certain number of object
+entries, then a NULL-seperated list of object names.  Commits are sorted in
+topo-order, and each commit entry is followed by the objects added in that
+commit.
+
+----
+         -- +--------------------------------+
+header      | object number, etc...          |
+         -- +--------------------------------+
+commit      | commit info                    |
+entry       | path data                      |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | ...                            |
+         -- +--------------------------------+
+commit      | commit info                    |
+entry       | path data                      |
+            +--------------------------------+
+            | tree/blob info                 |
+            +--------------------------------+
+            | ...                            |
+         -- +--------------------------------+
+...         ...
+         -- +--------------------------------+
+name list   | \0some_file_name\0             |
+(note       +--------------------------------+
+preceeding  | another_file\0                 |
+null)       ...                              |
+            +--------------------------------+
+----
+
+Here is the header:
+
+----
+struct rc_cache_slice_header {
+	char signature[8]; /* REVCACHE */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	uint16_t path_nr;
+	uint32_t size;
+
+	unsigned char sha1[20];
+
+	uint32_t names_size;
+};
+----
+
+Explanations:
+
+`signature`::
+	The identifying signature of cache slice file.  Always "REVCACHE".
+`version`::
+	The version number, currently 1.
+`ofs_objects`::
+	The byte offset at which the commit/object listing starts.  Always
+	present at the 10th byte, regardless of file version.
+`object_nr`::
+	The total number of objects (commit + non-commit objects) present in
+	the slice.
+`path_nr`::
+	The total number of paths/channels used in encoding the topological
+	data.  Note that paths are reused (see 'Mechanics'), so there will
+	never be more than a few hundred paths (if that) used.
+`size`::
+	The size of the slice *excluding* the name list.  In other words, the
+	size of the portion mapped to memory.
+`sha1`::
+	The cache slice SHA-1.
+`names_size`::
+	The size of the name list.  `size` + `names_size` = size of slice
+
+Revision Cache Index
+^^^^^^^^^^^^^^^^^^^^
+The index is a single file that associates SHA-1s with cache slices and file
+positions.  It is somewhat similar to pack-file indexes, containing a fanout
+table and a list of index entries sorted by hash.
+
+----
+         -- +--------------------------------+
+header      | object #, cache #, etc.        |
+         -- +--------------------------------+
+sha1s of    | SHA-1                          |
+slices      | ...                            |
+         -- +--------------------------------+
+fanout      | fanout[0x00]                   |
+table       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+            | fanout[0xff]                   |
+         -- +--------------------------------+
+index       | SHA-1 of object                |
+entries     | index of cache slice SHA-1     |
+            | position in cache slice        |
+            +--------------------------------+
+            |                                |
+            ...
+            +--------------------------------+
+----
+
+The header:
+
+----
+struct rc_index_header {
+	char signature[8]; /* REVINDEX */
+	unsigned char version;
+	uint32_t ofs_objects;
+
+	uint32_t object_nr;
+	unsigned char cache_nr;
+
+	uint32_t max_date;
+};
+----
+
+Explanations:
+
+`signature`::
+	Always "REVINDEX".
+`version`::
+	Version number, currently 1.
+`ofs_objects`::
+	Offset at which the entry objects begin.  This is more obviously useful
+	in the index because the list of slice SHA-1s is variably-sized.
+`object_nr`::
+	Number of index entry objects present.
+`cache_nr`::
+	Number of cache slices to which the index maps, and hence the number of
+slice SHA-1s listed.
+`max_date`::
+	The oldest commit represented in the index.  This is used to help speed
+up lookup times by knowing what range of commits we definitely don't have
+cached.  Normal usage of 'rev-cache' would leave no "holes" in its coverage of
+commit history -- once a commit is cached, everything reachable from it should
+be cached as well.  Most of the time refs are added to rev-cache simultaneous
+as well.  This means that in most situations almost everything <= `max_date`
+will be cached.
+
+Mechanics
+~~~~~~~~~
+
+The most important part of rev-cache is its method of encoding topological
+relations.  To ensure fluid traversal and reconstruction, commits are related
+through high-level "streams"/"channels" rather than individual
+interconnections.  Intuitively, rev-cache stores history the way gitk shows it:
+commits strung up on lines, which interconnect at merges and branches.
+
+Each commit is associated to a given channel/path via a 'path id', and
+variable-length fields govern which paths (if any) are closed or opened at that
+object.  This means that topo-data can be preserved in only a few bytes extra
+per object entry.  Other information stored per entry is the sha-1 hash, type,
+date, size, name, and status in cache slice.  Here is format of an object
+entry, both on-disk and in-memory:
+
+----
+struct object_entry {
+        unsigned type : 3;
+        unsigned is_end : 1;
+        unsigned is_start : 1;
+        unsigned uninteresting : 1;
+        unsigned include : 1;
+        unsigned flags : 1;
+        unsigned char sha1[20];
+
+        unsigned char merge_nr;
+        unsigned char split_nr;
+        unsigned size_size : 3;
+        unsigned name_size : 3;
+
+        uint32_t date;
+        uint16_t path;
+
+        /* merge paths */
+        /* split paths */
+        /* size */
+        /* name index */
+};
+----
+
+An explanation of each field:
+
+`type`::
+	Object type
+`is_end`::
+	The commit has some parents outside the cache slice (all if slice has
+	legs)
+`is_start`::
+	The commit has no children in cache slice
+`uninteresting`::
+	Run-time flag, used in traversal
+`include`::
+	Run-time flag, used in traversal (initialization)
+`flags`::
+	Currently unused, extra bit
+`sha1`::
+	Object SHA-1 hash
+
+`merge_nr`::
+	The number of paths the current channel diverges into; the current path
+	ends upon any merge.
+`split_nr`::
+	The number of paths this commit ends; used on both merging and
+	branching.
+`size_size`::
+	Number of bytes the object size takes up.
+`name_size`::
+	Number of bytes the name index takes up.
+
+`date`::
+	The date of the commit.
+`path`::
+	The path ID of the channel with which this commit is associated.
+
+merge paths::
+	The path IDs (16-bit) that are to be created.  Overflow is not a
+	problem as path IDs are reused, leaving even complicated projects to
+	consume no more than a few hundred IDs.
+split paths::
+	The path IDs (16-bit) that are to be ended.
+size::
+	The size split into the minimum number of bytes.  That is, 1-8 bytes
+	representing the size, least-significant byte first.
+name index::
+	An offset for the null-seperated, object name list at the end of the
+	cache slice.  Also split into the minimum number of bytes.
+
+Each path ID refers to an index in a 'path array', which stores the current
+status (eg. active, interestingness) of each channel.
+
+Due to topo-relations and boundary tracking, all of a commit's parents must be
+encountered before the path is reallocated.  This is achieved by using a
+counter system per merge: starting at the parent number, the counter is
+decremented as each parent is encountered (dictated by 'split paths'); at 0 the
+path is cleared.
+
+Boundary tracking is necessary because non-commits are stored relative to the
+commit in which they were introduced.  If a series of commits is not included
+in the output, the last interesting commit must be parsed manually to ensure
+all objects are accounted for.
+
+To prevent list-objects from recursing into trees that we've already taken care
+of, the flag `FACE_VALUE` is introduced.  An object with this flag is not
+explored (= "taken at face value"), significantly reducing I/O and processing
+time.
+
+Notes
+~~~~~
+
+Due to rev-cache's internal storage format, walking may lead to some
+discrepencies between cached and uncached repositories.  Although noticeable to
+users directly calling rev-list, these are unused or corner cases and
+internally a non-issue.
+
+First note that rev-cache records commits in topological order.  Large portions
+of commit history will already be sorted topologically in the revision walk,
+yielding a different output for unsorted calls to rev-list.  A more obscure
+consquence occurs when two objects of the same SHA-1, but different name, are
+introduced seperately in parallel branches: different names might be shown for
+that object depending on which object entry was encountered first.
+
+A similar disparity arises when two objects of same SHA-1/different name are
+present in the same tree structure.  rev-cache, walking objects as they were
+introduced, lists the youngest file's name; rev-list, walking the full trees
+each commit, shows the first file encountered.
-- 
tg: (e7578ba..) t/revcache/docs (depends on: t/revcache/integration)

^ permalink raw reply related

* Re: [RFC PATCH] git-gui: Allow staging multiple lines at once
From: Geert Bosch @ 2009-10-19 20:08 UTC (permalink / raw)
  To: Jeff Epler; +Cc: git
In-Reply-To: <20091019195456.GA11121@unpythonic.net>


On Oct 19, 2009, at 15:54, Jeff Epler wrote:

> When applying less than a full hunk, it's still often desirable to  
> apply
> a number of consecutive lines.
>
> This change makes it possible to sweep out a range of lines in the  
> diff view
> with the left mouse button, then right click and "Stage Lines For  
> Commit".
>
> The selected lines may span multiple hunks.

Great! I've wished for this feature...

Thanks,
    -Geert

^ permalink raw reply

* Re: What's cooking in git.git (Oct 2009, #03; Mon, 19)
From: Erik Faye-Lund @ 2009-10-19 20:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git
In-Reply-To: <7vhbtv6c76.fsf@alter.siamese.dyndns.org>

On Mon, Oct 19, 2009 at 10:05 AM, Junio C Hamano <gitster@pobox.com> wrote:
> * ef/msys-imap (2009-10-13) 8 commits.
>  - MSVC: Enable OpenSSL, and translate -lcrypto
>  - mingw: enable OpenSSL
>  - mingw: wrap SSL_set_(w|r)fd to call _get_osfhandle
>  - imap-send: build imap-send on Windows
>  - imap-send: fix compilation-error on Windows
>  - imap-send: use run-command API for tunneling
>  - imap-send: use separate read and write fds
>  - imap-send: remove useless uid code
>
> Is this good to go yet?

As far as I know, yes. I sent out a new iteration a couple of hours
ago (after you sent this mail). This version addresses all issues that
has been raised AFAIK. But let's give people some more time to shoot
holes in it, OK?

-- 
Erik "kusma" Faye-Lund

^ permalink raw reply

* [RFC PATCH] git-gui: Allow staging multiple lines at once
From: Jeff Epler @ 2009-10-19 19:54 UTC (permalink / raw)
  To: git

When applying less than a full hunk, it's still often desirable to apply
a number of consecutive lines.

This change makes it possible to sweep out a range of lines in the diff view
with the left mouse button, then right click and "Stage Lines For Commit".

The selected lines may span multiple hunks.

Signed-off-by: Jeff Epler <jepler@unpythonic.net>
---
The diff looks bigger than it is because it changed the indentation level
of about 80 lines, and that made it necessary to reflow a lengthy comment
block as well.

This introduces new user interface strings.  I felt this was probably a
better decision than using the inaccurate "Stage Line For Commit" when
a block of text was swept out.

I wonder a bit about the message [mc "Apply/Reverse Line"] -- as far as
I can tell, it is never shown to a user, so why is it translated?

 git-gui/git-gui.sh   |   15 +++-
 git-gui/lib/diff.tcl |  222 +++++++++++++++++++++++++++----------------------
 2 files changed, 134 insertions(+), 103 deletions(-)

diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index 14b92ba..a80ed0d 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -3165,7 +3165,7 @@ set ui_diff_applyhunk [$ctxm index last]
 lappend diff_actions [list $ctxm entryconf $ui_diff_applyhunk -state]
 $ctxm add command \
 	-label [mc "Apply/Reverse Line"] \
-	-command {apply_line $cursorX $cursorY; do_rescan}
+	-command {apply_range_or_line $cursorX $cursorY; do_rescan}
 set ui_diff_applyline [$ctxm index last]
 lappend diff_actions [list $ctxm entryconf $ui_diff_applyline -state]
 $ctxm add separator
@@ -3205,12 +3205,21 @@ proc popup_diff_menu {ctxm ctxmmg x y X Y} {
 	if {[string first {U} $state] >= 0} {
 		tk_popup $ctxmmg $X $Y
 	} else {
+		set has_range [expr {[$::ui_diff tag nextrange sel 0.0] != {}}]
 		if {$::ui_index eq $::current_diff_side} {
 			set l [mc "Unstage Hunk From Commit"]
-			set t [mc "Unstage Line From Commit"]
+			if {$has_range} {
+				set t [mc "Unstage Lines From Commit"]
+			} else {
+				set t [mc "Unstage Line From Commit"]
+			}
 		} else {
 			set l [mc "Stage Hunk For Commit"]
-			set t [mc "Stage Line For Commit"]
+			if {$has_range} {
+				set t [mc "Stage Lines For Commit"]
+			} else {
+				set t [mc "Stage Line For Commit"]
+			}
 		}
 		if {$::is_3way_diff
 			|| $current_diff_path eq {}
diff --git a/git-gui/lib/diff.tcl b/git-gui/lib/diff.tcl
index 925b3f5..30ac659 100644
--- a/git-gui/lib/diff.tcl
+++ b/git-gui/lib/diff.tcl
@@ -505,10 +505,23 @@ proc apply_hunk {x y} {
 	}
 }
 
-proc apply_line {x y} {
+proc apply_range_or_line {x y} {
 	global current_diff_path current_diff_header current_diff_side
 	global ui_diff ui_index file_states
 
+	set selected [$ui_diff tag nextrange sel 0.0]
+
+	if {$selected == {}} {
+		set first [$ui_diff index "@$x,$y"]
+		set last $first
+	} else {
+		set first [lindex $selected 0]
+		set last [lindex $selected 1]
+	}
+
+	set first_l [$ui_diff index "$first linestart"]
+	set last_l [$ui_diff index "$last lineend"]
+
 	if {$current_diff_path eq {} || $current_diff_header eq {}} return
 	if {![lock_index apply_hunk]} return
 
@@ -531,119 +544,128 @@ proc apply_line {x y} {
 		}
 	}
 
-	set the_l [$ui_diff index @$x,$y]
+	set wholepatch {}
 
-	# operate only on change lines
-	set c1 [$ui_diff get "$the_l linestart"]
-	if {$c1 ne {+} && $c1 ne {-}} {
-		unlock_index
-		return
-	}
-	set sign $c1
-
-	set i_l [$ui_diff search -backwards -regexp ^@@ $the_l 0.0]
-	if {$i_l eq {}} {
-		unlock_index
-		return
-	}
-	# $i_l is now at the beginning of a line
+	while {$first_l < $last_l} {
+		set i_l [$ui_diff search -backwards -regexp ^@@ $first_l 0.0]
+		if {$i_l eq {}} {
+			# If there's not a @@ above, then the selected range
+			# must have come before the first_l @@
+			set i_l [$ui_diff search -regexp ^@@ $first_l $last_l]
+		}
+		if {$i_l eq {}} {
+			unlock_index
+			return
+		}
+		# $i_l is now at the beginning of a line
 
-	# pick start line number from hunk header
-	set hh [$ui_diff get $i_l "$i_l + 1 lines"]
-	set hh [lindex [split $hh ,] 0]
-	set hln [lindex [split $hh -] 1]
+		# pick start line number from hunk header
+		set hh [$ui_diff get $i_l "$i_l + 1 lines"]
+		set hh [lindex [split $hh ,] 0]
+		set hln [lindex [split $hh -] 1]
 
-	# There is a special situation to take care of. Consider this hunk:
-	#
-	#    @@ -10,4 +10,4 @@
-	#     context before
-	#    -old 1
-	#    -old 2
-	#    +new 1
-	#    +new 2
-	#     context after
-	#
-	# We used to keep the context lines in the order they appear in the
-	# hunk. But then it is not possible to correctly stage only
-	# "-old 1" and "+new 1" - it would result in this staged text:
-	#
-	#    context before
-	#    old 2
-	#    new 1
-	#    context after
-	#
-	# (By symmetry it is not possible to *un*stage "old 2" and "new 2".)
-	#
-	# We resolve the problem by introducing an asymmetry, namely, when
-	# a "+" line is *staged*, it is moved in front of the context lines
-	# that are generated from the "-" lines that are immediately before
-	# the "+" block. That is, we construct this patch:
-	#
-	#    @@ -10,4 +10,5 @@
-	#     context before
-	#    +new 1
-	#     old 1
-	#     old 2
-	#     context after
-	#
-	# But we do *not* treat "-" lines that are *un*staged in a special
-	# way.
-	#
-	# With this asymmetry it is possible to stage the change
-	# "old 1" -> "new 1" directly, and to stage the change
-	# "old 2" -> "new 2" by first staging the entire hunk and
-	# then unstaging the change "old 1" -> "new 1".
-
-	# This is non-empty if and only if we are _staging_ changes;
-	# then it accumulates the consecutive "-" lines (after converting
-	# them to context lines) in order to be moved after the "+" change
-	# line.
-	set pre_context {}
-
-	set n 0
-	set i_l [$ui_diff index "$i_l + 1 lines"]
-	set patch {}
-	while {[$ui_diff compare $i_l < "end - 1 chars"] &&
-	       [$ui_diff get $i_l "$i_l + 2 chars"] ne {@@}} {
-		set next_l [$ui_diff index "$i_l + 1 lines"]
-		set c1 [$ui_diff get $i_l]
-		if {[$ui_diff compare $i_l <= $the_l] &&
-		    [$ui_diff compare $the_l < $next_l]} {
-			# the line to stage/unstage
-			set ln [$ui_diff get $i_l $next_l]
-			if {$c1 eq {-}} {
-				set n [expr $n+1]
+		# There is a special situation to take care of. Consider this
+		# hunk:
+		#
+		#    @@ -10,4 +10,4 @@
+		#     context before
+		#    -old 1
+		#    -old 2
+		#    +new 1
+		#    +new 2
+		#     context after
+		#
+		# We used to keep the context lines in the order they appear in
+		# the hunk. But then it is not possible to correctly stage only
+		# "-old 1" and "+new 1" - it would result in this staged text:
+		#
+		#    context before
+		#    old 2
+		#    new 1
+		#    context after
+		#
+		# (By symmetry it is not possible to *un*stage "old 2" and "new
+		# 2".)
+		#
+		# We resolve the problem by introducing an asymmetry, namely,
+		# when a "+" line is *staged*, it is moved in front of the
+		# context lines that are generated from the "-" lines that are
+		# immediately before the "+" block. That is, we construct this
+		# patch:
+		#
+		#    @@ -10,4 +10,5 @@
+		#     context before
+		#    +new 1
+		#     old 1
+		#     old 2
+		#     context after
+		#
+		# But we do *not* treat "-" lines that are *un*staged in a
+		# special way.
+		#
+		# With this asymmetry it is possible to stage the change "old
+		# 1" -> "new 1" directly, and to stage the change "old 2" ->
+		# "new 2" by first staging the entire hunk and then unstaging
+		# the change "old 1" -> "new 1".
+
+		# This is non-empty if and only if we are _staging_ changes;
+		# then it accumulates the consecutive "-" lines (after
+		# converting them to context lines) in order to be moved after
+		# the "+" change line.
+		set pre_context {}
+
+		set n 0
+		set m 0
+		set i_l [$ui_diff index "$i_l + 1 lines"]
+		set patch {}
+		while {[$ui_diff compare $i_l < "end - 1 chars"] &&
+		       [$ui_diff get $i_l "$i_l + 2 chars"] ne {@@}} {
+			set next_l [$ui_diff index "$i_l + 1 lines"]
+			set c1 [$ui_diff get $i_l]
+			if {[$ui_diff compare $first_l <= $i_l] &&
+			    [$ui_diff compare $i_l < $last_l] &&
+			    ($c1 eq {-} || $c1 eq {+})} {
+				# the line to stage/unstage
+				set ln [$ui_diff get $i_l $next_l]
+				if {$c1 eq {-}} {
+					set n [expr $n+1]
+					set patch "$patch$pre_context$ln"
+				} else {
+					set m [expr $m+1]
+					set patch "$patch$ln$pre_context"
+				}
+				set pre_context {}
+			} elseif {$c1 ne {-} && $c1 ne {+}} {
+				# context line
+				set ln [$ui_diff get $i_l $next_l]
 				set patch "$patch$pre_context$ln"
+				set n [expr $n+1]
+				set m [expr $m+1]
+				set pre_context {}
+			} elseif {$c1 eq $to_context} {
+				# turn change line into context line
+				set ln [$ui_diff get "$i_l + 1 chars" $next_l]
+				if {$c1 eq {-}} {
+					set pre_context "$pre_context $ln"
+				} else {
+					set patch "$patch $ln"
+				}
+				set n [expr $n+1]
+				set m [expr $m+1]
 			} else {
-				set patch "$patch$ln$pre_context"
-			}
-			set pre_context {}
-		} elseif {$c1 ne {-} && $c1 ne {+}} {
-			# context line
-			set ln [$ui_diff get $i_l $next_l]
-			set patch "$patch$pre_context$ln"
-			set n [expr $n+1]
-			set pre_context {}
-		} elseif {$c1 eq $to_context} {
-			# turn change line into context line
-			set ln [$ui_diff get "$i_l + 1 chars" $next_l]
-			if {$c1 eq {-}} {
-				set pre_context "$pre_context $ln"
-			} else {
-				set patch "$patch $ln"
 			}
-			set n [expr $n+1]
+			set i_l $next_l
 		}
-		set i_l $next_l
+		set wholepatch "$wholepatch@@ -$hln,$n +$hln,$m @@\n$patch"
+		set first_l [$ui_diff index "$next_l + 1 lines"]
 	}
-	set patch "@@ -$hln,$n +$hln,[eval expr $n $sign 1] @@\n$patch"
 
 	if {[catch {
 		set enc [get_path_encoding $current_diff_path]
 		set p [eval git_write $apply_cmd]
 		fconfigure $p -translation binary -encoding $enc
 		puts -nonewline $p $current_diff_header
-		puts -nonewline $p $patch
+		puts -nonewline $p $wholepatch
 		close $p} err]} {
 		error_popup [append $failed_msg "\n\n$err"]
 	}
-- 
1.6.5.rc1.49.ge970

^ permalink raw reply related

* Re: How to revert one of multiple merges
From: Michael J Gruber @ 2009-10-19 19:36 UTC (permalink / raw)
  To: Bill Lear; +Cc: git
In-Reply-To: <19164.44099.875117.96168@lisa.zopyra.com>

Bill Lear venit, vidit, dixit 19.10.2009 20:13:
> On Monday, October 19, 2009 at 17:19:35 (+0200) Michael J Gruber writes:
>> Bill Lear venit, vidit, dixit 18.10.2009 04:31:
>>> Branch A, B, C each have 20 commits, 0-19.
>>>
>>> Branch v1.0.0 created, then merge of A, B, C performed.
>>>
>>> After testing, we realize that the branch B is not ready for
>>> production release and we'd like to remove it from branch
>>> v1.0.0.
>>>
>>> If I do
>>>
>>> % git merge A B C
>>>
>>> I get a single commit:
>>>
>>> % git log -p
>>>
>>> commit 1644a0b98c01869aa83e59aa41374c22098c47b6
>>> [...]
>>> Date:   Fri Oct 16 09:52:32 2009 -0500
>>>
>>>     Merge branches 'A', 'B' and 'C' into v1.0.0
>>>
>>> [20 x 3 commits]
>>>
>>> If I do
>>>
>>> % git merge A
>>> % git merge B
>>> % git merge C
>>>
>>> Then:
>>>
>>> % git log -p
>>>
>>> commit 8946edd381384d0882221c87b5b3b7bf47127d70
>>> [...]
>>> Date:   Sat Oct 17 21:28:36 2009 -0500
>>>
>>>     Merge branch 'B' into v1.0.0
>>>
>>> commit 076ed422443e3684e564f7cae2b92e4538088ae6
>>> [...]
>>> Date:   Sat Oct 17 21:28:35 2009 -0500
>>>
>>>     Merge branch 'A' into v1.0.0
>>>
>>> but no "Merge branch 'C' into v1.0.0".
>>
>> Do you get any commits after the merge of B? If yes, then v1.0.0 got
>> fast-forwarded (you can avoid that using --no-ff). If no, C was
>> contained in v1.0.0 already.
> 
> BTW this is all with git 1.6.5.
> 
> My test script that set all of this up makes no commit to any branch
> after the merge of any branch is done.  C was not in v1.0.0 already.
> Here is the script I used to set this up:
> 
> % cat scripto
> rm -rf branch_test
> mkdir branch_test
> cd branch_test
> git init
> 
> echo foo > foo
> git add foo
> git commit -a -m "foo"
> 
> git checkout -b A
> for ((i=0; i < 20; ++i)); do
>     echo "bar $i" > bar
>     git add bar
>     git commit -a -m "bar $i"
> done
> 
> git checkout master
> git checkout -b B
> for ((i=0; i < 20; ++i)); do
>     echo "baz $i" > baz
>     git add baz
>     git commit -a -m "baz $i"
> done
> 
> git checkout master
> git checkout -b C
> for ((i=0; i < 20; ++i)); do
>     echo "buz $i" > buz
>     git add buz
>     git commit -a -m "buz $i"
> done
> 
> git checkout master
> git checkout -b v1.0.0
> 
> After that, I did the merges this way:
> 
> % git merge A
> % git merge B
> % git merge C
> 
> and then the git log shows no merge of C, as above.  Hmm, actually, when I
> just ran this, I get no output showing branch A was merged.  I just
> did this again and here is the merge output:
> 
> % git branch -a
>   A
>   B
>   C
>   master
> * v1.0.0
> % git merge A
> Updating af6c884..c7e5f2c
> Fast forward

A was based off master, and v1.0.0 equals that base commit, so it's a f-f.

>  bar |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 bar
> % git merge B
> Merge made by recursive.
>  baz |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 baz
> % git merge C
> Merge made by recursive.
>  buz |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>  create mode 100644 buz
> 
> Then, git log -p shows no branch A merge:
> 
> % git log -p | grep -i merge
> Merge: f462b95 2c2b064
>     Merge branch 'C' into v1.0.0
> Merge: c7e5f2c adde6ff
>     Merge branch 'B' into v1.0.0
> 
>> In both cases, it's not clear how C could have been "ready" when B was not.
> 
> A, B, and C, are entirely independent of one another.  I'm trying to
> replicate an instance in which a feature is developed and submitted for
> inclusion in a release, accepted for inclusion, but then later found
> to be defective.
> 
>>> And so, I'm faced with git rebase -i posing some unanswerable questions
>>> to our release manager.  She cannot easily remove B from the merge after
>>> doint either merge A B C, or merge A, merge B, merge C.
>>
>> The way you described the situation there are no commits after the
>> merges. So, why not reset to before the merge and do a "git merge A C"?
> 
> Presumably, I would need to tag the v1.0.0 branch after creating it,
> which I was hoping not to have to do.  I wanted the equivalent of
> "git unmerge B" after doing three separate merges as above, or an octopus
> merge.  I'm just trying to make life simpler for our release manager,
> who is not equipped with git fu.

I think the octopus case is difficult. So let's treat the other one ;)

If you have separate merges you can revert them separately. You don't
need to tag them, you only need to be able to find the merge commit, say
using

git log --grep='Merge branch'

Then you can revert that merge using

git revert -m 1 sha1ofthatmerge

Now comes the difficult part: If, later on, you want to merge B *and
include also the parts of B from before that previous merge* you need to
revert that revert, then merge. If, on the other hand, you only want to
merge the "new parts" of B then simply merge B.

The reason is that reverting a merge undoes its changes, but still
leaves the merge in the DAG, so that the commits of the merged branch
are still considered part of the history, and as such won't be merged in
again.

howto/revert-a-faulty-merge.txt explains this, but the above is a short
summary.

Cheers,
Michael

^ permalink raw reply

* Re: git fsck not identifying corrupted packs
From: Wesley J. Landaker @ 2009-10-19 19:27 UTC (permalink / raw)
  To: git
In-Reply-To: <7v7hur1a0h.fsf@alter.siamese.dyndns.org>

(Not CCing everyone, since this is mostly curiosa in the "using git as it 
was never intended" section):

On Monday 19 October 2009 13:03:42 Junio C Hamano wrote:
> Once a packfile is created and we always use it read-only, there didn't
> seem to be much point in suspecting that the underlying filesystems or
> disks may corrupt them in such a way that is not caught by the SHA-1
> checksum over the entire packfile and per object checksum.  That trust in
> the filesystems might have been a good tradeoff between fsck performance
> and reliability on platforms git was initially developed on and for, but
> it might not be true anymore as we run on more platforms these days.

Filesystems are mostly reliable, but only until your crazy users do strange 
and terrible things. I have a real, non-toy environment where I use this 
stack as a [horrible] workaround for some issues beyond my control:

git -> ext4 -> lvm -> dmcrypt -> loop -> sshfs -> cygwin sshd -> SMB share

Amazingly, this works pretty reliably with many gigabytes of data in a git 
repository, even with the occasional crash because of flakiness with the 
"sshfs -> cygwin sshd" piece of the puzzle. But a good "git fsck" sure 
doesn't hurt in this environment! =)

^ permalink raw reply

* Re: git-pack-objects gitattributes?
From: Michael J Gruber @ 2009-10-19 19:15 UTC (permalink / raw)
  To: Nasser Grainawi; +Cc: Git Mailing List
In-Reply-To: <4ADCB457.8050601@codeaurora.org>

Nasser Grainawi venit, vidit, dixit 19.10.2009 20:47:
> Nasser Grainawi wrote:
>> Hello,
>>
>> I'm trying to avoid doing delta compression on a number of large binary 
>> files. I got a suggestion to use $GIT_DIR/info/attributes and a line 
>> like this:
>> *.bin -delta
>>
>> This doesn't seem to show a big improvement (and honestly I can't find 
>> where in the git-pack-objects source the value of this attribute is used).
>>
>> Could someone shed some light on this attribute and any other 
>> improvements I could make for efficiently serving up a repo over 
>> git-daemon with near-weekly revisions of 100MB+ files?
>>
>> Thanks,
>> Nasser
> 
> ping? any help? anyone?

Well, describing a reproducable test case would help... as well as
telling us your git version.

builtin-pack-objects.c certainly refers to the delta attribute, see
no_try_delta() and its callers.

Have you checked your attrs with git-check-attr? How do you measure the
improvements you expect?

Michael

^ permalink raw reply

* Re: git fsck not identifying corrupted packs
From: Wesley J. Landaker @ 2009-10-19 19:07 UTC (permalink / raw)
  To: git; +Cc: Sergio Callegari, Johannes Sixt, Junio C Hamano
In-Reply-To: <4ADC45C7.6090907@gmail.com>

On Monday 19 October 2009 04:56:07 Sergio Callegari wrote:
> Johannes Sixt wrote:
> > Sergio Callegari schrieb:
> >> Is there a means to have fsck to a truly full check on the sanity of a
> >> repo?
> >
> > git fsck --full
> >
> > RTFM, please.
>
> Right... sorry for the noise, I mismatched --strict for --full in a
> script.
>
> BTW, the short help for fsck at --full only says "consider objects in
> alternate repositories".

Until I read this thread, I didn't realize you needed --full to check
objects in packs.

Since just every git repository I ever use has 99%+ of it's objects in
packs, this means every time I've run "git fsck" it's essentially been
a no-op and I didn't know it. I imagine this is a common confusion.

Also, having --full mean both "check alternate object pools", and "check
objects in packs" seems to be rolling up two orthogonal issues.

But anyway, here is a patch that at least fixes the short option help to
match the manual and the current behavior:

--- 8< ---
From 8fc3cd68d496bf00faad4f0a7b6ae4fee9437e68 Mon Sep 17 00:00:00 2001
From: Wesley J. Landaker <wjl@icecavern.net>
Date: Mon, 19 Oct 2009 12:48:07 -0600
Subject: [PATCH] Update git fsck --full short description to mention packs

The '--full' option to git fsck does two things:

  1) Check objects in packs
  2) Check alternate objects

This is documented in the git fsck manual; this patch reflects that in
the short git fsck option help message as well.

Signed-off-by: Wesley J. Landaker <wjl@icecavern.net>
---
 builtin-fsck.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin-fsck.c b/builtin-fsck.c
index c58b0e3..63212ea 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -576,7 +576,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOLEAN(0, "root", &show_root, "report root nodes"),
 	OPT_BOOLEAN(0, "cache", &keep_cache_objects, "make index objects head nodes"),
 	OPT_BOOLEAN(0, "reflogs", &include_reflogs, "make reflogs head nodes (default)"),
-	OPT_BOOLEAN(0, "full", &check_full, "also consider alternate objects"),
+	OPT_BOOLEAN(0, "full", &check_full, "also consider packs and alternate objects"),
 	OPT_BOOLEAN(0, "strict", &check_strict, "enable more strict checking"),
 	OPT_BOOLEAN(0, "lost-found", &write_lost_and_found,
 				"write dangling objects in .git/lost-found"),
-- 
1.6.5

^ permalink raw reply related

* Re: git fsck not identifying corrupted packs
From: Junio C Hamano @ 2009-10-19 19:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Sixt, Sergio Callegari, git
In-Reply-To: <alpine.DEB.1.00.0910191202020.4985@pacific.mpi-cbg.de>

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Mon, 19 Oct 2009, Johannes Sixt wrote:
>
>> Sergio Callegari schrieb:
>> > Is there a means to have fsck to a truly full check on the sanity of a 
>> > repo?
>> 
>> git fsck --full
>> 
>> RTFM, please.
>
> Now, now.
>
> If you were to test a new filesystem, say, wonderfulfs, and wanted to 
> check its integrity, would you not just run "fsck-wonderfulfs" if that 
> exists, rather than reading the fantamagastic manual?  Would you not 
> expect that it Does The Right Thing?  Would you not expect that it 
> follows the Law Of Minimal Surprise?
>
> So FWIW I can see where Sergio is coming from.

Linus and other git developers from the early days trained their fingers
to type the command, every once in a while even without thinking, to check
the consistency of the repository back when the lower core part of the git
was still being developed.  Developers who wanted to make sure that git
correctly dealt with packfiles could deliberately trigger their creation
and checked them after they were created carefully, but loose objects are
the ones that are written by various commands from random codepaths.  It
made some technical sense to have a mode that checked only loose objects
from the debugging point of view for that reason.

    Side note.  I think the help description of --full option is wrong (or
    at least stale).  We always look at alternate object store these days
    since e15ef66 (fsck: check loose objects from alternate object stores
    by default, 2009-01-30).  It probably should read "check packed
    objects fully" or something.

The above paragraph is merely a historical background, and in this case
the "history" refers to early-to-mid 2005.  Even for git developers there
no longer is any reason to type "git fsck" in fear of some newly created
objects might be corrupt due to recent change to git these days.

The reason we did not make "--full" the default is probably we trust our
filesystems a bit too much.  At least, we trusted filesystems more than we
trusted the lower core part of git that was under development ;-)

Once a packfile is created and we always use it read-only, there didn't
seem to be much point in suspecting that the underlying filesystems or
disks may corrupt them in such a way that is not caught by the SHA-1
checksum over the entire packfile and per object checksum.  That trust in
the filesystems might have been a good tradeoff between fsck performance
and reliability on platforms git was initially developed on and for, but
it might not be true anymore as we run on more platforms these days.

It probably makes sense to ship 1.7.0 with a version of "fsck" in which
"--full" is the default; it would still accept "--full" but it would be a
no-op.  This would be a backward incompatible change, but the difference
is primarily about performance ("it takes a lot longer than before!"), and
not correctness, so we probably can live with it.  As I already said,
there is not much reason to run "fsck" every five minutes anymore to begin
with (unless your filesystem is so unreliable that it might eat one file
every five minutes, that is).

It probably is also a good idea to add a "--loose" option that does what
"fsck" currently does without "--full".  It is a good name because (1) to
people who do not know the internal of git, it means "check only loosely",
which would discourage them from running "fack" with that option to begin
with, and (2) to others, it exactly tells what the option makes the
command check.

^ permalink raw reply

* Re: What's cooking in git.git (Oct 2009, #03; Mon, 19)
From: Junio C Hamano @ 2009-10-19 18:58 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Johannes.Schindelin
In-Reply-To: <200910191125.19997.johan@herland.net>

Johan Herland <johan@herland.net> writes:

> On Monday 19 October 2009, Junio C Hamano wrote:
>> * jh/notes (2009-10-09) 22 commits.
>>  - fast-import: Proper notes tree manipulation using the notes API
>>  ...
>>  - Introduce commit notes
>>  (this branch uses sr/gfi-options.)
>
> Nope. This branch does not use sr/gfi-options. The jh/cvs-helper branch
> does, though.

We should rebase jh/notes on top of v1.6.5 then.

What you are seeing is a mechanically generated mark that notes that the
branch is not forked directly on master but are forked from the commit on
the other branch.  The parent commit of "Introduce commit notes" is
91d578a (fast-import: test the new option command, 2009-09-06).

>> Is this good for 'next' now?
>
> Not all of it.
>
> I suspect the first 14 patches are stable and 'next'-worthy, although
> it would be nice if Dscho had the time to ACK them.
>
> The last patch (#22) needs some major rework based on Shawn's comments
> (I'm working on that, although I currently have less time than I hoped
> for), and that rework might ripple into patches #15 through #21.

Thanks for the status update.

^ permalink raw reply

* Re: git-pack-objects gitattributes?
From: Nasser Grainawi @ 2009-10-19 18:47 UTC (permalink / raw)
  Cc: Git Mailing List
In-Reply-To: <4AD3B4F8.6030007@codeaurora.org>

Nasser Grainawi wrote:
> Hello,
> 
> I'm trying to avoid doing delta compression on a number of large binary 
> files. I got a suggestion to use $GIT_DIR/info/attributes and a line 
> like this:
> *.bin -delta
> 
> This doesn't seem to show a big improvement (and honestly I can't find 
> where in the git-pack-objects source the value of this attribute is used).
> 
> Could someone shed some light on this attribute and any other 
> improvements I could make for efficiently serving up a repo over 
> git-daemon with near-weekly revisions of 100MB+ files?
> 
> Thanks,
> Nasser

ping? any help? anyone?

^ permalink raw reply

* Re: git fsck not identifying corrupted packs
From: Gabor Gombas @ 2009-10-19 18:36 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Sergio Callegari, git
In-Reply-To: <4ADC2D45.3020803@viscovery.net>

On Mon, Oct 19, 2009 at 11:11:33AM +0200, Johannes Sixt wrote:

> > Is there a means to have fsck to a truly full check on the sanity of a repo?
> 
> git fsck --full
> 
> RTFM, please.

That still does not catch everything. About a week ago I wanted to check
out a branch in a local repo and I got an error that it was corrupt. But
"git fsck --full" only complained about some dangling objects, it did
not notice the corruption. I used git version 1.6.4.3 (Debian unstable
at that time). It would be nice to have a
"git fsck --i-really-want-to-check-everything".

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

^ permalink raw reply

* Re: [msysGit] [PATCH v4 7/8] mingw: enable OpenSSL
From: Johannes Sixt @ 2009-10-19 18:20 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: msysgit, git, Erik Faye-Lund
In-Reply-To: <1255966929-1280-8-git-send-email-kusmabite@gmail.com>

On Montag, 19. Oktober 2009, Erik Faye-Lund wrote:
> Since we have OpenSSL in msysgit now, enable it to support SSL
> encryption for imap-send.
>
> Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>

Acked-by: Johannes Sixt <j6t@kdbg.org>

-- Hannes

^ permalink raw reply

* Re: [msysGit] [PATCH v4 6/8] mingw: wrap SSL_set_(w|r)fd to call _get_osfhandle
From: Johannes Sixt @ 2009-10-19 18:20 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: msysgit, git, Erik Faye-Lund
In-Reply-To: <1255966929-1280-7-git-send-email-kusmabite@gmail.com>

On Montag, 19. Oktober 2009, Erik Faye-Lund wrote:
> SSL_set_fd (and friends) expects a OS file handle on Windows, not
> a file descriptor as on UNIX(-ish).
>
> This patch makes the Windows version of SSL_set_fd behave like the
> UNIX versions, by calling _get_osfhandle on it's input.
>
> Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>

Acked-by: Johannes Sixt <j6t@kdbg.org>

-- Hannes

^ permalink raw reply

* Re: How to revert one of multiple merges
From: Bill Lear @ 2009-10-19 18:13 UTC (permalink / raw)
  To: Michael J Gruber; +Cc: git
In-Reply-To: <4ADC8387.9010808@drmicha.warpmail.net>

On Monday, October 19, 2009 at 17:19:35 (+0200) Michael J Gruber writes:
>Bill Lear venit, vidit, dixit 18.10.2009 04:31:
>> Branch A, B, C each have 20 commits, 0-19.
>> 
>> Branch v1.0.0 created, then merge of A, B, C performed.
>> 
>> After testing, we realize that the branch B is not ready for
>> production release and we'd like to remove it from branch
>> v1.0.0.
>> 
>> If I do
>> 
>> % git merge A B C
>> 
>> I get a single commit:
>> 
>> % git log -p
>> 
>> commit 1644a0b98c01869aa83e59aa41374c22098c47b6
>> [...]
>> Date:   Fri Oct 16 09:52:32 2009 -0500
>> 
>>     Merge branches 'A', 'B' and 'C' into v1.0.0
>> 
>> [20 x 3 commits]
>> 
>> If I do
>> 
>> % git merge A
>> % git merge B
>> % git merge C
>> 
>> Then:
>> 
>> % git log -p
>> 
>> commit 8946edd381384d0882221c87b5b3b7bf47127d70
>> [...]
>> Date:   Sat Oct 17 21:28:36 2009 -0500
>> 
>>     Merge branch 'B' into v1.0.0
>> 
>> commit 076ed422443e3684e564f7cae2b92e4538088ae6
>> [...]
>> Date:   Sat Oct 17 21:28:35 2009 -0500
>> 
>>     Merge branch 'A' into v1.0.0
>> 
>> but no "Merge branch 'C' into v1.0.0".
>
>Do you get any commits after the merge of B? If yes, then v1.0.0 got
>fast-forwarded (you can avoid that using --no-ff). If no, C was
>contained in v1.0.0 already.

BTW this is all with git 1.6.5.

My test script that set all of this up makes no commit to any branch
after the merge of any branch is done.  C was not in v1.0.0 already.
Here is the script I used to set this up:

% cat scripto
rm -rf branch_test
mkdir branch_test
cd branch_test
git init

echo foo > foo
git add foo
git commit -a -m "foo"

git checkout -b A
for ((i=0; i < 20; ++i)); do
    echo "bar $i" > bar
    git add bar
    git commit -a -m "bar $i"
done

git checkout master
git checkout -b B
for ((i=0; i < 20; ++i)); do
    echo "baz $i" > baz
    git add baz
    git commit -a -m "baz $i"
done

git checkout master
git checkout -b C
for ((i=0; i < 20; ++i)); do
    echo "buz $i" > buz
    git add buz
    git commit -a -m "buz $i"
done

git checkout master
git checkout -b v1.0.0

After that, I did the merges this way:

% git merge A
% git merge B
% git merge C

and then the git log shows no merge of C, as above.  Hmm, actually, when I
just ran this, I get no output showing branch A was merged.  I just
did this again and here is the merge output:

% git branch -a
  A
  B
  C
  master
* v1.0.0
% git merge A
Updating af6c884..c7e5f2c
Fast forward
 bar |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 bar
% git merge B
Merge made by recursive.
 baz |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 baz
% git merge C
Merge made by recursive.
 buz |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 buz

Then, git log -p shows no branch A merge:

% git log -p | grep -i merge
Merge: f462b95 2c2b064
    Merge branch 'C' into v1.0.0
Merge: c7e5f2c adde6ff
    Merge branch 'B' into v1.0.0

>In both cases, it's not clear how C could have been "ready" when B was not.

A, B, and C, are entirely independent of one another.  I'm trying to
replicate an instance in which a feature is developed and submitted for
inclusion in a release, accepted for inclusion, but then later found
to be defective.

>> And so, I'm faced with git rebase -i posing some unanswerable questions
>> to our release manager.  She cannot easily remove B from the merge after
>> doint either merge A B C, or merge A, merge B, merge C.
>
>The way you described the situation there are no commits after the
>merges. So, why not reset to before the merge and do a "git merge A C"?

Presumably, I would need to tag the v1.0.0 branch after creating it,
which I was hoping not to have to do.  I wanted the equivalent of
"git unmerge B" after doing three separate merges as above, or an octopus
merge.  I'm just trying to make life simpler for our release manager,
who is not equipped with git fu.


Bill

^ permalink raw reply

* Re: [PATCH] am: allow some defaults to be specified via git-config
From: Wesley J. Landaker @ 2009-10-19 17:49 UTC (permalink / raw)
  To: Sam Vilain; +Cc: git, Nigel McNie
In-Reply-To: <1255650627-17576-1-git-send-email-sam.vilain@catalyst.net.nz>

On Thursday 15 October 2009 17:50:27 Sam Vilain wrote:
> +am.*::
> +	Specify defaults for linkgit:git-am[1].  Currently, the three
> +	boolean options, 'sign', 'utf8' and 'keep' may be specified.
> +

The 'git am' option is 'signoff', not 'sign'. Shouldn't the command option 
and config option names match?

^ permalink raw reply

* Re: [PATCH] git add -e documentation: rephrase note
From: Wesley J. Landaker @ 2009-10-19 17:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Miklos Vajna, git
In-Reply-To: <7vaazn7tg4.fsf@alter.siamese.dyndns.org>

On Monday 19 October 2009 01:07:23 Junio C Hamano wrote:
> That is Ok; the comment was not about stage vs add.
>
> > But beyond that, yes, you are right that removing a "+" line may have a
> > different conceptual meaning to the user depending on the surrounding
> > text. I wonder if such a "check-list" document really makes much sense,
> > given that using "-e" at all means you need to understand the patch
> > format and what makes sense (i.e., anybody who understands 'patch'
> > knows that you can't just delete context lines and expect it to apply).
>
> Yeah, that is really what I wanted people who are in this discussion to
> eventually realize ;-)

Comment from the peanut gallery:

I still think a quick summary checklist is useful even for a seasoned 
developer that is intimate with the 'patch' format, as it lets users know 
what git will do with your patch modifications.

For example, when I first tried "add -e", my first thought was: "Awesome, 
but, I wonder if git will do the right thing if I modify the patch in THIS 
way ...". Fortunately, git did the right thing, but I wasn't really sure 
until I tried it.

^ permalink raw reply

* [PATCH resend] Make the MSVC projects use PDB/IDB files named after the project
From: Sebastian Schuberth @ 2009-10-19 16:40 UTC (permalink / raw)
  To: git

Instead of having all PDB files for all projects named "vc90.pdb", name them
after the respective project to make the relation more clear (and to avoid name
clashes when copying files around).

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Acked-by: Marius Storm-Olsen <mstormo@gmail.com>
---
 contrib/buildsystems/Generators/Vcproj.pm |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/contrib/buildsystems/Generators/Vcproj.pm b/contrib/buildsystems/Generators/Vcproj.pm
index be94ba1..cfa74ad 100644
--- a/contrib/buildsystems/Generators/Vcproj.pm
+++ b/contrib/buildsystems/Generators/Vcproj.pm
@@ -178,6 +178,7 @@ sub createLibProject {
 				MinimalRebuild="true"
 				RuntimeLibrary="1"
 				UsePrecompiledHeader="0"
+				ProgramDataBaseFileName="\$(IntDir)\\\$(TargetName).pdb"
 				WarningLevel="3"
 				DebugInformationFormat="3"
 			/>
@@ -244,6 +245,7 @@ sub createLibProject {
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
 				UsePrecompiledHeader="0"
+				ProgramDataBaseFileName="\$(IntDir)\\\$(TargetName).pdb"
 				WarningLevel="3"
 				DebugInformationFormat="3"
 			/>
@@ -401,6 +403,7 @@ sub createAppProject {
 				MinimalRebuild="true"
 				RuntimeLibrary="1"
 				UsePrecompiledHeader="0"
+				ProgramDataBaseFileName="\$(IntDir)\\\$(TargetName).pdb"
 				WarningLevel="3"
 				DebugInformationFormat="3"
 			/>
@@ -472,6 +475,7 @@ sub createAppProject {
 				RuntimeLibrary="0"
 				EnableFunctionLevelLinking="true"
 				UsePrecompiledHeader="0"
+				ProgramDataBaseFileName="\$(IntDir)\\\$(TargetName).pdb"
 				WarningLevel="3"
 				DebugInformationFormat="3"
 			/>
-- 
1.6.5.rc2.13.g1be2

^ permalink raw reply related

* Re: denying branch creation in a shared repository
From: Mohit Aron @ 2009-10-19 16:43 UTC (permalink / raw)
  To: Sitaram Chamarty; +Cc: git
In-Reply-To: <2e24e5b90910190143j5579d9dfle15df8625eb20a00@mail.gmail.com>

>
> That was the main reason I wrote gitolite
> (http://github.com/sitaramc/gitolite), though now it does a heck of a
> lot more than just that.
>

That's great. You might want to consider making it a deb package
that's available from one of the Ubuntu/Debian repositories. An apt
search on Ubuntu 9.10 doesn't reveal it. I usually shy away from
installing software on my machines that is not automatically managed.


- Mohit

^ permalink raw reply

* [PATCH] Use faster byte swapping when compiling with MSVC
From: Sebastian Schuberth @ 2009-10-19 16:37 UTC (permalink / raw)
  To: git

When compiling with MSVC on x86-compatible, use an intrinsic for byte swapping.
In contrast to the GCC path, we do not prefer inline assembly here as it is not
supported for the x64 platform.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
---
 compat/bswap.h |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/compat/bswap.h b/compat/bswap.h
index 5cc4acb..279e0b4 100644
--- a/compat/bswap.h
+++ b/compat/bswap.h
@@ -28,6 +28,16 @@ static inline uint32_t default_swab32(uint32_t val)
 	} \
 	__res; })
 
+#elif defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
+
+#include <stdlib.h>
+
+#define bswap32(x) _byteswap_ulong(x)
+
+#endif
+
+#ifdef bswap32
+
 #undef ntohl
 #undef htonl
 #define ntohl(x) bswap32(x)
-- 
1.6.5.rc2.13.g1be2

^ permalink raw reply related

* [PATCH v4 5/5] stash list: drop the default limit of 10 stashes
From: Thomas Rast @ 2009-10-19 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jef Driesen, Nanako Shiraishi, git
In-Reply-To: <cover.1255966426.git.trast@student.ethz.ch>

'git stash list' had an undocumented limit of 10 stashes, unless other
git-log arguments were specified.  This surprised at least one user,
but possibly served to cut the output below a screenful without using
a pager.

Since the last commit, 'git stash list' will fire up a pager according
to the same rules as the 'git log' it calls, so we can drop the limit.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 Documentation/git-stash.txt |    3 +--
 git-stash.sh                |    5 -----
 2 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index fafe728..3f14b72 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -78,8 +78,7 @@ stash@{1}: On master: 9cc0589... Add git-stash
 ----------------------------------------------------------------
 +
 The command takes options applicable to the 'git-log'
-command to control what is shown and how. If no options are set, the
-default is `-n 10`. See linkgit:git-log[1].
+command to control what is shown and how. See linkgit:git-log[1].
 
 show [<stash>]::
 
diff --git a/git-stash.sh b/git-stash.sh
index f8847c1..f796c2f 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -382,11 +382,6 @@ test -n "$seen_non_option" || set "save" "$@"
 case "$1" in
 list)
 	shift
-	if test $# = 0
-	then
-		set x -n 10
-		shift
-	fi
 	list_stash "$@"
 	;;
 show)
-- 
1.6.5.1.137.gefbc6

^ permalink raw reply related

* [PATCH v4 4/5] stash list: use new %g formats instead of sed
From: Thomas Rast @ 2009-10-19 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jef Driesen, Nanako Shiraishi, git
In-Reply-To: <cover.1255966426.git.trast@student.ethz.ch>

With the new formats, we can rewrite 'git stash list' in terms of an
appropriate pretty format, instead of hand-editing with sed.  This has
the advantage that it obeys the normal settings for git-log, notably
the pager.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 git-stash.sh |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/git-stash.sh b/git-stash.sh
index 4febbbf..f8847c1 100755
--- a/git-stash.sh
+++ b/git-stash.sh
@@ -205,8 +205,7 @@ have_stash () {
 
 list_stash () {
 	have_stash || return 0
-	git log --no-color --pretty=oneline -g "$@" $ref_stash -- |
-	sed -n -e 's/^[.0-9a-f]* refs\///p'
+	git log --format="%gd: %gs" -g "$@" $ref_stash --
 }
 
 show_stash () {
-- 
1.6.5.1.137.gefbc6

^ permalink raw reply related

* [PATCH v4 3/5] Introduce new pretty formats %g[sdD] for reflog information
From: Thomas Rast @ 2009-10-19 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jef Driesen, Nanako Shiraishi, git
In-Reply-To: <cover.1255966426.git.trast@student.ethz.ch>

Add three new --pretty=format escapes:

  %gD  long  reflog descriptor (e.g. refs/stash@{0})
  %gd  short reflog descriptor (e.g. stash@{0})
  %gs  reflog message

This is achieved by passing down the reflog info, if any, inside the
pretty_print_context struct.

We use the newly refactored get_reflog_selector(), and give it some
extra functionality to extract a shortened ref.  The shortening is
cached inside the commit_reflogs struct; the only allocation of it
happens in read_complete_reflog(), where it is initialised to 0.  Also
add another helper get_reflog_message() for the message extraction.

Note that the --format="%h %gD: %gs" tests may not work in real
repositories, as the --pretty formatter doesn't know to leave away the
": " on the last commit in an incomplete (because git-gc removed the
old part) reflog.  This equivalence is nevertheless the main goal of
this patch.

Thanks to Jeff King for reviews, the %gd testcase and documentation.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 Documentation/pretty-formats.txt |    9 +++++++++
 commit.h                         |    1 +
 log-tree.c                       |    1 +
 pretty.c                         |   17 +++++++++++++++++
 reflog-walk.c                    |   35 ++++++++++++++++++++++++++++++++---
 reflog-walk.h                    |    8 ++++++++
 t/t6006-rev-list-format.sh       |   18 ++++++++++++++++++
 7 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/Documentation/pretty-formats.txt b/Documentation/pretty-formats.txt
index 2a845b1..38b9904 100644
--- a/Documentation/pretty-formats.txt
+++ b/Documentation/pretty-formats.txt
@@ -123,6 +123,9 @@ The placeholders are:
 - '%s': subject
 - '%f': sanitized subject line, suitable for a filename
 - '%b': body
+- '%gD': reflog selector, e.g., `refs/stash@\{1\}`
+- '%gd': shortened reflog selector, e.g., `stash@\{1\}`
+- '%gs': reflog subject
 - '%Cred': switch color to red
 - '%Cgreen': switch color to green
 - '%Cblue': switch color to blue
@@ -132,6 +135,12 @@ The placeholders are:
 - '%n': newline
 - '%x00': print a byte from a hex code
 
+NOTE: Some placeholders may depend on other options given to the
+revision traversal engine. For example, the `%g*` reflog options will
+insert an empty string unless we are traversing reflog entries (e.g., by
+`git log -g`). The `%d` placeholder will use the "short" decoration
+format if `--decorate` was not already provided on the command line.
+
 * 'tformat:'
 +
 The 'tformat:' format works exactly like 'format:', except that it
diff --git a/commit.h b/commit.h
index 084cf55..422f778 100644
--- a/commit.h
+++ b/commit.h
@@ -70,6 +70,7 @@ struct pretty_print_context
 	const char *after_subject;
 	enum date_mode date_mode;
 	int need_8bit_cte;
+	struct reflog_walk_info *reflog_info;
 };
 
 extern int non_ascii(int);
diff --git a/log-tree.c b/log-tree.c
index 1675035..0fdf159 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -411,6 +411,7 @@ void show_log(struct rev_info *opt)
 	ctx.date_mode = opt->date_mode;
 	ctx.abbrev = opt->diffopt.abbrev;
 	ctx.after_subject = extra_headers;
+	ctx.reflog_info = opt->reflog_info;
 	pretty_print_commit(opt->commit_format, commit, &msgbuf, &ctx);
 
 	if (opt->add_signoff)
diff --git a/pretty.c b/pretty.c
index 37ad6eb..da15cf2 100644
--- a/pretty.c
+++ b/pretty.c
@@ -7,6 +7,7 @@
 #include "mailmap.h"
 #include "log-tree.h"
 #include "color.h"
+#include "reflog-walk.h"
 
 static char *user_format;
 
@@ -701,6 +702,22 @@ static size_t format_commit_item(struct strbuf *sb, const char *placeholder,
 	case 'd':
 		format_decoration(sb, commit);
 		return 1;
+	case 'g':		/* reflog info */
+		switch(placeholder[1]) {
+		case 'd':	/* reflog selector */
+		case 'D':
+			if (c->pretty_ctx->reflog_info)
+				get_reflog_selector(sb,
+						    c->pretty_ctx->reflog_info,
+						    c->pretty_ctx->date_mode,
+						    (placeholder[1] == 'd'));
+			return 2;
+		case 's':	/* reflog message */
+			if (c->pretty_ctx->reflog_info)
+				get_reflog_message(sb, c->pretty_ctx->reflog_info);
+			return 2;
+		}
+		return 0;	/* unknown %g placeholder */
 	}
 
 	/* For the rest we have to parse the commit header. */
diff --git a/reflog-walk.c b/reflog-walk.c
index 596bafe..caba4f7 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -8,6 +8,7 @@
 
 struct complete_reflogs {
 	char *ref;
+	const char *short_ref;
 	struct reflog_info {
 		unsigned char osha1[20], nsha1[20];
 		char *email;
@@ -243,15 +244,26 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 
 void get_reflog_selector(struct strbuf *sb,
 			 struct reflog_walk_info *reflog_info,
-			 enum date_mode dmode)
+			 enum date_mode dmode,
+			 int shorten)
 {
 	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
 	struct reflog_info *info;
+	const char *printed_ref;
 
 	if (!commit_reflog)
 		return;
 
-	strbuf_addf(sb, "%s@{", commit_reflog->reflogs->ref);
+	if (shorten) {
+		if (!commit_reflog->reflogs->short_ref)
+			commit_reflog->reflogs->short_ref
+				= shorten_unambiguous_ref(commit_reflog->reflogs->ref, 0);
+		printed_ref = commit_reflog->reflogs->short_ref;
+	} else {
+		printed_ref = commit_reflog->reflogs->ref;
+	}
+
+	strbuf_addf(sb, "%s@{", printed_ref);
 	if (commit_reflog->flag || dmode) {
 		info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
 		strbuf_addstr(sb, show_date(info->timestamp, info->tz, dmode));
@@ -263,6 +275,23 @@ void get_reflog_selector(struct strbuf *sb,
 	strbuf_addch(sb, '}');
 }
 
+void get_reflog_message(struct strbuf *sb,
+			struct reflog_walk_info *reflog_info)
+{
+	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
+	struct reflog_info *info;
+	size_t len;
+
+	if (!commit_reflog)
+		return;
+
+	info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
+	len = strlen(info->message);
+	if (len > 0)
+		len--; /* strip away trailing newline */
+	strbuf_add(sb, info->message, len);
+}
+
 void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 	enum date_mode dmode)
 {
@@ -272,7 +301,7 @@ void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 		struct strbuf selector = STRBUF_INIT;
 
 		info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
-		get_reflog_selector(&selector, reflog_info, dmode);
+		get_reflog_selector(&selector, reflog_info, dmode, 0);
 		if (oneline) {
 			printf("%s: %s", selector.buf, info->message);
 		}
diff --git a/reflog-walk.h b/reflog-walk.h
index 74c9096..7bd2cd4 100644
--- a/reflog-walk.h
+++ b/reflog-walk.h
@@ -3,6 +3,8 @@
 
 #include "cache.h"
 
+struct reflog_walk_info;
+
 extern void init_reflog_walk(struct reflog_walk_info** info);
 extern int add_reflog_for_walk(struct reflog_walk_info *info,
 		struct commit *commit, const char *name);
@@ -10,5 +12,11 @@ extern void fake_reflog_parent(struct reflog_walk_info *info,
 		struct commit *commit);
 extern void show_reflog_message(struct reflog_walk_info *info, int,
 		enum date_mode);
+extern void get_reflog_message(struct strbuf *sb,
+		struct reflog_walk_info *reflog_info);
+extern void get_reflog_selector(struct strbuf *sb,
+		struct reflog_walk_info *reflog_info,
+		enum date_mode dmode,
+		int shorten);
 
 #endif
diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
index 59d1f62..7f61ab0 100755
--- a/t/t6006-rev-list-format.sh
+++ b/t/t6006-rev-list-format.sh
@@ -162,4 +162,22 @@ test_expect_success 'empty email' '
 	}
 '
 
+test_expect_success '"%h %gD: %gs" is same as git-reflog' '
+	git reflog >expect &&
+	git log -g --format="%h %gD: %gs" >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '"%h %gD: %gs" is same as git-reflog (with date)' '
+	git reflog --date=raw >expect &&
+	git log -g --format="%h %gD: %gs" --date=raw >actual &&
+	test_cmp expect actual
+'
+
+test_expect_success '%gd shortens ref name' '
+	echo "master@{0}" >expect.gd-short &&
+	git log -g -1 --format=%gd refs/heads/master >actual.gd-short &&
+	test_cmp expect.gd-short actual.gd-short
+'
+
 test_done
-- 
1.6.5.1.137.gefbc6

^ permalink raw reply related

* [PATCH v4 1/5] Refactor pretty_print_commit arguments into a struct
From: Thomas Rast @ 2009-10-19 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jef Driesen, Nanako Shiraishi, git
In-Reply-To: <cover.1255966426.git.trast@student.ethz.ch>

pretty_print_commit() has a bunch of rarely-used arguments, and
introducing more of them requires yet another update of all the call
sites.  Refactor most of them into a struct to make future extensions
easier.

The ones that stay "plain" arguments were chosen on the grounds that
all callers put real arguments there, whereas some callers have 0/NULL
for all arguments that were factored into the struct.

We declare the struct 'const' to ensure none of the callers are bitten
by the changed (no longer call-by-value) semantics.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 archive.c             |    4 +++-
 builtin-branch.c      |    3 ++-
 builtin-checkout.c    |    3 ++-
 builtin-commit.c      |    8 ++++++--
 builtin-log.c         |    3 ++-
 builtin-merge.c       |    7 +++++--
 builtin-rev-list.c    |    7 ++++---
 builtin-shortlog.c    |    9 ++++++---
 builtin-show-branch.c |    4 ++--
 commit.h              |   19 +++++++++++++------
 log-tree.c            |   24 +++++++++++++-----------
 pretty.c              |   27 ++++++++++++++-------------
 12 files changed, 72 insertions(+), 46 deletions(-)

diff --git a/archive.c b/archive.c
index 0cc79d2..55b2732 100644
--- a/archive.c
+++ b/archive.c
@@ -31,6 +31,8 @@ static void format_subst(const struct commit *commit,
 {
 	char *to_free = NULL;
 	struct strbuf fmt = STRBUF_INIT;
+	struct pretty_print_context ctx = {0};
+	ctx.date_mode = DATE_NORMAL;
 
 	if (src == buf->buf)
 		to_free = strbuf_detach(buf, NULL);
@@ -48,7 +50,7 @@ static void format_subst(const struct commit *commit,
 		strbuf_add(&fmt, b + 8, c - b - 8);
 
 		strbuf_add(buf, src, b - src);
-		format_commit_message(commit, fmt.buf, buf, DATE_NORMAL);
+		format_commit_message(commit, fmt.buf, buf, &ctx);
 		len -= c + 1 - src;
 		src  = c + 1;
 	}
diff --git a/builtin-branch.c b/builtin-branch.c
index 9f57992..05e876e 100644
--- a/builtin-branch.c
+++ b/builtin-branch.c
@@ -387,8 +387,9 @@ static void print_ref_item(struct ref_item *item, int maxwidth, int verbose,
 
 		commit = item->commit;
 		if (commit && !parse_commit(commit)) {
+			struct pretty_print_context ctx = {0};
 			pretty_print_commit(CMIT_FMT_ONELINE, commit,
-					    &subject, 0, NULL, NULL, 0, 0);
+					    &subject, &ctx);
 			sub = subject.buf;
 		}
 
diff --git a/builtin-checkout.c b/builtin-checkout.c
index d050c37..075a49f 100644
--- a/builtin-checkout.c
+++ b/builtin-checkout.c
@@ -302,8 +302,9 @@ static void show_local_changes(struct object *head)
 static void describe_detached_head(char *msg, struct commit *commit)
 {
 	struct strbuf sb = STRBUF_INIT;
+	struct pretty_print_context ctx = {0};
 	parse_commit(commit);
-	pretty_print_commit(CMIT_FMT_ONELINE, commit, &sb, 0, NULL, NULL, 0, 0);
+	pretty_print_commit(CMIT_FMT_ONELINE, commit, &sb, &ctx);
 	fprintf(stderr, "%s %s... %s\n", msg,
 		find_unique_abbrev(commit->object.sha1, DEFAULT_ABBREV), sb.buf);
 	strbuf_release(&sb);
diff --git a/builtin-commit.c b/builtin-commit.c
index 200ffda..ac173f9 100644
--- a/builtin-commit.c
+++ b/builtin-commit.c
@@ -684,8 +684,10 @@ static int message_is_empty(struct strbuf *sb)
 	prepare_revision_walk(&revs);
 	commit = get_revision(&revs);
 	if (commit) {
+		struct pretty_print_context ctx = {0};
+		ctx.date_mode = DATE_NORMAL;
 		strbuf_release(&buf);
-		format_commit_message(commit, "%an <%ae>", &buf, DATE_NORMAL);
+		format_commit_message(commit, "%an <%ae>", &buf, &ctx);
 		return strbuf_detach(&buf, NULL);
 	}
 	die("No existing author found with '%s'", name);
@@ -942,8 +944,10 @@ static void print_summary(const char *prefix, const unsigned char *sha1)
 		initial_commit ? " (root-commit)" : "");
 
 	if (!log_tree_commit(&rev, commit)) {
+		struct pretty_print_context ctx = {0};
+		ctx.date_mode = DATE_NORMAL;
 		struct strbuf buf = STRBUF_INIT;
-		format_commit_message(commit, format + 7, &buf, DATE_NORMAL);
+		format_commit_message(commit, format + 7, &buf, &ctx);
 		printf("%s\n", buf.buf);
 		strbuf_release(&buf);
 	}
diff --git a/builtin-log.c b/builtin-log.c
index 25e21ed..207a361 100644
--- a/builtin-log.c
+++ b/builtin-log.c
@@ -1304,8 +1304,9 @@ int cmd_cherry(int argc, const char **argv, const char *prefix)
 
 		if (verbose) {
 			struct strbuf buf = STRBUF_INIT;
+			struct pretty_print_context ctx = {0};
 			pretty_print_commit(CMIT_FMT_ONELINE, commit,
-			                    &buf, 0, NULL, NULL, 0, 0);
+					    &buf, &ctx);
 			printf("%c %s %s\n", sign,
 			       sha1_to_hex(commit->object.sha1), buf.buf);
 			strbuf_release(&buf);
diff --git a/builtin-merge.c b/builtin-merge.c
index b6b8428..c69a305 100644
--- a/builtin-merge.c
+++ b/builtin-merge.c
@@ -264,6 +264,7 @@ static void squash_message(void)
 	struct strbuf out = STRBUF_INIT;
 	struct commit_list *j;
 	int fd;
+	struct pretty_print_context ctx = {0};
 
 	printf("Squash commit -- not updating HEAD\n");
 	fd = open(git_path("SQUASH_MSG"), O_WRONLY | O_CREAT, 0666);
@@ -285,13 +286,15 @@ static void squash_message(void)
 	if (prepare_revision_walk(&rev))
 		die("revision walk setup failed");
 
+	ctx.abbrev = rev.abbrev;
+	ctx.date_mode = rev.date_mode;
+
 	strbuf_addstr(&out, "Squashed commit of the following:\n");
 	while ((commit = get_revision(&rev)) != NULL) {
 		strbuf_addch(&out, '\n');
 		strbuf_addf(&out, "commit %s\n",
 			sha1_to_hex(commit->object.sha1));
-		pretty_print_commit(rev.commit_format, commit, &out, rev.abbrev,
-			NULL, NULL, rev.date_mode, 0);
+		pretty_print_commit(rev.commit_format, commit, &out, &ctx);
 	}
 	if (write(fd, out.buf, out.len) < 0)
 		die_errno("Writing SQUASH_MSG");
diff --git a/builtin-rev-list.c b/builtin-rev-list.c
index 4ba1c12..42cc8d8 100644
--- a/builtin-rev-list.c
+++ b/builtin-rev-list.c
@@ -96,9 +96,10 @@ static void show_commit(struct commit *commit, void *data)
 
 	if (revs->verbose_header && commit->buffer) {
 		struct strbuf buf = STRBUF_INIT;
-		pretty_print_commit(revs->commit_format, commit,
-				    &buf, revs->abbrev, NULL, NULL,
-				    revs->date_mode, 0);
+		struct pretty_print_context ctx = {0};
+		ctx.abbrev = revs->abbrev;
+		ctx.date_mode = revs->date_mode;
+		pretty_print_commit(revs->commit_format, commit, &buf, &ctx);
 		if (revs->graph) {
 			if (buf.len) {
 				if (revs->commit_format != CMIT_FMT_ONELINE)
diff --git a/builtin-shortlog.c b/builtin-shortlog.c
index 4d4a3c8..8aa63c7 100644
--- a/builtin-shortlog.c
+++ b/builtin-shortlog.c
@@ -158,9 +158,12 @@ void shortlog_add_commit(struct shortlog *log, struct commit *commit)
 		    sha1_to_hex(commit->object.sha1));
 	if (log->user_format) {
 		struct strbuf buf = STRBUF_INIT;
-
-		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf,
-			DEFAULT_ABBREV, "", "", DATE_NORMAL, 0);
+		struct pretty_print_context ctx = {0};
+		ctx.abbrev = DEFAULT_ABBREV;
+		ctx.subject = "";
+		ctx.after_subject = "";
+		ctx.date_mode = DATE_NORMAL;
+		pretty_print_commit(CMIT_FMT_USERFORMAT, commit, &buf, &ctx);
 		insert_one_record(log, author, buf.buf);
 		strbuf_release(&buf);
 		return;
diff --git a/builtin-show-branch.c b/builtin-show-branch.c
index be95930..9f13caa 100644
--- a/builtin-show-branch.c
+++ b/builtin-show-branch.c
@@ -293,8 +293,8 @@ static void show_one_commit(struct commit *commit, int no_name)
 	struct commit_name *name = commit->util;
 
 	if (commit->object.parsed) {
-		pretty_print_commit(CMIT_FMT_ONELINE, commit,
-				    &pretty, 0, NULL, NULL, 0, 0);
+		struct pretty_print_context ctx = {0};
+		pretty_print_commit(CMIT_FMT_ONELINE, commit, &pretty, &ctx);
 		pretty_str = pretty.buf;
 	}
 	if (!prefixcmp(pretty_str, "[PATCH] "))
diff --git a/commit.h b/commit.h
index 95f981a..084cf55 100644
--- a/commit.h
+++ b/commit.h
@@ -63,6 +63,15 @@ enum cmit_fmt {
 	CMIT_FMT_UNSPECIFIED,
 };
 
+struct pretty_print_context
+{
+	int abbrev;
+	const char *subject;
+	const char *after_subject;
+	enum date_mode date_mode;
+	int need_8bit_cte;
+};
+
 extern int non_ascii(int);
 extern int has_non_ascii(const char *text);
 struct rev_info; /* in revision.h, it circularly uses enum cmit_fmt */
@@ -71,12 +80,10 @@ enum cmit_fmt {
 extern void get_commit_format(const char *arg, struct rev_info *);
 extern void format_commit_message(const struct commit *commit,
 				  const char *format, struct strbuf *sb,
-				  enum date_mode dmode);
-extern void pretty_print_commit(enum cmit_fmt fmt, const struct commit*,
-                                struct strbuf *,
-                                int abbrev, const char *subject,
-                                const char *after_subject, enum date_mode,
-				int need_8bit_cte);
+				  const struct pretty_print_context *context);
+extern void pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit,
+				struct strbuf *sb,
+				const struct pretty_print_context *context);
 void pp_user_info(const char *what, enum cmit_fmt fmt, struct strbuf *sb,
 		   const char *line, enum date_mode dmode,
 		   const char *encoding);
diff --git a/log-tree.c b/log-tree.c
index f7d54f2..1675035 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -179,8 +179,10 @@ void get_patch_filename(struct commit *commit, int nr, const char *suffix,
 	strbuf_addf(buf, commit ? "%04d-" : "%d", nr);
 	if (commit) {
 		int max_len = start_len + FORMAT_PATCH_NAME_MAX - suffix_len;
+		struct pretty_print_context ctx = {0};
+		ctx.date_mode = DATE_NORMAL;
 
-		format_commit_message(commit, "%f", buf, DATE_NORMAL);
+		format_commit_message(commit, "%f", buf, &ctx);
 		if (max_len < buf->len)
 			strbuf_setlen(buf, max_len);
 		strbuf_addstr(buf, suffix);
@@ -277,10 +279,9 @@ void show_log(struct rev_info *opt)
 	struct strbuf msgbuf = STRBUF_INIT;
 	struct log_info *log = opt->loginfo;
 	struct commit *commit = log->commit, *parent = log->parent;
-	int abbrev = opt->diffopt.abbrev;
 	int abbrev_commit = opt->abbrev_commit ? opt->abbrev : 40;
-	const char *subject = NULL, *extra_headers = opt->extra_headers;
-	int need_8bit_cte = 0;
+	const char *extra_headers = opt->extra_headers;
+	struct pretty_print_context ctx = {0};
 
 	opt->loginfo = NULL;
 	if (!opt->verbose_header) {
@@ -347,8 +348,8 @@ void show_log(struct rev_info *opt)
 	 */
 
 	if (opt->commit_format == CMIT_FMT_EMAIL) {
-		log_write_email_headers(opt, commit, &subject, &extra_headers,
-					&need_8bit_cte);
+		log_write_email_headers(opt, commit, &ctx.subject, &extra_headers,
+					&ctx.need_8bit_cte);
 	} else if (opt->commit_format != CMIT_FMT_USERFORMAT) {
 		fputs(diff_get_color_opt(&opt->diffopt, DIFF_COMMIT), stdout);
 		if (opt->commit_format != CMIT_FMT_ONELINE)
@@ -405,11 +406,12 @@ void show_log(struct rev_info *opt)
 	/*
 	 * And then the pretty-printed message itself
 	 */
-	if (need_8bit_cte >= 0)
-		need_8bit_cte = has_non_ascii(opt->add_signoff);
-	pretty_print_commit(opt->commit_format, commit, &msgbuf,
-			    abbrev, subject, extra_headers, opt->date_mode,
-			    need_8bit_cte);
+	if (ctx.need_8bit_cte >= 0)
+		ctx.need_8bit_cte = has_non_ascii(opt->add_signoff);
+	ctx.date_mode = opt->date_mode;
+	ctx.abbrev = opt->diffopt.abbrev;
+	ctx.after_subject = extra_headers;
+	pretty_print_commit(opt->commit_format, commit, &msgbuf, &ctx);
 
 	if (opt->add_signoff)
 		append_signoff(&msgbuf, opt->add_signoff);
diff --git a/pretty.c b/pretty.c
index 587101f..37ad6eb 100644
--- a/pretty.c
+++ b/pretty.c
@@ -442,7 +442,7 @@ struct chunk {
 
 struct format_commit_context {
 	const struct commit *commit;
-	enum date_mode dmode;
+	const struct pretty_print_context *pretty_ctx;
 	unsigned commit_header_parsed:1;
 	unsigned commit_message_parsed:1;
 
@@ -711,11 +711,11 @@ static size_t format_commit_item(struct strbuf *sb, const char *placeholder,
 	case 'a':	/* author ... */
 		return format_person_part(sb, placeholder[1],
 				   msg + c->author.off, c->author.len,
-				   c->dmode);
+				   c->pretty_ctx->date_mode);
 	case 'c':	/* committer ... */
 		return format_person_part(sb, placeholder[1],
 				   msg + c->committer.off, c->committer.len,
-				   c->dmode);
+				   c->pretty_ctx->date_mode);
 	case 'e':	/* encoding */
 		strbuf_add(sb, msg + c->encoding.off, c->encoding.len);
 		return 1;
@@ -741,13 +741,13 @@ static size_t format_commit_item(struct strbuf *sb, const char *placeholder,
 
 void format_commit_message(const struct commit *commit,
 			   const char *format, struct strbuf *sb,
-			   enum date_mode dmode)
+			   const struct pretty_print_context *pretty_ctx)
 {
 	struct format_commit_context context;
 
 	memset(&context, 0, sizeof(context));
 	context.commit = commit;
-	context.dmode = dmode;
+	context.pretty_ctx = pretty_ctx;
 	strbuf_expand(sb, format, format_commit_item, &context);
 }
 
@@ -900,18 +900,18 @@ void pp_remainder(enum cmit_fmt fmt,
 }
 
 void pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit,
-			 struct strbuf *sb, int abbrev,
-			 const char *subject, const char *after_subject,
-			 enum date_mode dmode, int need_8bit_cte)
+			 struct strbuf *sb,
+			 const struct pretty_print_context *context)
 {
 	unsigned long beginning_of_body;
 	int indent = 4;
 	const char *msg = commit->buffer;
 	char *reencoded;
 	const char *encoding;
+	int need_8bit_cte = context->need_8bit_cte;
 
 	if (fmt == CMIT_FMT_USERFORMAT) {
-		format_commit_message(commit, user_format, sb, dmode);
+		format_commit_message(commit, user_format, sb, context);
 		return;
 	}
 
@@ -946,8 +946,9 @@ void pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit,
 		}
 	}
 
-	pp_header(fmt, abbrev, dmode, encoding, commit, &msg, sb);
-	if (fmt != CMIT_FMT_ONELINE && !subject) {
+	pp_header(fmt, context->abbrev, context->date_mode, encoding,
+		  commit, &msg, sb);
+	if (fmt != CMIT_FMT_ONELINE && !context->subject) {
 		strbuf_addch(sb, '\n');
 	}
 
@@ -956,8 +957,8 @@ void pretty_print_commit(enum cmit_fmt fmt, const struct commit *commit,
 
 	/* These formats treat the title line specially. */
 	if (fmt == CMIT_FMT_ONELINE || fmt == CMIT_FMT_EMAIL)
-		pp_title_line(fmt, &msg, sb, subject,
-			      after_subject, encoding, need_8bit_cte);
+		pp_title_line(fmt, &msg, sb, context->subject,
+			      context->after_subject, encoding, need_8bit_cte);
 
 	beginning_of_body = sb->len;
 	if (fmt != CMIT_FMT_ONELINE)
-- 
1.6.5.1.137.gefbc6

^ permalink raw reply related

* [PATCH v4 2/5] reflog-walk: refactor the branch@{num} formatting
From: Thomas Rast @ 2009-10-19 15:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Jef Driesen, Nanako Shiraishi, git
In-Reply-To: <cover.1255966426.git.trast@student.ethz.ch>

We'll use the same output in an upcoming commit, so refactor its
formatting (which was duplicated anyway) into a separate function.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
---
 reflog-walk.c |   54 ++++++++++++++++++++++++++++++++----------------------
 1 files changed, 32 insertions(+), 22 deletions(-)

diff --git a/reflog-walk.c b/reflog-walk.c
index 5623ea6..596bafe 100644
--- a/reflog-walk.c
+++ b/reflog-walk.c
@@ -241,36 +241,46 @@ void fake_reflog_parent(struct reflog_walk_info *info, struct commit *commit)
 	commit->object.flags &= ~(ADDED | SEEN | SHOWN);
 }
 
-void show_reflog_message(struct reflog_walk_info *info, int oneline,
+void get_reflog_selector(struct strbuf *sb,
+			 struct reflog_walk_info *reflog_info,
+			 enum date_mode dmode)
+{
+	struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
+	struct reflog_info *info;
+
+	if (!commit_reflog)
+		return;
+
+	strbuf_addf(sb, "%s@{", commit_reflog->reflogs->ref);
+	if (commit_reflog->flag || dmode) {
+		info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
+		strbuf_addstr(sb, show_date(info->timestamp, info->tz, dmode));
+	} else {
+		strbuf_addf(sb, "%d", commit_reflog->reflogs->nr
+			    - 2 - commit_reflog->recno);
+	}
+
+	strbuf_addch(sb, '}');
+}
+
+void show_reflog_message(struct reflog_walk_info *reflog_info, int oneline,
 	enum date_mode dmode)
 {
-	if (info && info->last_commit_reflog) {
-		struct commit_reflog *commit_reflog = info->last_commit_reflog;
+	if (reflog_info && reflog_info->last_commit_reflog) {
+		struct commit_reflog *commit_reflog = reflog_info->last_commit_reflog;
 		struct reflog_info *info;
+		struct strbuf selector = STRBUF_INIT;
 
 		info = &commit_reflog->reflogs->items[commit_reflog->recno+1];
+		get_reflog_selector(&selector, reflog_info, dmode);
 		if (oneline) {
-			printf("%s@{", commit_reflog->reflogs->ref);
-			if (commit_reflog->flag || dmode)
-				printf("%s", show_date(info->timestamp,
-						       info->tz,
-						       dmode));
-			else
-				printf("%d", commit_reflog->reflogs->nr
-				       - 2 - commit_reflog->recno);
-			printf("}: %s", info->message);
+			printf("%s: %s", selector.buf, info->message);
 		}
 		else {
-			printf("Reflog: %s@{", commit_reflog->reflogs->ref);
-			if (commit_reflog->flag || dmode)
-				printf("%s", show_date(info->timestamp,
-							info->tz,
-							dmode));
-			else
-				printf("%d", commit_reflog->reflogs->nr
-				       - 2 - commit_reflog->recno);
-			printf("} (%s)\nReflog message: %s",
-			       info->email, info->message);
+			printf("Reflog: %s (%s)\nReflog message: %s",
+			       selector.buf, info->email, info->message);
 		}
+
+		strbuf_release(&selector);
 	}
 }
-- 
1.6.5.1.137.gefbc6

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox