From: Adam Simpkins <adam@adamsimpkins.net>
To: git@vger.kernel.org
Cc: Junio C Hamano <gitster@pobox.com>
Subject: entry terminator/separator behavior in show_log()
Date: Sun, 27 Apr 2008 21:55:05 -0700 [thread overview]
Message-ID: <20080428045504.GA24981@adamsimpkins.net> (raw)
This is a write-up describing the current entry terminator/separator
behavior of show_log(). I wrote it while updating my "log --graph"
code to work with Junio's new use_terminator flag.
I was initially hoping to find places where we could simplify the
behavior. However, after examining all of the cases, I'm not sure if
there is much we can do without breaking backwards compatibility too
badly. Perhaps we should just check this description into
Documentation/technical, for the next person who needs to work with
this part of the code.
Towards the end of the document, I've also listed how the new --graph
option will interact with each of the cases. I'd love to have
feedback to see if my current plans sound reasonable.
Variables affecting the behavior
--------------------------------
- The sep argument to show_log()
As far as I can tell, this is always "". show_log() is sometimes
invoked with a sep argument of diffopt->msg_sep. However,
"git grep -w msg_sep" shows that msg_sep is never set to anything but
the empty string.
Perhaps we could just remove this argument, to reduce complexity and
confusion?
- The line_termination member of struct diff_options
This is normally '\n', but is set to '\0' if the -z option is
supplied.
- The use_terminator flag in struct rev_info
This flag is enabled by --pretty=oneline and --pretty=format. It is
off for all other formats.
This flag has no effect if opt->verbose_header is not set. If
opt->verbose_header is set, the behavior is as follows:
When use_terminator is set, a newline is added to the end of each log
entry. diffopt.line_termination is not printed in between entries.
When use_terminator is not set, diffopt.line_termination is printed
before all entries except the very first one.
- Whether or not the log entry ends in a terminating newline
The log entry for CMIT_FMT_ONELINE never ends in a newline.
The log entry for CMIT_FMT_USERFORMAT only ends in a newline if the
user explicitly places a newline at the end of the format.
For all other formats, the entry always ends in a terminating newline.
- Whether or not the log message buffer is empty
If use_terminator is set and the message buffer returned by
pretty_print_commit() is empty, no newline is printed between entries.
This can happen, for example, if CMIT_FMT_USERFORMAT is used with an
empty format string. Even though pretty_print_commit() returns an
empty buffer, the log entry itself does not necessarily have to be
empty. For example, the log size will be printed if if the --log-size
argument is used.
I'm not sure if there any are other code paths that can also cause
this to happen.
Current Behavior
----------------
(1) For all formats except CMIT_FMT_ONELINE and CMIT_FMT_USERFORMAT:
diffopt->line_termination is printed before each log entry but the
first. This results in diffopt->line_termination in between each
entry, and not after the final entry.
(1.1)
When diffopt->line_termination is '\0', this prints a NUL
character between log entries.
(1.2)
When diffopt->line_termination is '\n', this essentially adds an
extra blank line of padding between log entries, since each log
entry already ends in a newline.
(2) For CMIT_FMT_ONELINE and CMIT_FMT_USERFORMAT:
(2.1)
If use_terminator is set (--pretty=oneline and --pretty=tformat):
(2.1.1)
If the message buffer returned by pretty_print_commit() is
non-empty:
A newline is printed at the end of each entry. Since the log
entry does not end in a newline already, the newline ends the
last line of the log entry, and there is no extra blank line
between entries.
With --pretty=tformat, if the user's format explicitly ends in a
newline, this will create a blank line at the end of each entry.
(This looks like a padding line, very similar to case 1 above.
However, it shows up even after the very last entry.)
(2.1.2)
If the message buffer returned by pretty_print_commit() is
empty:
A newline is not printed at the end of each entry. Therefore
there are no blank lines between log entries. If the log
entries are completely empty, nothing at all is printed.
(2.2)
If use_terminator is not set (--pretty=format):
diffopt->line_termination is printed before each log entry but the
first. This results in diffopt->line_termination in between each
entry, and not after the final entry.
(2.2.1)
When diffopt->line_termination is '\0', this is reasonable
behavior, and makes sense for consumption by other scripts.
(2.2.2)
When diffopt->line_termination is '\n':
(2.2.2.1)
If the user's format is non-empty and does not end in an
explicit newline, log entries do not end in a newline.
However, the line_termination character printed before the
next entry appears to terminate the previous entry, so it
looks human-readable. There are no blank lines between
entries.
However, no newline is printed after the very last entry.
This seems like a bug to me, since I would expect that the
output is normally being read by a human instead of another
script when diffopt->line_termination is '\n'.
Fortunately, the log output is normally piped through less,
and less hides the fact that the output does not end in a
newline. It prints a terminating newline for us if the git
output doesn't end in a newline.
(2.2.2.2)
If the user's format does end in an explicit newline, the
diffopt->line_termination character between entries creates a
blank line between entries. The output looks exactly like
case 1: a blank line show up between entries, but not after
the very last entry.
(2.2.2.3)
If the user's format is the empty string, the log entry will
either be completely empty, or will already end in a newline.
The diffopt->line_termination characters printed between
entries appear as blank padding lines between entries. If the
log entries are completely empty the resulting output
consists solely of padding lines.
(3) opt->verbose_header is unset
Only a single line of output is printed, terminated with the
diffopt->line_termination character.
Note that this differs from case 1! The line_termination character
is printed even after the very last entry.
Comments on the current behavior
--------------------------------
The current behavior is fairly complicated. There are quite a few
corner cases, especially around uncommon behavior (for example,
"--pretty=format:" with an empty format string). It would be nice to
eliminate some of the complexity. However, doing so would change the
behavior and potentially break existing user scripts relying on this
behavior.
- Cases 1.1 and 1.2 make perfect sense to me.
- Case 2.1 makes perfect sense to me.
- Case 2.1.2 seems reasonable, but not really necessary. It might be
nicer to simply always print a terminating newline even if the message
buffer is empty. This would appear as an extra blank line after each
entry.
The --pretty=tformat option is quite new, so I doubt that many people
would complain if we changed this behavior.
- Case 2.2.1 makes perfect sense. There are probably lots of scripts
out there relying on this behavior.
- I don't really like the behavior for any of the cases under 2.2.2.
I especially don't like the fact that the output does not end in a
terminating newline for case 2.2.2.1.
If the output is for human consumption, the user probably really wants
case 2.1 (i.e., --pretty=tformat instead of --pretty=format.)
Hopefully most scripts that parse the log output use the -z option,
but there are probably also many that don't. Changing this behavior
would break these scripts. Teaching users to use --pretty=tformat
instead of --pretty=format will probably cause less pain than
breaking backwards compatibility.
Desired behavior with the new --graph option
--------------------------------------------
- Case 1.1
(All formats but CMIT_FMT_ONELINE and CMIT_FMT_USERFORMAT,
line_termination is '\0')
The graph output should be printed before each line in the log entry.
Graph output should not be printed between the last newline in the log
entry and the line_termination character ('\0') for the next log entry.
- Case 1.2
(All formats but CMIT_FMT_ONELINE and CMIT_FMT_USERFORMAT,
line_termination is '\n')
The graph output should be printed before each line in the log entry.
For each entry but the first, graph padding should also be printed
*before* the line_termination character. Otherwise the blank padding
line between entries will be completely blank, which would leave gaps
in the graph output.
- Cases 2.1.1 and 2.1.2
(--pretty=oneline and --pretty=tformat)
The graph output should be printed before each line in the log entry.
There is no additional separator between entries, so there is nothing
else to worry about here. All log entries always end in a newline,
which also helps keep things simple.
- Case 2.2.1
(--pretty=format and line_termination is '\0')
The graph output should be printed before each line in the log entry.
Graph output should not be printed between the last newline in the log
entry and the line_termination character ('\0') for the next log entry.
This behavior is identical to that for case 1.1.
- Case 2.2.2.1
(--pretty=format, line_termination is '\n', and with a non-empty
format that doesn't end with a newline)
The graph output should be printed before each line in the log entry.
Graph output should not be printed before the line_termination
character, since this character is merely terminating the last line of
output from the previous entry.
- Cases 2.2.2.2 and 2.2.2.3
(--pretty=format, line_termination is '\n', and with a format that
ends with an explicit newline, or with an empty format)
The graph output should be printed before each line in the log entry.
Graph output *should* be before the line_termination character,
since this character is creating a new blank padding line.
- Case 3
(opt->verbose_header is unset)
Each log entry consists of a single line, terminated with the
line_termination character. The graph output should be printed before
each line.
However, this case currently can't happen via the "git log" command,
since this command always sets the verbose_header flag.
With the current code, there's not really enough information for
the graphing code in show_log() to distinguish between cases cases
2.2.2.1 and 2.2.2.{2,3}.
In order to do so, I plan to add a new flag to struct rev_info to
indicate whether or not the previous log entry ended in a newline. The
next invocation of show_log() can check this flag to see if it should
print graph padding before the line_termination character or not.
Thanks for reading, if you made it all the way down here!
--
Adam Simpkins
adam@adamsimpkins.net
next reply other threads:[~2008-04-28 4:56 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-04-28 4:55 Adam Simpkins [this message]
2008-04-28 16:10 ` entry terminator/separator behavior in show_log() Junio C Hamano
2008-04-29 8:58 ` Adam Simpkins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080428045504.GA24981@adamsimpkins.net \
--to=adam@adamsimpkins.net \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).