git.vger.kernel.org archive mirror
From: "Marco Costalba" <mcostalba@gmail.com>
To: "Junio C Hamano" <gitster@pobox.com>
Cc: "Git Mailing List" <git@vger.kernel.org>
Subject: Re: [PATCH] Add --show-size to git log to print message size
Date: Sat, 14 Jul 2007 22:46:39 +0200
Message-ID: <e5bfff550707141346q2eba4ab8ka1c85e8b5a2c1b1d@mail.gmail.com>
In-Reply-To: <7vodiehko7.fsf@assigned-by-dhcp.cox.net>

On 7/14/07, Junio C Hamano <gitster@pobox.com> wrote:
>
> "size" is a bit vague here.  What if we later want to extend
> things so that you can ask for the entire log entry size
> including the patch output part (I am not saying that would be
> an easy change --- I am more worried about the stability of the
> external interface).  So is --show-"size".  "message-size" would
> have been a bit easier to swallow, but I sense the problem runs
> deeper.

What about --section-sizes?

You can put all the sizes you want on the output line (message, patch
and future extensions), separated by spaces. An example output line
with message and patch sizes:

sizes 456 565\n

Or, as a stream-friendly (and also more elegant) alternative, you can
output 'section size' before each section, for example:

commit d9e940....
section size 456
<log header and message body>
section size 565
<patch diff content>
section size 232
<other type of content>
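
To make the stream-friendly variant concrete, here is a minimal reader
sketch in Python. Everything about the framing is an assumption based
only on the example above: --section-sizes is a proposal, not an
existing git option, and the sketch assumes each 'section size N' line
is followed by exactly N bytes of payload. The name parse_log is made
up for illustration.

```python
import io

SECTION_PREFIX = b"section size "
COMMIT_PREFIX = b"commit "


def parse_log(stream):
    """Parse the *proposed* framing into [(sha, [section_bytes, ...]), ...].

    Assumed (hypothetical) layout: a b"commit <sha>\n" line starts a record,
    and each b"section size <N>\n" line is followed by exactly N payload bytes.
    """
    commits = []
    # readline() is called again after every payload read, so the reader
    # never has to search the payload for a delimiter.
    for line in iter(stream.readline, b""):
        if line.startswith(COMMIT_PREFIX):
            commits.append((line[len(COMMIT_PREFIX):].strip(), []))
        elif line.startswith(SECTION_PREFIX):
            if not commits:
                raise ValueError("section found before any commit line")
            size = int(line[len(SECTION_PREFIX):].decode("ascii").strip())
            payload = stream.read(size)
            if len(payload) != size:
                raise ValueError("truncated section")
            commits[-1][1].append(payload)
        else:
            raise ValueError("unexpected line: %r" % line)
    return commits
```

The point of the design is that the reader can step over a section it
does not care about by its size alone, without looking at its content.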


> I have a more basic question. If you are reading from non "-p"
> output, where do you exactly have the wasted cycles in your
> reader's processing?

qgit loading works like this:

git log output is read as a series of big binary chunks by a Qt
library function that calls read(). Each chunk is read into its own
buffer and stays there for the whole application lifetime (or until a
data refresh), so qgit never copies the data: the buffers are
allocated and the pointers passed to read(), that's all.

It's a kind of software DMA ;-)

The only information that qgit needs to infer at startup is where to
find the first line of each commit, for parent information and
revision counting. All the other data is read and consumed only on
demand, i.e. for showing to the user. Because only the visible part
of the list is needed, data is read from these buffers and parsed
*in small chunks*, and only when the user scrolls the view.

The problem is that to get the first line of each revision the message
boundaries of _all_ the commits must be known/found.

Because there is currently no message size information, the application has to:

- get the offset of the commit's first line
- try to find the delimiting '\0', if it exists (binary chunks can be
  truncated at any point)
- get the offset of the first line of the next revision
- and so on for all the revisions
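
The steps above can be sketched roughly as follows. This is only an
illustration in Python, not qgit's actual code (qgit is a Qt
application); the function name and the chunk list are assumptions. It
shows why every byte has to be visited: a record may be cut anywhere by
a chunk boundary, so the search for '\0' has to carry over from one
buffer to the next.

```python
def index_by_scanning(chunks):
    """Return (chunk_index, offset) of the first byte of every record.

    Records are assumed to be separated by a NUL byte and may be split
    across chunk boundaries at any point.
    """
    starts = []
    at_record_start = True  # the very first byte begins a record
    for chunk_index, chunk in enumerate(chunks):
        pos = 0
        while pos < len(chunk):
            if at_record_start:
                starts.append((chunk_index, pos))
                at_record_start = False
            nul = chunk.find(b"\0", pos)  # linear scan: the expensive part
            if nul == -1:
                break  # record continues in the next chunk
            pos = nul + 1
            at_record_start = True
    return starts
```

For example, with the two chunks b"commit a\nmsg\0commit b\nlon" and
b"g msg\0commit c\n" the second record is split across the buffers, and
the third record is found at offset 6 of the second chunk.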

Finding the delimiting '\0' means looping across the whole buffers,
and _this_ is the expensive and unnecessary part. If, right after the
first line, it were possible to jump to the beginning of the next
revision, this search for '\0' would no longer be necessary.

When the user asks for the data of revision 'x', the offset of
revision 'x' is already known, so the application can just point to
the correct offset in the correct data buffer and parse out the
(small) needed info.
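
A sketch of the size-based alternative, again in Python and again only
under assumptions: the record layout used here (a "commit" line, then a
hypothetical "size N" line, then N bytes) is made up for illustration
and is not an existing git output format. For simplicity the index is
built over one flat buffer; locate() then maps a global offset back to
a buffer and a local offset.

```python
from bisect import bisect_right
from itertools import accumulate


def index_by_size(buf):
    """Offsets of every record start, found by jumping instead of scanning.

    Assumed (hypothetical) layout per record:
    b"commit <sha>\n" + b"size <N>\n" + N bytes of message.
    """
    offsets = []
    pos = 0
    while pos < len(buf):
        offsets.append(pos)
        end_first = buf.index(b"\n", pos)            # end of the "commit" line
        end_size = buf.index(b"\n", end_first + 1)   # end of the "size" line
        size_line = buf[end_first + 1:end_size]
        if not size_line.startswith(b"size "):
            raise ValueError("missing size line at offset %d" % pos)
        # Skip the message body without looking at it.
        pos = end_size + 1 + int(size_line[5:].decode("ascii"))
    return offsets


def locate(chunk_lengths, global_offset):
    """Map a global offset to (chunk_index, local_offset)."""
    starts = [0] + list(accumulate(chunk_lengths))[:-1]
    chunk_index = bisect_right(starts, global_offset) - 1
    return chunk_index, global_offset - starts[chunk_index]
```

Only the two short header lines of each record are inspected; the
message bodies are stepped over by their size, which is the saving
described above.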

Hope I have explained this clearly enough; I have some problems
writing late in the evening ;-)


Marco
