From: Andreas Ericsson <ae@op5.se>
To: Marco Costalba <mcostalba@gmail.com>
Cc: Linus Torvalds <torvalds@osdl.org>,
	Git Mailing List <git@vger.kernel.org>,
	Junio C Hamano <junkio@cox.net>, Alex Riesen <raa.lkml@gmail.com>,
	Shawn Pearce <spearce@spearce.org>
Subject: Re: [RFC \ WISH] Add -o option to git-rev-list
Date: Mon, 11 Dec 2006 14:40:55 +0100
Message-ID: <457D5FE7.3010309@op5.se>
In-Reply-To: <e5bfff550612110459w205cb9b3lf735359012f84f7c@mail.gmail.com>

Marco Costalba wrote:
> On 12/11/06, Andreas Ericsson <ae@op5.se> wrote:
>> Marco Costalba wrote:
>> > On 12/10/06, Linus Torvalds <torvalds@osdl.org> wrote:
>> >>
>> >> Why don't you use the pipe and standard read()?
>> >>
>> >> Even if you use "popen()" and get a "FILE *" back, you can still do
>> >>
>> >>         int fd = fileno(file);
>> >>
>> >> and use the raw IO capabilities.
>> >>
>> >> The thing is, temporary files can actually be faster under Linux just
>> >> because the Linux page-cache simply kicks ass. But it's not going to be
>> >> _that_ big of a difference, and you need all that crazy "wait for
>> >> rev-list to finish" and the "clean up temp-file on errors" etc crap, so
>> >> there's no way it's a better solution.
>> >>
>> >
>> > Two things.
>> >
>> > - memory use: the next natural step with files is, instead of loading
>> > the file content in memory and *keeping it there*, to load one
>> > chunk at a time, index the chunk and discard it. At the end we keep in
>> > memory only the indexing info needed to quickly get to the data, while
>> > the bulk of the data stays in the file.
>> >
>>
>> memory usage vs speed tradeoff. Since qgit is a pure user-app, I think
>> it's safe to opt for the memory hungry option. If people run it on too
>> lowbie hardware they'll just have to make do with other ways of viewing
>> the DAG or shutting down some other programs.
>>
>> > - This is probably my ignorance, but experimenting with popen() I
>> > found I could not tell *when* git-rev-list ends, because both feof()
>> > and ferror() return 0 after an fread() with git-rev-list already defunct.
>> > Not having a reference to the process (it is hidden behind popen()),
>> > I had to check for 0 bytes read after a successful read (to avoid
>> > racing in case I poll the pipe before the first data is ready) to
>> > know that the job is finished and call pclose().
>> >
>>
>> (coding in MUA, so highly untested)
>>
> 
> Thanks Andreas, I will do some tests with your code. But at first
> sight I fail to see (I'm not an expert on this though ;-)  ) where the
> difference is from using popen() and fileno() to get the file
> descriptors.
> 

The difference is read() vs fread(), so no libc buffers. When I did
comparisons with this (a long time ago; I no longer have the test program
around), in the style of

	ssize_t n = read(out[0], buf, sizeof(buf));
	if (n > 0)
		write(fileno(stdout), buf, n);	/* write only what was read */

with a command line like this:

	cat any-file | test-program > /dev/null

I saw a constant ~10ms increase in execution time compared to

	cat any-file > /dev/null

regardless of the size of "any-file", so I assume this overhead comes 
from the extra fork(), which you'll never get rid of unless you use 
libgit.a.
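
For reference, here is a minimal sketch of the pipe()/fork()/read() approach
discussed above. It is not the code from the earlier mail, and it is only
lightly tested. It shows how read() returning 0 signals that the child has
closed its end of the pipe, which is the "when does git-rev-list end"
question. The name run_and_drain() and the 4096-byte buffer are assumptions
made for illustration; a real caller would parse the data instead of just
counting bytes.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] with its stdout connected to a pipe,
 * drain the pipe with read(), and return the child's exit status
 * (or -1 on error). The number of bytes read is stored in *total. */
int run_and_drain(char *const argv[], long *total)
{
	int out[2];
	char buf[4096];	/* arbitrary size, chosen for illustration */
	ssize_t n;
	int status;
	pid_t pid;

	*total = 0;
	if (pipe(out) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(out[0]);
		close(out[1]);
		return -1;
	}
	if (pid == 0) {
		/* child: make the write end of the pipe its stdout */
		close(out[0]);
		if (dup2(out[1], STDOUT_FILENO) < 0)
			_exit(127);
		close(out[1]);
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}

	/* parent: close our copy of the write end, or read() never sees EOF */
	close(out[1]);

	for (;;) {
		n = read(out[0], buf, sizeof(buf));
		if (n > 0) {
			*total += n;	/* a real caller would parse buf[0..n) here */
			continue;
		}
		if (n == 0)
			break;		/* EOF: every writer has closed the pipe */
		if (errno == EINTR)
			continue;	/* interrupted by a signal, retry */
		break;			/* real read error, give up */
	}
	close(out[0]);

	/* reap the child so it does not linger as a zombie */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argv such as { "git-rev-list", "HEAD", NULL } would be the
qgit case. The important detail is the close(out[1]) in the parent: as long as
any write end stays open, read() blocks instead of returning 0.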

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
