* QGit: Shrink used memory with custom git log format
From: Marco Costalba @ 2007-11-24 8:14 UTC
To: Git Mailing List

Hi all,

I have pushed a patch series to

git://git.kernel.org/pub/scm/qgit/qgit4.git

that changes the git log format used to read data from a git
repository.

Instead of --pretty=raw, a custom --pretty=format is now given. This
shrinks the loaded data by 30% (17MB less on the Linux tree) and gives
a good speed-up when you are low on memory (especially on big repos).

The next step _would_ be to load the log message body on demand
(another 50% reduction), but this has two drawbacks:

(1) Text search/filter on log messages would be broken

(2) Browsing through revisions would be slower, because for each
revision an additional git-rev-list/git-log command would have to be
executed to read the body

The second point is made worse by the fact that it is not possible to
keep a command running and "open", as for example with git-diff-tree
--stdin, and feed it additional revision SHAs when needed. Avoiding
the burden of starting a new process each time a log message is read
for a given SHA would make the response much quicker, especially on
lesser OS's.

There is indeed a git-rev-list --stdin option, but it behaves
differently from git-diff-tree --stdin and is not suitable for this.

Marco
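[Editor's sketch] The idea of reading history through a custom
--pretty=format can be illustrated roughly as below. The format string
is an assumption made for this example (the thread does not show the
one QGit uses), and the names LOG_FORMAT, Commit, parse_log and
read_log are hypothetical. Fields are separated by NUL bytes (%x00)
and records end with a record-separator byte (%x1e), so only the
requested fields are loaded.

```python
import subprocess
from typing import Iterator, List, NamedTuple

# Hypothetical field layout: sha, parents, author, author date, subject.
# QGit's real format string is not given in the thread.
LOG_FORMAT = "%H%x00%P%x00%an%x00%at%x00%s%x1e"


class Commit(NamedTuple):
    sha: str
    parents: List[str]
    author: str
    timestamp: int
    subject: str


def parse_log(raw: str) -> Iterator[Commit]:
    """Split the output of git log --pretty=format:LOG_FORMAT into commits."""
    for record in raw.split("\x1e"):
        # git puts a newline between formatted commits; drop it.
        record = record.strip("\n")
        if not record:
            continue
        sha, parents, author, timestamp, subject = record.split("\x00")
        yield Commit(sha, parents.split(), author, int(timestamp), subject)


def read_log(repo: str = ".") -> List[Commit]:
    """Run git log once and parse only the fields named in LOG_FORMAT."""
    out = subprocess.run(
        ["git", "-C", repo, "log", f"--pretty=format:{LOG_FORMAT}"],
        check=True, capture_output=True,
        encoding="utf-8", errors="replace",
    ).stdout
    return list(parse_log(out))
```

The message body (%b) is deliberately left out of the format here,
which is the on-demand variant discussed above; adding it back would
restore text search at the cost of memory.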
* Re: QGit: Shrink used memory with custom git log format
From: Shawn O. Pearce @ 2007-11-27 1:52 UTC
To: Marco Costalba; +Cc: Git Mailing List

Marco Costalba <mcostalba@gmail.com> wrote:
> Instead of --pretty=raw, a custom --pretty=format is now given. This
> shrinks the loaded data by 30% (17MB less on the Linux tree) and gives
> a good speed-up when you are low on memory (especially on big repos).
>
> The next step _would_ be to load the log message body on demand
> (another 50% reduction), but this has two drawbacks:
>
> (1) Text search/filter on log messages would be broken
>
> (2) Browsing through revisions would be slower, because for each
> revision an additional git-rev-list/git-log command would have to be
> executed to read the body
>
> The second point is made worse by the fact that it is not possible to
> keep a command running and "open", as for example with git-diff-tree
> --stdin, and feed it additional revision SHAs when needed. Avoiding
> the burden of starting a new process each time a log message is read
> for a given SHA would make the response much quicker, especially on
> lesser OS's.
>
> There is indeed a git-rev-list --stdin option, but it behaves
> differently from git-diff-tree --stdin and is not suitable for this.

There was a proposed patch for git-cat-file that would let you run
it in a --stdin mode; the git-svn folks wanted this to speed up
fetching raw objects from the repository. That may help as you
could get commit bodies (in raw format - not reencoded format!)
quite efficiently.

Otherwise I think what you really want here is a libgit that you can
link into your process and that can efficiently inflate an object
on demand for you. Like the work Luiz was working on this past
summer for GSOC. Lots of downsides to that current tree though...
like die() kills the GUI...

--
Shawn.
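[Editor's sketch] The thread only mentions a proposed patch; later git
releases provide git cat-file --batch, which works along the lines
Shawn describes: one long-running process that takes object names on
stdin and answers each with a "<sha> <type> <size>" header line
followed by the raw object and a newline. Assuming that interface, a
viewer could fetch message bodies on demand roughly like this. The
names CatFile, read_batch_reply and commit_message are hypothetical.

```python
import subprocess
from typing import BinaryIO, Optional, Tuple


def read_batch_reply(stream: BinaryIO) -> Optional[Tuple[str, bytes]]:
    """Read one reply of the form '<sha> <type> <size>\\n<contents>\\n'."""
    header = stream.readline().split()
    if len(header) != 3:
        # '<name> missing' reply, or end of stream.
        return None
    _sha, obj_type, size = header
    body = stream.read(int(size))
    stream.read(1)  # consume the newline that follows the contents
    return obj_type.decode(), body


def commit_message(raw_commit: bytes) -> bytes:
    """Return the message part of a raw commit (after the first blank line)."""
    _headers, _sep, message = raw_commit.partition(b"\n\n")
    return message


class CatFile:
    """Keeps a single git cat-file --batch process open for many lookups."""

    def __init__(self, repo: str = ".") -> None:
        self.proc = subprocess.Popen(
            ["git", "-C", repo, "cat-file", "--batch"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        )

    def message(self, sha: str) -> Optional[bytes]:
        """Fetch the raw (not re-encoded) log message of one commit."""
        self.proc.stdin.write(sha.encode() + b"\n")
        self.proc.stdin.flush()
        reply = read_batch_reply(self.proc.stdout)
        if reply is None or reply[0] != "commit":
            return None
        return commit_message(reply[1])

    def close(self) -> None:
        self.proc.stdin.close()
        self.proc.wait()
```

Because the process stays open, each lookup costs one write and one
read on a pipe rather than a process start, which is the cost Marco
wants to avoid. The returned bytes are in the commit's stored
encoding, as Shawn notes.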
* Re: QGit: Shrink used memory with custom git log format
From: Johannes Schindelin @ 2007-11-27 10:48 UTC
To: Shawn O. Pearce; +Cc: Marco Costalba, Git Mailing List

Hi,

On Mon, 26 Nov 2007, Shawn O. Pearce wrote:

> Marco Costalba <mcostalba@gmail.com> wrote:
> > Instead of --pretty=raw, a custom --pretty=format is now given. This
> > shrinks the loaded data by 30% (17MB less on the Linux tree) and
> > gives a good speed-up when you are low on memory (especially on big
> > repos).
> >
> > The next step _would_ be to load the log message body on demand
> > (another 50% reduction), but this has two drawbacks:
> >
> > (1) Text search/filter on log messages would be broken
> >
> > (2) Browsing through revisions would be slower, because for each
> > revision an additional git-rev-list/git-log command would have to
> > be executed to read the body
> >
> > The second point is made worse by the fact that it is not possible
> > to keep a command running and "open", as for example with
> > git-diff-tree --stdin, and feed it additional revision SHAs when
> > needed. Avoiding the burden of starting a new process each time a
> > log message is read for a given SHA would make the response much
> > quicker, especially on lesser OS's.
> >
> > There is indeed a git-rev-list --stdin option, but it behaves
> > differently from git-diff-tree --stdin and is not suitable for this.
>
> There was a proposed patch for git-cat-file that would let you run
> it in a --stdin mode; the git-svn folks wanted this to speed up
> fetching raw objects from the repository. That may help as you
> could get commit bodies (in raw format - not reencoded format!)
> quite efficiently.
>
> Otherwise I think what you really want here is a libgit that you can
> link into your process and that can efficiently inflate an object
> on demand for you. Like the work Luiz was working on this past
> summer for GSOC. Lots of downsides to that current tree though...
> like die() kills the GUI...

But then, die() calls die_routine, which you can override. And C++ has
this funny exception mechanism which just begs to be used here. The
only thing you need to add is a way to flush all singletons like the
object array.

Ciao,
Dscho
* Re: QGit: Shrink used memory with custom git log format
From: Marco Costalba @ 2007-11-27 12:36 UTC
To: Johannes Schindelin; +Cc: Shawn O. Pearce, Git Mailing List

On Nov 27, 2007 11:48 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> > > There is indeed a git-rev-list --stdin option, but it behaves
> > > differently from git-diff-tree --stdin and is not suitable for
> > > this.
> >
> > There was a proposed patch for git-cat-file that would let you run
> > it in a --stdin mode; the git-svn folks wanted this to speed up
> > fetching raw objects from the repository. That may help as you
> > could get commit bodies (in raw format - not reencoded format!)
> > quite efficiently.
>

That would be nice.

> > Otherwise I think what you really want here is a libgit that you can
> > link into your process and that can efficiently inflate an object
> > on demand for you.

I would think libgit is overkill for this. You probably would not use
libgit just to add a single feature, but to completely change the
interface with git, because the required work is heavy on both the git
side and the qgit side. (You would probably want to run the
libgit-linked part in a separate thread to avoid GUI soft locks during
slow processing; right now, because the executed git command is a
different process from qgit, the OS scheduler takes care of this 'for
free'.)

Marco
* Re: QGit: Shrink used memory with custom git log format
From: Jan Hudec @ 2007-11-27 19:19 UTC
To: Johannes Schindelin; +Cc: Shawn O. Pearce, Marco Costalba, Git Mailing List

On Tue, Nov 27, 2007 at 10:48:00 +0000, Johannes Schindelin wrote:
> On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
> > [...]
> > Otherwise I think what you really want here is a libgit that you can
> > link into your process and that can efficiently inflate an object
> > on demand for you. Like the work Luiz was working on this past
> > summer for GSOC. Lots of downsides to that current tree though...
> > like die() kills the GUI...
>
> But then, die() calls die_routine, which you can override. And C++ has
> this funny exception mechanism which just begs to be used here. The
> only thing you need to add is a way to flush all singletons like the
> object array.

Unfortunately, exceptions won't really work. Why? Because to use
exceptions, you need exception-safe code. That is, the code needs to
free any allocated resources when it is aborted by an exception. And
the git code is not exception-safe. Given the lack of destructors in C,
that means registering all resource allocations in some kind of pool,
so they can be freed en masse in case of failure. Then you can also use
longjmp for die (for C they really behave the same).

--
Jan 'Bulb' Hudec <bulb@ucw.cz>
* Re: QGit: Shrink used memory with custom git log format
From: Johannes Schindelin @ 2007-11-28 12:01 UTC
To: Jan Hudec; +Cc: Shawn O. Pearce, Marco Costalba, Git Mailing List

Hi,

On Tue, 27 Nov 2007, Jan Hudec wrote:

> On Tue, Nov 27, 2007 at 10:48:00 +0000, Johannes Schindelin wrote:
> > On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
> > > [...]
> > > Otherwise I think what you really want here is a libgit that you
> > > can link into your process and that can efficiently inflate an
> > > object on demand for you. Like the work Luiz was working on this
> > > past summer for GSOC. Lots of downsides to that current tree
> > > though... like die() kills the GUI...
> >
> > But then, die() calls die_routine, which you can override. And C++
> > has this funny exception mechanism which just begs to be used here.
> > The only thing you need to add is a way to flush all singletons
> > like the object array.
>
> Unfortunately, exceptions won't really work. Why? Because to use
> exceptions, you need exception-safe code. That is, the code needs to
> free any allocated resources when it is aborted by an exception. And
> the git code is not exception-safe. Given the lack of destructors in
> C, that means registering all resource allocations in some kind of
> pool, so they can be freed en masse in case of failure. Then you can
> also use longjmp for die (for C they really behave the same).

Sorry, I just assumed that you can read my mind (or alternatively
remember what I suggested a few months ago, namely to "override"
xmalloc(), xcalloc(), xrealloc() and xfree() (probably you need to
create the latter)).

Ciao,
Dscho
* Re: QGit: Shrink used memory with custom git log format
From: jhud7196 @ 2007-11-28 15:53 UTC
To: Johannes Schindelin; +Cc: Shawn O. Pearce, Marco Costalba, Git Mailing List

> Hi,
>
> On Tue, 27 Nov 2007, Jan Hudec wrote:
>
>> On Tue, Nov 27, 2007 at 10:48:00 +0000, Johannes Schindelin wrote:
>> > On Mon, 26 Nov 2007, Shawn O. Pearce wrote:
>> > > [...]
>> > > Otherwise I think what you really want here is a libgit that you
>> > > can link into your process and that can efficiently inflate an
>> > > object on demand for you. Like the work Luiz was working on this
>> > > past summer for GSOC. Lots of downsides to that current tree
>> > > though... like die() kills the GUI...
>> >
>> > But then, die() calls die_routine, which you can override. And C++
>> > has this funny exception mechanism which just begs to be used
>> > here. The only thing you need to add is a way to flush all
>> > singletons like the object array.
>>
>> Unfortunately, exceptions won't really work. Why? Because to use
>> exceptions, you need exception-safe code. That is, the code needs to
>> free any allocated resources when it is aborted by an exception. And
>> the git code is not exception-safe. Given the lack of destructors in
>> C, that means registering all resource allocations in some kind of
>> pool, so they can be freed en masse in case of failure. Then you can
>> also use longjmp for die (for C they really behave the same).
>
> Sorry, I just assumed that you can read my mind (or alternatively
> remember what I suggested a few months ago, namely to "override"
> xmalloc(), xcalloc(), xrealloc() and xfree() (probably you need to
> create the latter)).

That sounds like the easiest (but not necessarily easy) direction
towards the goal. Thread-local or global (I don't think git is
currently reentrant anyway) would do.

Also, file handles would have to be taken care of, and everything
checked for direct use of malloc, calloc, strdup and other libc
functions.

Then die could longjmp out to a specified buffer, and could be safely
overridden to throw an exception for C++ apps.

--
- Jan Hudec <bulb@ucw.cz>