* Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-06 19:24 UTC
To: Git Mailing List

I am asking the list for help because my knowledge of this is not
enough.

Currently qgit uses the socket-based QProcess class to read data from
'git rev-list' when loading the repository at startup.

The time it takes to read the whole Linux tree with this approach,
without processing, is almost _double_ the time it takes 'git
rev-list' to write to a file:

$ git rev-list --header --boundary --parents --topo-order HEAD >> tmp.txt

We are talking about 7s against less than 4s on my box (warm cache).

So I have a patch that makes 'git rev-list' write into a temporary
file, which is then read into memory. Perhaps it's not the cleanest
way, but it's faster, about 1s less.

I have browsed the Qt sources and found that QProcess uses internal
buffers that are then copied again before being used by the
application. The file approach uses a call to read()/fread() buried
inside Qt's QFile class, with no intermediate buffers, so perhaps this
is the reason the second way is faster.

Anyway, there are some issues:

1) File tmp.txt is deleted as soon as it has been read, but sometimes
this is not enough to avoid a costly and wasteful write to disk by the
OS. What is the easiest, portable way to create a temporary 'in memory
only' file, with no disk access? Or at least to delay the HD write
long enough to be able to read and delete the file before the first
block of tmp.txt is flushed to disk?

2) Is there a faster/cleaner (and *safe*) way to access 'git rev-list'
output directly, something like (just as an example):

$ git rev-list --header --boundary --parents --topo-order HEAD >> /dev/mem

Or something similar, possibly _simple_ and _portable_, so as to be
able to copy the large amount of 'git rev-list' output just once
(about 30MB with the current tree).

3) Other suggestions? ;-)

Thanks

* Re: Fast access git-rev-list output: some OS knowledge required
From: Shawn Pearce @ 2006-12-06 19:28 UTC
To: Marco Costalba; +Cc: Git Mailing List

Marco Costalba <mcostalba@gmail.com> wrote:
> The time it takes to read, without processing, the whole Linux tree
> with this approach it's almost _double_ of the time it takes 'git
> rev-list' to write to a file:
>
> 3) Other suggestions? ;-)

The revision listing machinery is fairly well isolated behind some
pretty clean APIs in Git. Why not link qgit against libgit.a and
just do the revision listing in process?

* Re: Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-06 19:34 UTC
To: Shawn Pearce; +Cc: Git Mailing List

On 12/6/06, Shawn Pearce <spearce@spearce.org> wrote:
> Marco Costalba <mcostalba@gmail.com> wrote:
> > The time it takes to read, without processing, the whole Linux tree
> > with this approach it's almost _double_ of the time it takes 'git
> > rev-list' to write to a file:
> >
> > 3) Other suggestions? ;-)
>
> The revision listing machinery is fairly well isolated behind some
> pretty clean APIs in Git. Why not link qgit against libgit.a and
> just do the revision listing in process?
>

Where can I find some documentation (yes I know, RTFS, but...) or,
better, an example of using the API to read git-rev-list output?

If possible, I would also like to avoid messing with internal git
APIs, of course *if it is possible* ;-)

Thanks

* Re: Fast access git-rev-list output: some OS knowledge required
From: Shawn Pearce @ 2006-12-06 19:42 UTC
To: Marco Costalba; +Cc: Git Mailing List

Marco Costalba <mcostalba@gmail.com> wrote:
> On 12/6/06, Shawn Pearce <spearce@spearce.org> wrote:
> >Marco Costalba <mcostalba@gmail.com> wrote:
> >> The time it takes to read, without processing, the whole Linux tree
> >> with this approach it's almost _double_ of the time it takes 'git
> >> rev-list' to write to a file:
> >>
> >> 3) Other suggestions? ;-)
> >
> >The revision listing machinery is fairly well isolated behind some
> >pretty clean APIs in Git. Why not link qgit against libgit.a and
> >just do the revision listing in process?
> >
>
> Where can I found some documentation (yes I know RTFS, but...) or,
> better, an example of using the API to read git-rev-list output?

builtin-rev-list.c. :-)

I think all you may need is:

    #include "revision.h"
    ...
    struct rev_info revs;
    init_revisions(&revs, prefix);
    revs.abbrev = 0;
    revs.commit_format = CMIT_FMT_UNSPECIFIED;
    argc = setup_revisions(argc, argv, &revs, NULL);

where argv is just a char** of the arguments you were going to hand
to rev-list on the command line. Then get the data back:

    static void show_commit(struct commit *commit)
    {
        const char *hex = sha1_to_hex(commit->object.sha1);
        ... copy from hex to your own structures ...
    }

    static void show_object(struct object_array_entry *p)
    {
        /* do nothing */
    }

    prepare_revision_walk(&revs);
    traverse_commit_list(&revs, show_commit, show_object);

* Re: Fast access git-rev-list output: some OS knowledge required
From: Shawn Pearce @ 2006-12-06 19:51 UTC
To: Marco Costalba; +Cc: Git Mailing List

Shawn Pearce <spearce@spearce.org> wrote:
> I think all you may need is:
>
>     #include "revision.h"
>     ...

You'll also need to call:

    setup_git_directory();

before any of the below; but that should be done once per process.

>     struct rev_info revs;
>     init_revisions(&revs, prefix);
>     revs.abbrev = 0;
>     revs.commit_format = CMIT_FMT_UNSPECIFIED;
>     argc = setup_revisions(argc, argv, &revs, NULL);

Although now that I think about it the library may not be enough of
a library. Some data (e.g. commits) will stay in memory forever
once loaded. Pack files won't be released once read; a pack recently
made available while the application is running may not get noticed.

Perhaps there is some fast IPC API supported by Qt that you could
use to run the revision listing outside of the main UI process,
to eliminate the bottlenecks you are seeing and remove the problems
noted above? One that doesn't involve reading from a pipe I mean...

* Re: Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-06 20:08 UTC
To: Shawn Pearce; +Cc: Git Mailing List

On 12/6/06, Shawn Pearce <spearce@spearce.org> wrote:
> Shawn Pearce <spearce@spearce.org> wrote:
>
> Perhaps there is some fast IPC API supported by Qt that you could
> use to run the revision listing outside of the main UI process,
> to eliminate the bottlenecks you are seeing and remove the problems
> noted above? One that doesn't involve reading from a pipe I mean...
>

Qt is very fast at reading from files, and git-rev-list is fast at
writing to a file. The problem is that I don't want the file to be
saved on disk: it should stay cached in OS memory for the few seconds
needed to write it and read it back, and then be deleted. It's a kind
of shared memory in the end. But I don't know how to implement it.

Also, letting git-rev-list write directly into the qgit process
address space would be nice, indeed very nice.

* Re: Fast access git-rev-list output: some OS knowledge required
From: Shawn Pearce @ 2006-12-06 20:18 UTC
To: Marco Costalba; +Cc: Git Mailing List

Marco Costalba <mcostalba@gmail.com> wrote:
> On 12/6/06, Shawn Pearce <spearce@spearce.org> wrote:
> >Shawn Pearce <spearce@spearce.org> wrote:
> >
> >Perhaps there is some fast IPC API supported by Qt that you could
> >use to run the revision listing outside of the main UI process,
> >to eliminate the bottlenecks you are seeing and remove the problems
> >noted above? One that doesn't involve reading from a pipe I mean...
> >
>
> Qt it's very fast in reading from files, also git-rev-list is fast in
> write to a file...the problem is I would not want the file to be saved
> on disk, but stay cached in the OS memory for the few seconds needed
> to be written and read back, and then deleted. It's a kind of shared
> memory at the end. But I don't know how to realize it.

On a modern Linux (probably your largest target audience) a small
file which has a very short lifespan (few seconds) is unlikely to
hit the platter. Most filesystems will put the data into buffer
cache and delay writing to disk because temporary files are so
common on UNIX. Though our resident Linux experts may chime in
with more details...

> Also let git-rev-list to write directly in qgit process address space
> would be nice, indeed very nice.

And ugly. :-)

SysV IPC (shared memory, semaphores) is messy and difficult to get
right. mmap against a random file in the filesystem tends to work
better on those systems which support it well, provided that the
file isn't on a network mount. But again you still need semaphores
or something like them to control access to the data in the mmap'd
region.

I was thinking that maybe if Qt had a bounded buffer available for
use between a process and its child, that you could use that to
run your own "qgit-rev-list" child and get the data back more
quickly, without the need for a temporary file. But it doesn't
look like they have one. Oh well.

Your current temporary file approach is probably the best you can
get, and has the simplest possible implementation. Doing better
would require linking against libgit.a, and getting the core Git
hackers to make at least the revision machinery more useful in a
library setting.
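
[Editorial sketch] Shawn's remark about mmap against a file can be
illustrated with a minimal C example. It is only a sketch under stated
assumptions: the helper names map_whole_file() and unmap_whole_file()
are hypothetical (they exist in neither git nor qgit), and the file is
mapped only after the writer has finished, which sidesteps the
semaphore problem mentioned above rather than solving it.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of git or qgit): map an existing,
 * fully written file read-only.  Returns the mapping and stores its
 * length in *len, or returns NULL on error or for an empty file.
 */
const char *map_whole_file(const char *path, size_t *len)
{
    struct stat st;
    void *p;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }
    p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;
    *len = (size_t)st.st_size;
    return p;
}

/* Release a mapping obtained from map_whole_file(). */
void unmap_whole_file(const char *p, size_t len)
{
    munmap((void *)p, len);
}
```

With a mapping the reader parses the data where the kernel cached it,
so there is no extra read() copy into a user buffer. Whether that is
measurably faster for a 30MB listing would have to be timed.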

* Re: Fast access git-rev-list output: some OS knowledge required
From: Andreas Ericsson @ 2006-12-07 13:25 UTC
To: Shawn Pearce; +Cc: Marco Costalba, Git Mailing List

Shawn Pearce wrote:
>
> Perhaps there is some fast IPC API supported by Qt that you could
> use to run the revision listing outside of the main UI process,
> to eliminate the bottlenecks you are seeing and remove the problems
> noted above? One that doesn't involve reading from a pipe I mean...
>

Why not just fork() + exec() and read from the file descriptor? You can
up the output buffer of the forked program to something suitable, which
means the OS will cache it for you until you copy it to a buffer in qgit
(i.e., read from the descriptor).

--
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
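
[Editorial sketch] The fork() + exec() suggestion above could look
roughly like the following C code. This is a hedged illustration, not
qgit's implementation: read_command_output() is a hypothetical name,
the 1MB growth step is an arbitrary choice, and the sketch leaves the
kernel's pipe capacity at its default instead of enlarging it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with the given arguments and return
 * everything it writes to stdout in a malloc()ed buffer (caller frees).
 * The byte count is stored in *len.  Returns NULL on failure.
 */
char *read_command_output(char *const argv[], size_t *len)
{
    int fds[2];
    pid_t pid;
    char *buf = NULL;
    size_t used = 0, cap = 0;
    ssize_t n = 0;
    int status;

    if (pipe(fds) < 0)
        return NULL;
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return NULL;
    }
    if (pid == 0) {
        /* child: make the pipe's write end its stdout, then exec */
        dup2(fds[1], STDOUT_FILENO);
        close(fds[0]);
        close(fds[1]);
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }

    /* parent: read until the child closes its end of the pipe */
    close(fds[1]);
    for (;;) {
        if (cap - used < 65536) {
            char *tmp = realloc(buf, cap + 1048576);
            if (!tmp) {
                n = -1;
                break;
            }
            buf = tmp;
            cap += 1048576;
        }
        n = read(fds[0], buf + used, cap - used);
        if (n < 0 && errno == EINTR)
            continue;  /* interrupted by a signal, retry */
        if (n <= 0)
            break;     /* 0 = end of output, <0 = error */
        used += (size_t)n;
    }
    close(fds[0]);
    waitpid(pid, &status, 0);
    if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        free(buf);
        return NULL;
    }
    *len = used;
    return buf;
}
```

Called with an argv such as { "git", "rev-list", "--header", "HEAD",
NULL }, this copies the output once from the pipe into the final
buffer, without the intermediate buffers Marco found in QProcess.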

* Re: Fast access git-rev-list output: some OS knowledge required
From: Johannes Schindelin @ 2006-12-07 14:53 UTC
To: Andreas Ericsson; +Cc: Shawn Pearce, Marco Costalba, Git Mailing List

Hi,

On Thu, 7 Dec 2006, Andreas Ericsson wrote:

> Shawn Pearce wrote:
> >
> > Perhaps there is some fast IPC API supported by Qt that you could use
> > to run the revision listing outside of the main UI process, to
> > eliminate the bottlenecks you are seeing and remove the problems noted
> > above? One that doesn't involve reading from a pipe I mean...
> >
>
> Why not just fork() + exec() and read from the filedescriptor? You can
> up the output buffer of the forked program to something suitable, which
> means the OS will cache it for you until you copy it to a buffer in qgit
> (i.e., read from the descriptor).

Could somebody remind me why different processes are needed? I thought
that the revision machinery should be used directly, by linking to
libgit.a...

Ciao,
Dscho

* Re: Fast access git-rev-list output: some OS knowledge required
From: Andreas Ericsson @ 2006-12-07 15:28 UTC
To: Johannes Schindelin; +Cc: Shawn Pearce, Marco Costalba, Git Mailing List

Johannes Schindelin wrote:
> Hi,
>
> On Thu, 7 Dec 2006, Andreas Ericsson wrote:
>
>> Shawn Pearce wrote:
>>> Perhaps there is some fast IPC API supported by Qt that you could use
>>> to run the revision listing outside of the main UI process, to
>>> eliminate the bottlenecks you are seeing and remove the problems noted
>>> above? One that doesn't involve reading from a pipe I mean...
>>>
>> Why not just fork() + exec() and read from the filedescriptor? You can
>> up the output buffer of the forked program to something suitable, which
>> means the OS will cache it for you until you copy it to a buffer in qgit
>> (i.e., read from the descriptor).
>
> Could somebody remind me why different processes are needed? I thought
> that the revision machinery should be used directly, by linking to
> libgit.a...
>

You wrote:

--%<--%<--%<--
Because, depending on what you do, the revision machinery is not
reentrable. For example, if you filter by filename, the history is
rewritten in-memory to simulate a history where just that filename was
tracked, and nothing else. These changes are not cleaned up after calling
the internal revision machinery.
--%<--%<--%<--

When I wrote the above suggestion, I hadn't read the posts following the
email where I cut this text from (where Linus said "we can add a 'reset'
thingie to the revision walking machinery" and Marco replied with some
more questions).

--
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

* Re: Fast access git-rev-list output: some OS knowledge required
From: Johannes Schindelin @ 2006-12-07 16:01 UTC
To: Andreas Ericsson; +Cc: Shawn Pearce, Marco Costalba, Git Mailing List

Hi,

On Thu, 7 Dec 2006, Andreas Ericsson wrote:

> Johannes Schindelin wrote:
> > Hi,
> >
> > On Thu, 7 Dec 2006, Andreas Ericsson wrote:
> >
> > > Shawn Pearce wrote:
> > > > Perhaps there is some fast IPC API supported by Qt that you could use to
> > > > run the revision listing outside of the main UI process, to eliminate
> > > > the bottlenecks you are seeing and remove the problems noted above? One
> > > > that doesn't involve reading from a pipe I mean...
> > > >
> > > Why not just fork() + exec() and read from the filedescriptor? You can up
> > > the output buffer of the forked program to something suitable, which means
> > > the OS will cache it for you until you copy it to a buffer in qgit (i.e.,
> > > read from the descriptor).
> >
> > Could somebody remind me why different processes are needed? I thought that
> > the revision machinery should be used directly, by linking to libgit.a...
> >
>
> You wrote:
> --%<--%<--%<--
> Because, depending on what you do, the revision machinery is not
> reentrable. For example, if you filter by filename, the history is
> rewritten in-memory to simulate a history where just that filename was
> tracked, and nothing else. These changes are not cleaned up after calling the
> internal revision machinery.
> --%<--%<--%<--
>
> When I wrote the above suggestion, I hadn't read the posts following the
> email where I cut this text from (where Linus said "we can add a 'reset'
> thingie to the revision walking machinery" and Marco replied with some
> more questions).

Yes. The reset thingie is already in place: clear_commit_marks().

It would have to be enhanced a little, though:

1) the function rewrite_parents(), should add another flag,
   HALFORPHANED, and

2) clear_commit_marks() should unset the "parsed" flag of the commits
   for which HALFORPHANED is reset.

-- snip --
diff --git a/commit.c b/commit.c
index d5103cd..fd225c8 100644
--- a/commit.c
+++ b/commit.c
@@ -431,6 +431,10 @@ void clear_commit_marks(struct commit *commit, unsigned int mark)
 {
 	struct commit_list *parents;
 
+	/* were parents rewritten? */
+	if ((mark & commit->object.flags) & HALFORPHANED)
+		commit->object.parsed = 0;
+
 	commit->object.flags &= ~mark;
 	parents = commit->parents;
 	while (parents) {
diff --git a/revision.c b/revision.c
index 993bb66..461ee06 100644
--- a/revision.c
+++ b/revision.c
@@ -1097,6 +1097,7 @@ static void rewrite_parents(struct rev_info *revs, struct commit *commit)
 		struct commit_list *parent = *pp;
 		if (rewrite_one(revs, &parent->item) < 0) {
 			*pp = parent->next;
+			commit->object.flags |= HALFORPHANED;
 			continue;
 		}
 		pp = &parent->next;
diff --git a/revision.h b/revision.h
index 3adab95..544238c 100644
--- a/revision.h
+++ b/revision.h
@@ -9,6 +9,7 @@
 #define BOUNDARY	(1u<<5)
 #define BOUNDARY_SHOW	(1u<<6)
 #define ADDED		(1u<<7)	/* Parents already parsed and added? */
+#define HALFORPHANED	(1u<<8) /* parents were rewritten */
 
 struct rev_info;
 struct log_info;
-- snap --

Note that this is just the idea. This particular implementation opens a
gaping memory leak, since the buffer of the commit is not free()d, and a
reparse would probably not pick up on the fact that the parent commits are
already in memory.

Ciao,
Dscho

* Re: Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-08 18:34 UTC
To: Andreas Ericsson; +Cc: Shawn Pearce, Git Mailing List

On 12/7/06, Andreas Ericsson <ae@op5.se> wrote:
> Shawn Pearce wrote:
> >
> > Perhaps there is some fast IPC API supported by Qt that you could
> > use to run the revision listing outside of the main UI process,
> > to eliminate the bottlenecks you are seeing and remove the problems
> > noted above? One that doesn't involve reading from a pipe I mean...
> >
>
> Why not just fork() + exec() and read from the filedescriptor? You can
> up the output buffer of the forked program to something suitable, which
> means the OS will cache it for you until you copy it to a buffer in qgit
> (i.e., read from the descriptor).
>

Please, what do you mean by "something suitable"? How can I redirect
the output to a memory buffer, or to a file that the OS will cache
*until* I've copied it? If I redirect to a 'normal' file, it will be
flushed by the OS after some time, normally a few seconds.

Could you please post links to examples/docs about this kind of
implementation?

Thanks

* Re: Fast access git-rev-list output: some OS knowledge required
From: Michael K. Edwards @ 2006-12-08 20:10 UTC
To: Marco Costalba; +Cc: Andreas Ericsson, Shawn Pearce, Git Mailing List

There is a very handy solution to this problem called "tmpfs". It
should already be mounted at /tmp. Put tmp.txt there and your problem
will go away.

Cheers,
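
[Editorial sketch] A scratch file on a tmpfs mount can be created from
C roughly as follows. The assumptions are loud ones: the directory
passed in is actually backed by tmpfs (that depends on the system and
is not guaranteed for /tmp), and open_scratch_file() and the
"revlist-" name prefix are hypothetical, chosen for illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a scratch file in `dir` (for example a
 * tmpfs mount) and unlink it right away.  The returned descriptor
 * stays usable until close(); the name is gone, so nothing is left
 * behind if the program crashes.  Returns -1 on error.
 */
int open_scratch_file(const char *dir)
{
    char path[4096];
    int fd;

    if (snprintf(path, sizeof(path), "%s/revlist-XXXXXX", dir)
        >= (int)sizeof(path))
        return -1;  /* directory name too long */
    fd = mkstemp(path);
    if (fd >= 0)
        unlink(path);
    return fd;
}
```

Because the file has no name after unlink(), a child process cannot be
pointed at it with a shell redirect. It would have to inherit the
descriptor as its stdout, and the parent would lseek() back to offset
0 once the child has finished before reading the data.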

* Re: Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-09 12:15 UTC
To: Michael K. Edwards; +Cc: Andreas Ericsson, Shawn Pearce, Git Mailing List

On 12/8/06, Michael K. Edwards <medwards.linux@gmail.com> wrote:
> There is a very handy solution to this problem called "tmpfs". It
> should already be mounted at /tmp. Put tmp.txt there and your problem
> will go away.
>

Thanks Michael,

It seems to work! Patch pushed.

Marco

P.S: I've looked again at Shawn's idea (and code) of linking qgit
against libgit.a, but I found these two difficult points:

- traverse_commit_list(&revs, show_commit, show_object) is blocking,
  i.e. the GUI will stop responding for a few seconds while traversing
  the list. This is easily and transparently solved by the OS scheduler
  if an external process is used for git-rev-list. To solve this in
  qgit I have two ways:

  1) call QEventLoop() once in a while from inside show_commit()/
     show_object() to process pending events
  2) use a separate thread (QThread class).

  The first idea is not nice; the second opens a whole new set of
  problems and requires a large amount of non-trivial new code.

- traverse_commit_list() has internal state, so it is not re-entrant.
  git-rev-list is used to load the main view data but also the file
  history in another tab, and the two calls _could_ be run
  concurrently. With an external process I simply run two instances of
  the DataLoader class and consequently two external git-rev-list
  processes.

* Re: Fast access git-rev-list output: some OS knowledge required
From: Johannes Schindelin @ 2006-12-06 23:27 UTC
To: Shawn Pearce; +Cc: Marco Costalba, Git Mailing List

Hi,

On Wed, 6 Dec 2006, Shawn Pearce wrote:

> Marco Costalba <mcostalba@gmail.com> wrote:
> > The time it takes to read, without processing, the whole Linux tree
> > with this approach it's almost _double_ of the time it takes 'git
> > rev-list' to write to a file:
> >
> > 3) Other suggestions? ;-)
>
> The revision listing machinery is fairly well isolated behind some
> pretty clean APIs in Git. Why not link qgit against libgit.a and just
> do the revision listing in process?

Because, depending on what you do, the revision machinery is not
reentrable. For example, if you filter by filename, the history is
rewritten in-memory to simulate a history where just that filename was
tracked, and nothing else. These changes are not cleaned up after calling
the internal revision machinery.

Hth,
Dscho

* Re: Fast access git-rev-list output: some OS knowledge required
From: Linus Torvalds @ 2006-12-07 0:47 UTC
To: Johannes Schindelin; +Cc: Shawn Pearce, Marco Costalba, Git Mailing List

On Thu, 7 Dec 2006, Johannes Schindelin wrote:
>
> Because, depending on what you do, the revision machinery is not
> reentrable. For example, if you filter by filename, the history is
> rewritten in-memory to simulate a history where just that filename was
> tracked, and nothing else. These changes are not cleaned up after calling
> the internal revision machinery.

Well, it really wouldn't be that hard to add a new library interface to
"reset object state". We could fairly trivially either:

 - walk all objects in the object hashes and clear all the flags.

 - just clear all objects _and_ the hashes.

Yes, it implies a small amount of manual "management", but considering
that the reason it needs to be manual is that the functions simply _need_
the state, is that such a big deal?
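
[Editorial sketch] The first option Linus lists, walking every object
in the hash and clearing its flags, can be pictured with a toy hash
table. Nothing below is git's actual code or data layout: struct
toy_object, toy_hash, toy_insert() and toy_clear_all_flags() are all
hypothetical names used only to show the shape of such a reset.

```c
#include <stddef.h>

#define TOY_HASH_SIZE 64

/* Toy stand-in for an object table; all names here are hypothetical. */
struct toy_object {
    unsigned flags;           /* per-walk marks set during a traversal */
    struct toy_object *next;  /* hash-bucket chain */
};

struct toy_object *toy_hash[TOY_HASH_SIZE];

/* Insert an object into the bucket chosen by `key`. */
void toy_insert(struct toy_object *obj, unsigned key)
{
    unsigned slot = key % TOY_HASH_SIZE;

    obj->next = toy_hash[slot];
    toy_hash[slot] = obj;
}

/*
 * Walk every bucket and clear all flags.
 * Returns how many objects were visited.
 */
size_t toy_clear_all_flags(void)
{
    size_t visited = 0;

    for (size_t i = 0; i < TOY_HASH_SIZE; i++)
        for (struct toy_object *o = toy_hash[i]; o; o = o->next) {
            o->flags = 0;
            visited++;
        }
    return visited;
}
```

A reset like this only clears marks. As Johannes points out elsewhere
in the thread, state such as rewritten parent lists would need
separate handling.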

* Re: Fast access git-rev-list output: some OS knowledge required
From: Marco Costalba @ 2006-12-07 6:46 UTC
To: Linus Torvalds; +Cc: Johannes Schindelin, Shawn Pearce, Git Mailing List

On 12/7/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Thu, 7 Dec 2006, Johannes Schindelin wrote:
> >
> > Because, depending on what you do, the revision machinery is not
> > reentrable. For example, if you filter by filename, the history is
> > rewritten in-memory to simulate a history where just that filename was
> > tracked, and nothing else. These changes are not cleaned up after calling
> > the internal revision machinery.
>
> Well, it really wouldn't be that hard to add a new library interface to
> "reset object state". We could fairly trivially either:
>

So the library approach sounds like the best? Of course, in this case
the producer (git-rev-list) and the receiver use the same address
space. In the case of a temporary file, data is first copied to the OS
disk cache buffers and then again to userspace, into the qgit address
space.

But the real pain is that the temporary file is always flushed to disk
4-5 seconds after creation, even under heavy read/write activity. This
is a problem for big repos. I really don't know how to work around
this useless disk flush.

Finally, what about using some kind of shared memory at run time,
instead of _sharing_ developer libraries ;-) ? Is it too messy?

Concurrent reading while writing is probably possible without
synchronization if the reader understands that a sequence of _two_ or
more \0 means the end of the current write stream if the producer is
still running, or the end of data if the producer is not running
anymore.

I use a similar approach in the 'temporary file' patch, where the
receiver is able to read while the producer writes without explicit
synchronization. In that case a read() of a block smaller than the
maximum, with the producer still running, is used as the 'break'
condition in the receiver's while loop.
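
[Editorial sketch] The short-read 'break' condition Marco describes
might be written in C along these lines. This is a guess at the shape
of the loop, not the code from the qgit patch: struct bytes,
read_until_short(), load_while_writing(), the 64KB block size and the
10 ms polling pause are all assumptions. The reader is also assumed to
use its own descriptor for the file, so that it does not share a file
offset with the producer.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

enum { READ_BLOCK = 65536 };

/* Growable byte buffer; hypothetical, for illustration only. */
struct bytes {
    char *data;
    size_t len, cap;
};

/*
 * Read full blocks from fd until a short read, which means the reader
 * has caught up with whatever has been written so far.
 * Returns 0 on success, -1 on error.
 */
int read_until_short(int fd, struct bytes *out)
{
    for (;;) {
        ssize_t n;

        if (out->cap - out->len < READ_BLOCK) {
            char *tmp = realloc(out->data, out->cap + 16 * READ_BLOCK);
            if (!tmp)
                return -1;
            out->data = tmp;
            out->cap += 16 * READ_BLOCK;
        }
        n = read(fd, out->data + out->len, READ_BLOCK);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        out->len += (size_t)n;
        if (n < READ_BLOCK)
            return 0;  /* the 'break' condition: short block */
    }
}

/*
 * Keep draining fd while the producer process is alive.  The exit
 * check happens before each drain, so once the producer is seen to
 * have exited, the drain that follows picks up the tail of its output.
 */
int load_while_writing(int fd, pid_t producer, struct bytes *out)
{
    struct timespec delay = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;

    for (;;) {
        pid_t done = waitpid(producer, &status, WNOHANG);

        if (done < 0)
            return -1;
        if (read_until_short(fd, out) < 0)
            return -1;
        if (done == producer)
            return 0;
        nanosleep(&delay, NULL);
    }
}
```

Checking for the producer's exit before draining, rather than after,
is what makes the final pass safe: any data written before the exit is
already visible to the read that follows.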