* Sort of a feature proposal
From: David Kastrup @ 2008-05-07 14:48 UTC
To: git
Hi, I have some large git repositories on a USB drive (ext3 file
system). That means that when replugging the drive, the recorded st_dev
data in the index is off, meaning that the whole repo directory
structure gets reread as the stat data of all directories has changed.
That's a nuisance. Can't we have some heuristic or configuration option
where we, say, record the st_dev of the _index_ file, and if that has
changed, we propagate that change to the st_dev of its contents? I'd
like to see something that works more efficiently than rescanning the
whole disk every time I hibernate my computer.
Thanks,
--
David Kastrup
* Re: Sort of a feature proposal
From: Nicolas Pitre @ 2008-05-07 15:41 UTC
To: David Kastrup; +Cc: git
On Wed, 7 May 2008, David Kastrup wrote:
>
> Hi, I have some large git repositories on a USB drive (ext3 file
> system). That means that when replugging the drive, the recorded st_dev
> data in the index is off, meaning that the whole repo directory
> structure gets reread as the stat data of all directories has changed.
>
> That's a nuisance. Can't we have some heuristic or configuration option
> where we, say, record the st_dev of the _index_ file, and if that has
> changed, we propagate that change to the st_dev of its contents? I'd
> like to see something that works more efficiently than rescanning the
> whole disk every time I hibernate my computer.
Maybe simply ignoring st_dev is the solution? I can hardly see what
value it adds to the other stat fields.
Nicolas
* Re: Sort of a feature proposal
From: Stephen R. van den Berg @ 2008-05-07 16:00 UTC
To: Nicolas Pitre; +Cc: David Kastrup, git
Nicolas Pitre wrote:
>Maybe simply ignoring st_dev is the solution? I can hardly see what
>value it adds to the other stat fields.
It determines the scope of st_ino.
--
Sincerely, srb@cuci.nl
Stephen R. van den Berg.
Lady Astor: "Winston, if you were my husband, I'd put poison in your coffee."
Churchill: "Nancy, if you were my wife, I'd drink it."
* Re: Sort of a feature proposal
From: Avery Pennarun @ 2008-05-07 16:03 UTC
To: Nicolas Pitre; +Cc: David Kastrup, git
On 5/7/08, Nicolas Pitre <nico@cam.org> wrote:
> On Wed, 7 May 2008, David Kastrup wrote:
> > Hi, I have some large git repositories on a USB drive (ext3 file
> > system). That means that when replugging the drive, the recorded st_dev
> > data in the index is off, meaning that the whole repo directory
> > structure gets reread as the stat data of all directories has changed.
> >
> > That's a nuisance. Can't we have some heuristic or configuration option
> > where we, say, record the st_dev of the _index_ file, and if that has
> > changed, we propagate that change to the st_dev of its contents? I'd
> > like to see something that works more efficiently than rescanning the
> > whole disk every time I hibernate my computer.
>
> Maybe simply ignoring st_dev is the solution? I can hardly see what
> value it adds to the other stat fields.
If I understand correctly, you can be sure a file hasn't changed if it
has exactly the same (dev,inode,ctime,length) attributes. If you
don't track the dev, you can't be certain whether file attributes look
identical but it was actually on another disk, and therefore might
have different content after all.
It's obviously a pretty rare case, but nobody likes a version control
system that works properly "almost" all the time :)
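The comparison described above can be sketched roughly as follows. This is an illustration in Python, not git's actual C code; the names StatKey, stat_key, and looks_unchanged are hypothetical, and the use_dev switch only mimics the idea of a build-time option for including the device number.

```python
import os
from typing import NamedTuple


class StatKey(NamedTuple):
    """Hypothetical cached stat data for one tracked file."""
    dev: int
    ino: int
    ctime_ns: int
    size: int


def stat_key(path, use_dev=True):
    """Collect the (dev, inode, ctime, length) tuple for path.

    With use_dev=False the device number is recorded as 0, so it can
    never cause a mismatch.
    """
    st = os.lstat(path)
    dev = st.st_dev if use_dev else 0
    return StatKey(dev, st.st_ino, st.st_ctime_ns, st.st_size)


def looks_unchanged(cached, path, use_dev=True):
    """Return True if the cached stat data still matches the file on disk."""
    if not use_dev:
        # Ignore whatever device number was cached earlier.
        cached = cached._replace(dev=0)
    return cached == stat_key(path, use_dev)
```

With use_dev=True, a cached entry whose device number differs from the current one is treated as changed even when every other field matches, which is the cost being discussed in this thread.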
Have fun,
Avery
* Re: Sort of a feature proposal
From: Linus Torvalds @ 2008-05-07 16:25 UTC
To: David Kastrup; +Cc: git
On Wed, 7 May 2008, David Kastrup wrote:
>
> Hi, I have some large git repositories on a USB drive (ext3 file
> system). That means that when replugging the drive, the recorded st_dev
> data in the index is off, meaning that the whole repo directory
> structure gets reread as the stat data of all directories has changed.
>
> That's a nuisance. Can't we have some heuristic or configuration option
> where we, say, record the st_dev of the _index_ file, and if that has
> changed, we propagate that change to the st_dev of its contents? I'd
> like to see something that works more efficiently than rescanning the
> whole disk every time I hibernate my computer.
Hmm. We shouldn't even be using st_dev any more.
How did you compile your git version? By default USE_STDEV should be off,
and it's been that way for a long time (because st_dev is also not
reliable on NFS etc).
Linus
* Re: Sort of a feature proposal
From: David Kastrup @ 2008-05-07 17:39 UTC
To: git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Wed, 7 May 2008, David Kastrup wrote:
>>
>> Hi, I have some large git repositories on a USB drive (ext3 file
>> system). That means that when replugging the drive, the recorded st_dev
>> data in the index is off, meaning that the whole repo directory
>> structure gets reread as the stat data of all directories has changed.
>>
>> That's a nuisance. Can't we have some heuristic or configuration option
>> where we, say, record the st_dev of the _index_ file, and if that has
>> changed, we propagate that change to the st_dev of its contents? I'd
>> like to see something that works more efficiently than rescanning the
>> whole disk every time I hibernate my computer.
>
> Hmm. We shouldn't even be using st_dev any more.
>
> How did you compile your git version? By default USE_STDEV should be off,
> and it's been that way for a long time (because st_dev is also not
> reliable on NFS etc).
Looks that way in my Makefile. Maybe I am confused: I just did some
timings (this is ext3 on a USB drive) and got
git svn rebase
Current branch master is up to date.
dak@lisa:/lisa/texlive$ time git svn rebase
Current branch master is up to date.
real 0m4.581s
user 0m2.244s
sys 0m1.492s
dak@lisa:/lisa/texlive$ cd
dak@lisa:~$ sudo umount /lisa;sudo mount /dev/mapper/Medion-reps /lisa;cd /lisa/texlive;time git svn rebase
Current branch master is up to date.
real 0m53.588s
user 0m2.248s
sys 0m2.388s
dak@lisa:/lisa/texlive$ cd;sudo umount /lisa;sudo dmsetup remove /dev/mapper/Medion-reps
[Unplug and replug the USB drive]
dak@lisa:~$ sudo mount /dev/mapper/Medion-reps /lisa;cd /lisa/texlive;time git svn rebase
Current branch master is up to date.
real 0m53.101s
user 0m2.324s
sys 0m2.380s
dak@lisa:/lisa/texlive$
If my guess that the device number of LVM does not change when merely
un- and remounting, but does change when unplugging and replugging is
correct, it would appear that my idea where the time went was wrong and
that the device number has nothing whatsoever to do with the large
amount of lookups (this is a USB2.0 device at High Speed).
Is there a way to completely invalidate the disk cache without
unmounting? How do I verify device numbers?
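One way to check the device number is to read st_dev for a path on the mounted file system before and after replugging. The sketch below uses Python's standard os module; the helper name device_of is hypothetical, and the path in the usage note is only the one from the transcript above.

```python
import os


def device_of(path):
    """Return (st_dev, major, minor) for the file system holding path."""
    st = os.stat(path)
    # os.major/os.minor split the raw device number into its two parts.
    return st.st_dev, os.major(st.st_dev), os.minor(st.st_dev)


if __name__ == "__main__":
    dev, major, minor = device_of(".")
    print(f"st_dev={dev} (major={major}, minor={minor})")
```

Calling device_of("/lisa/texlive") once before unmounting and once after remounting would show whether the number really changes between the two cases.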
Thanks,
--
David Kastrup
* Re: Sort of a feature proposal
From: Dmitry Potapov @ 2008-05-07 17:50 UTC
To: David Kastrup; +Cc: git
> Is there a way to completely invalidate the disk cache without
> unmounting?
echo 3 > /proc/sys/vm/drop_caches
Dmitry
* Re: Sort of a feature proposal
From: David Kastrup @ 2008-05-07 18:05 UTC
To: git
"Dmitry Potapov" <dpotapov@gmail.com> writes:
>> Is there a way to completely invalidate the disk cache without
>> unmounting?
>
> echo 3 > /proc/sys/vm/drop_caches
Sigh. It is the disk cache after all. Looks like "git svn rebase"
can't just work from the index (ok, it checks whether there are
unstashed modifications). And indeed, flushing the cache will make much
less of a difference for "git svn fetch" than for rebase.
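The cold-cache effect can be measured without involving git at all. The following is a rough sketch, assuming that the expensive part is one lstat() per entry in the working tree; it does not reproduce what "git svn rebase" actually does, and time_lstat_walk is a hypothetical helper name.

```python
import os
import time


def time_lstat_walk(root):
    """lstat() every entry below root; return (entry_count, seconds)."""
    start = time.perf_counter()
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count, time.perf_counter() - start


if __name__ == "__main__":
    for label in ("first run", "second run"):
        count, seconds = time_lstat_walk(".")
        print(f"{label}: {count} entries in {seconds:.3f}s")
```

Running it once right after dropping the caches (as root, with the echo command quoted above) and then a second time should show the cold run taking much longer than the warm one if the disk cache is the cause.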
Sorry for the noise.
--
David Kastrup