From: fork0@t-online.de (Alex Riesen)
To: Shawn Pearce <spearce@spearce.org>
Cc: Jakub Narebski <jnareb@gmail.com>,
Junio C Hamano <junkio@cox.net>,
git@vger.kernel.org
Subject: Re: win2k/cygwin cannot handle even moderately sized packs
Date: Wed, 8 Nov 2006 22:33:14 +0100
Message-ID: <20061108213314.GA4437@steel.home>
In-Reply-To: <20061108171131.GA13487@spearce.org>
Shawn Pearce, Wed, Nov 08, 2006 18:11:31 +0100:
> > >All true. However what happens when the header spans two windows?
> > >Let's say I have the first 4 MiB mapped and the next 4 MiB mapped in
> > >a different window; these are not necessarily at the same locations
> > >within memory. Now if an object header is split over these two
> > >then some bytes are at the end of the first window and the rest
> > >are at the start of the next window.
> >
> > Assuming these are adjacent windows, we can just increment counters on
> > all the touched pages (at least the two together) and return the pointer into
> > the lowest page. Otherwise - time for garbage collection (why produce the
> > garbage at all, btw?) and remap.
>
> They are adjacent in the pack file but not necessarily in virtual memory!
Oh, right! I don't know why I thought the mapped regions would be
contiguous in memory.
> The garbage creation is to account for the 2-4 windows required
> by most applications. Most of the time each window is unused;
> we really only have two windows in use during delta decompression,
> at all other times we really only have 1 window in use. The commit
> parsing applications don't keep the commit window in use when they
> go access a tree or a blob.
So they could actually call unuse_pack to unmap the window,
but the mapping is kept for caching reasons?
> Consequently we want the garbage there. Actually I shouldn't have
> used garbage: the correct term would be LRU managed cache. :-)
> When we need a new window and we would exceed our maximum limit
> (128 MiB in my implementation) we unmap the least recently used
> window which is not currently in use.
Yep, noticed that :) Just wondered why.
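The eviction rule quoted above (drop the least recently used window that is not currently in use once a limit would be exceeded) can be sketched roughly as follows. This is only an illustration under assumptions: `struct window_sketch`, `find_lru_victim` and `make_room` are hypothetical names, not the ones in the sliding window patch, and the actual munmap() call is left out so the logic stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical window record; field names are illustrative only. */
struct window_sketch {
	size_t len;              /* bytes this window has mapped */
	unsigned long last_used; /* logical clock value at last use */
	int inuse_cnt;           /* > 0 while a caller holds a pointer into it */
	int mapped;              /* 1 while the window holds a mapping */
};

/*
 * Pick the least recently used window that is mapped and not in use.
 * Returns its index, or -1 if every mapped window is still in use.
 */
static int find_lru_victim(const struct window_sketch *w, int n)
{
	int victim = -1;
	for (int i = 0; i < n; i++) {
		if (!w[i].mapped || w[i].inuse_cnt > 0)
			continue;
		if (victim < 0 || w[i].last_used < w[victim].last_used)
			victim = i;
	}
	return victim;
}

/*
 * Release windows until `need` more bytes fit under `limit`.
 * Returns the new mapped total. A real implementation would
 * munmap() the victim where this sketch only clears a flag.
 */
static size_t make_room(struct window_sketch *w, int n, size_t mapped_total,
			size_t need, size_t limit)
{
	while (mapped_total + need > limit) {
		int v = find_lru_victim(w, n);
		if (v < 0)
			break; /* nothing evictable; the caller has to cope */
		assert(mapped_total >= w[v].len);
		mapped_total -= w[v].len;
		w[v].mapped = 0;
	}
	return mapped_total;
}
```

Windows with a non-zero in-use count are skipped even if they are the oldest, which is what keeps the one or two windows needed during delta decompression alive.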
> I could be wrong. It may not matter. But I think it's crazy to
> unmap otherwise valid mappings just because 2 bytes are on the
> wrong side of an arbitrary boundary.
You're right, it would be unfortunate to remap too often.

use_pack always maps at least 20 bytes, if I understand in_window and
its use correctly. Actually, now that I have stared at it longer, I think
the interface I suggested does almost the same thing; it just lets the
caller configure (well, hint at) the number of bytes to be mapped in.
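For illustration, the "at least 20 bytes" guarantee could be expressed as a check like the one below. This is a sketch based on my reading of the discussion, not a copy of the patch: `struct win_sketch`, `MIN_TAIL` and `offset_in_window` are assumed names, and the value 20 is taken from the paragraph above.

```c
#include <stddef.h>

/* Bytes promised to be readable after any accepted offset (assumed: 20). */
#define MIN_TAIL 20

/* Hypothetical description of one mapped window. */
struct win_sketch {
	size_t offset; /* pack-file offset of the first mapped byte */
	size_t len;    /* number of bytes mapped */
};

/*
 * An offset only counts as "inside" the window if MIN_TAIL bytes
 * starting at that offset are mapped too. Offsets too close to the
 * end are rejected, so the caller has to find or map another window.
 */
static int offset_in_window(const struct win_sketch *win, size_t offset)
{
	return win->offset <= offset
		&& offset + MIN_TAIL <= win->offset + win->len;
}
```

With a check of this shape, a short object header can never straddle two windows from the caller's point of view: an offset in the last 19 bytes of a window is treated as a miss.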
I still can't let go of the idea of getting as much data as possible with
just one call to the sliding window code. Calling use_pack for every byte
just does not seem right.
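To make the "one call, as much data as possible" idea concrete, here is a rough sketch of an accessor that returns a pointer together with the number of contiguous bytes readable from it. Everything here is hypothetical (`struct mapped_win`, `window_bytes` and the `avail` out-parameter are made-up names), and it looks at a single, already mapped window only.

```c
#include <stddef.h>

/* Hypothetical view of one mapped window. */
struct mapped_win {
	const unsigned char *base; /* address of the first mapped byte */
	size_t offset;             /* pack-file offset of base[0] */
	size_t len;                /* number of bytes mapped */
};

/*
 * Return a pointer to pack offset `offset` inside `win` and store in
 * *avail how many contiguous bytes can be read from that pointer.
 * Returns NULL (with *avail set to 0) if the offset is not mapped.
 */
static const unsigned char *window_bytes(const struct mapped_win *win,
					 size_t offset, size_t *avail)
{
	if (offset < win->offset || offset - win->offset >= win->len) {
		*avail = 0;
		return NULL;
	}
	*avail = win->len - (offset - win->offset);
	return win->base + (offset - win->offset);
}
```

A caller could then consume up to `*avail` bytes in a loop and come back to the window code only when it runs out, instead of asking for every byte.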