From: fork0@t-online.de (Alex Riesen)
To: Shawn Pearce <spearce@spearce.org>
Cc: Jakub Narebski <jnareb@gmail.com>,
	Junio C Hamano <junkio@cox.net>,
	git@vger.kernel.org
Subject: Re: win2k/cygwin cannot handle even moderately sized packs
Date: Wed, 8 Nov 2006 22:33:14 +0100
Message-ID: <20061108213314.GA4437@steel.home>
In-Reply-To: <20061108171131.GA13487@spearce.org>

Shawn Pearce, Wed, Nov 08, 2006 18:11:31 +0100:
> > >All true.  However what happens when the header spans two windows?
> > >Let's say I have the first 4 MiB mapped and the next 4 MiB mapped in
> > >a different window; these are not necessarily at the same locations
> > >within memory.  Now if an object header is split over these two
> > >then some bytes are at the end of the first window and the rest
> > >are at the start of the next window.
> > 
> > Assuming these are adjacent windows, we can just increment counters on
> > all the touched pages (at least the two together) and return the pointer into
> > the lowest page. Otherwise - time for garbage collection (why produce the
> > garbage at all, btw?) and remap.
> 
> They are adjacent in the pack file but not necessarily in virtual memory!

Oh, right! I don't know why I thought the mapped regions would be
contiguous in memory.

> The garbage creation is to account for the 2-4 windows required
> by most applications.  Most of the time each window is unused;
> we really only have two windows in use during delta decompression,
> at all other times we really only have 1 window in use.  The commit
> parsing applications don't keep the commit window in use when they
> go access a tree or a blob.

So they actually can call unuse_pack to unmap the window,
but it's kept for caching reasons?

> Consequently we want the garbage there.  Actually I shouldn't have
> used garbage: the correct term would be LRU managed cache.  :-)
> When we need a new window and we would exceed our maximum limit
> (128 MiB in my implementation) we unmap the least recently used
> window which is not currently in use.

Yep, noticed that :) Just wondered why.
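
To make the LRU behaviour described above concrete, here is a minimal
sketch of the bookkeeping only. It makes no mmap() calls, and every name
in it (struct window_cache, window_get, window_put, evict_lru,
MAX_WINDOWS) is invented for illustration; it is not the code from the
patch under discussion.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WINDOWS 8   /* hypothetical slot count, for illustration only */

struct window {
    size_t offset;            /* start of the window within the pack file */
    size_t len;               /* number of bytes the window covers */
    unsigned inuse_cnt;       /* nonzero while a caller holds a pointer into it */
    unsigned long last_used;  /* logical clock value of the last access */
    int mapped;               /* nonzero if this slot holds a live window */
};

struct window_cache {
    struct window slot[MAX_WINDOWS];
    size_t mapped_bytes;      /* sum of len over all live windows */
    size_t limit;             /* cap on mapped_bytes */
    unsigned long clock;      /* bumped on every successful window_get() */
};

/* Drop the least recently used window that nobody is using.
 * Returns 1 if a window was dropped, 0 if every live window is in use. */
static int evict_lru(struct window_cache *c)
{
    struct window *victim = NULL;

    for (int i = 0; i < MAX_WINDOWS; i++) {
        struct window *w = &c->slot[i];
        if (!w->mapped || w->inuse_cnt)
            continue;
        if (!victim || w->last_used < victim->last_used)
            victim = w;
    }
    if (!victim)
        return 0;
    c->mapped_bytes -= victim->len;
    victim->mapped = 0;   /* a real implementation would munmap() here */
    return 1;
}

static struct window *find_free_slot(struct window_cache *c)
{
    for (int i = 0; i < MAX_WINDOWS; i++)
        if (!c->slot[i].mapped)
            return &c->slot[i];
    return NULL;
}

/* Return a window covering [offset, offset + len) and mark it in use.
 * Reuses a cached window when one fits; returns NULL if no room can be made. */
static struct window *window_get(struct window_cache *c, size_t offset, size_t len)
{
    struct window *w;

    for (int i = 0; i < MAX_WINDOWS; i++) {
        w = &c->slot[i];
        if (w->mapped && w->offset <= offset &&
            offset + len <= w->offset + w->len) {
            w->inuse_cnt++;
            w->last_used = ++c->clock;
            return w;
        }
    }

    /* Make room under the byte limit first, then make sure a slot is free. */
    while (c->mapped_bytes + len > c->limit)
        if (!evict_lru(c))
            return NULL;
    w = find_free_slot(c);
    if (!w) {
        if (!evict_lru(c))
            return NULL;
        w = find_free_slot(c);
        assert(w != NULL);
    }

    w->offset = offset;   /* a real implementation would mmap() here */
    w->len = len;
    w->mapped = 1;
    w->inuse_cnt = 1;
    w->last_used = ++c->clock;
    c->mapped_bytes += len;
    return w;
}

/* Release a window. It stays cached until evict_lru() picks it. */
static void window_put(struct window *w)
{
    assert(w->inuse_cnt > 0);
    w->inuse_cnt--;
}
```

The point of the sketch is that releasing a window only drops its use
count; the mapping itself survives until the limit forces an eviction.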

> I could be wrong.  It may not matter.  But I think it's crazy to
> unmap otherwise valid mappings just because 2 bytes are on the
> wrong side of an arbitrary boundary.

You're right, it would be unfortunate to remap too often.

use_pack always maps at least 20 bytes, if I understand in_window and
its use correctly. Actually, now that I've stared at it longer, I think
the interface I suggested does almost the same thing; it just lets the
caller configure (well, hint at) the number of bytes to be mapped in.
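
For reference, the "at least 20 bytes" reading amounts to a check of
roughly the following shape. This is a guess at the logic, not the code
from the patch: struct win, MIN_AVAIL and in_window_sketch are
hypothetical names, and the value 20 is taken from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_AVAIL 20   /* assumed guarantee: bytes readable past the offset */

struct win {
    size_t offset;     /* start of the window within the pack file */
    size_t len;        /* number of bytes the window covers */
};

/* An offset only counts as "inside" the window if MIN_AVAIL bytes
 * starting at it are also inside. */
static int in_window_sketch(const struct win *w, size_t offset)
{
    assert(w != NULL);
    return w->offset <= offset &&
           offset + MIN_AVAIL <= w->offset + w->len;
}
```

Under this assumption, an offset in the last 19 bytes of a window is
treated as a miss, so a different window has to serve it.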

I still can't let go of the idea of getting as much data as possible
with just one call to the sliding window code. Calling use_pack for
every byte just does not seem right.
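
A minimal sketch of that idea, assuming a lookup that also reports how
many contiguous bytes are valid behind the returned pointer. The names
fake_pack, use_window and parse_header are hypothetical, and the pack is
faked with an in-memory buffer cut into tiny windows. The header layout
used (type in bits 4-6 of the first byte, size continued in 7-bit groups
while the top bit is set) is the pack object header as I understand it;
treat it as an assumption too.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WINDOW_SIZE 4   /* tiny on purpose, so headers straddle windows */

struct fake_pack {
    const unsigned char *data;
    size_t len;
    unsigned calls;     /* number of successful window lookups */
};

/* Hypothetical stand-in for the sliding window lookup: returns a pointer
 * to the byte at offset and stores in *avail how many contiguous bytes
 * are valid from there to the end of that window. */
static const unsigned char *use_window(struct fake_pack *p, size_t offset,
                                       size_t *avail)
{
    size_t window_end = (offset / WINDOW_SIZE + 1) * WINDOW_SIZE;

    if (offset >= p->len) {
        *avail = 0;
        return NULL;
    }
    if (window_end > p->len)
        window_end = p->len;
    *avail = window_end - offset;
    p->calls++;
    return p->data + offset;
}

/* Parse an object header starting at offset. Returns the number of bytes
 * consumed, or 0 if the header is truncated or too long. The window code
 * is called again only when the current window runs out of bytes. */
static size_t parse_header(struct fake_pack *p, size_t offset,
                           unsigned *type, unsigned long *size)
{
    size_t used = 0, avail = 0;
    unsigned shift = 4;
    unsigned char c;
    const unsigned char *in;

    assert(p && type && size);
    in = use_window(p, offset, &avail);
    if (!in)
        return 0;
    c = *in++; avail--; used++;
    *type = (c >> 4) & 7;
    *size = c & 15;

    while (c & 0x80) {
        if (shift >= sizeof(*size) * CHAR_BIT)
            return 0;               /* malformed: size does not fit */
        if (!avail) {               /* crossed into the next window */
            in = use_window(p, offset + used, &avail);
            if (!in)
                return 0;
        }
        c = *in++; avail--; used++;
        *size |= (unsigned long)(c & 0x7f) << shift;
        shift += 7;
    }
    return used;
}
```

With a three-byte header whose first byte is the last byte of one window,
this makes two lookups instead of three, and the caller never assumes the
two windows are adjacent in memory.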
