From: Shawn Pearce <spearce@spearce.org>
To: Alex Riesen <raa.lkml@gmail.com>
Cc: Jakub Narebski <jnareb@gmail.com>,
Junio C Hamano <junkio@cox.net>,
git@vger.kernel.org
Subject: Re: win2k/cygwin cannot handle even moderately sized packs
Date: Wed, 8 Nov 2006 17:28:37 -0500
Message-ID: <20061108222837.GA14446@spearce.org>
In-Reply-To: <20061108213314.GA4437@steel.home>
Alex Riesen <fork0@t-online.de> wrote:
> Shawn Pearce, Wed, Nov 08, 2006 18:11:31 +0100:
> > The garbage creation is to account for the 2-4 windows required
> > by most applications. Most of the time each window is unused;
> > we really only have two windows in use during delta decompression,
> > at all other times we really only have 1 window in use. The commit
> > parsing applications don't keep the commit window in use when they
> > go access a tree or a blob.
>
> So they actually can call unuse_pack to unmap the window,
> but it's kept for caching reasons?
Actually very few parts of the code even know about the windows.
Really the only parts that know it are the ones that directly
access the pack file, which is mostly restricted to sha1_file.c.
Since all access goes through the more public interfaces, what
you find is that the application code never keeps the window.
We are always doing use_pack/unuse_pack on every object access,
so the window is almost never in use. If we didn't hang onto
it in an LRU we would be in a world of hurt performance-wise.
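
To make the caching argument concrete, here is a minimal sketch of an LRU of windows with in-use counts. It is not the sha1_file.c code: toy_use, toy_unuse, struct toy_window and the tiny sizes are hypothetical names and values chosen for illustration, and no real mmap is performed. It only shows why a window released by one caller can still serve the next caller without a remap.

```c
#include <assert.h>
#include <stddef.h>

#define WINDOW_SIZE 64  /* toy value; a real window would be far larger */
#define MAX_WINDOWS 2   /* toy value; keeps the eviction path easy to hit */

struct toy_window {
	size_t base;             /* pack offset of the first byte covered */
	unsigned inuse;          /* callers currently holding the window */
	unsigned long last_used; /* LRU clock value at the last access */
	int valid;               /* slot currently holds a mapping */
};

static struct toy_window windows[MAX_WINDOWS];
static unsigned long lru_clock;
static unsigned remap_count; /* how often a new mapping was needed */

/* Return a window covering offset, reusing a cached one when possible. */
static struct toy_window *toy_use(size_t offset)
{
	struct toy_window *victim = NULL;
	int i;

	for (i = 0; i < MAX_WINDOWS; i++) {
		struct toy_window *w = &windows[i];
		if (w->valid && offset >= w->base &&
		    offset - w->base < WINDOW_SIZE) {
			w->inuse++;
			w->last_used = ++lru_clock;
			return w; /* cache hit: no remap */
		}
	}

	/* Miss: take an empty slot, else the least recently used idle one. */
	for (i = 0; i < MAX_WINDOWS; i++) {
		struct toy_window *w = &windows[i];
		if (!w->valid) {
			victim = w;
			break;
		}
		if (w->inuse)
			continue; /* pinned by a caller, cannot evict */
		if (!victim || w->last_used < victim->last_used)
			victim = w;
	}
	if (!victim)
		return NULL; /* every window is pinned */

	victim->base = offset - offset % WINDOW_SIZE;
	victim->valid = 1;
	victim->inuse = 1;
	victim->last_used = ++lru_clock;
	remap_count++; /* stands in for the munmap/mmap pair */
	return victim;
}

/* Release a window; it stays cached for whoever asks next. */
static void toy_unuse(struct toy_window *w)
{
	assert(w->inuse > 0);
	w->inuse--;
}
```

With this shape, a caller that does toy_use/toy_unuse around every object access pays for a remap only when the offset falls outside every cached window.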
> > I could be wrong. It may not matter. But I think it's crazy to
> > unmap otherwise valid mappings just because 2 bytes are on the
> > wrong side of an arbitrary boundary.
>
> You're right, would be unfortunate to remap too often.
>
> use_pack always maps at least 20 bytes, if I understand in_window and
> its use correctly. Actually, now I'm staring at it longer, I think the
> interface I suggested does almost the same, just allows to configure
> (well, hint at) the amount of bytes to be mapped in.
True; but if you look nobody wants more than 20 bytes. They either
want <20 for the object header or 20 for the base object id in
a delta. Otherwise they are shoving the data into zlib which
doesn't care. No need to configure it, just shove it in.
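
As an illustration of why the object header read is so small, here is a sketch of decoding a pack entry header from a short byte buffer. It follows the pack format as I understand it (type in bits 6-4 of the first byte, size in the low 4 bits, then 7 more size bits per continuation byte); toy_parse_header is a hypothetical name, not a function from the git source.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Decode the type and inflated size of a pack entry from buf.
 * Returns the number of header bytes consumed, or 0 if the header
 * is truncated or the size does not fit in an unsigned long.
 */
static size_t toy_parse_header(const unsigned char *buf, size_t len,
			       unsigned *type, unsigned long *size)
{
	size_t used = 0;
	unsigned shift = 4;
	unsigned char c;

	assert(buf && type && size);
	if (len == 0)
		return 0;

	c = buf[used++];
	*type = (c >> 4) & 7; /* bits 6-4: object type */
	*size = c & 15;       /* bits 3-0: low bits of the size */

	while (c & 0x80) {    /* bit 7 set: another size byte follows */
		if (used >= len || shift >= sizeof(*size) * CHAR_BIT)
			return 0;
		c = buf[used++];
		*size |= (unsigned long)(c & 0x7f) << shift;
		shift += 7;
	}
	return used;
}
```

The caller only ever needs a handful of bytes for this, which is why a fixed small read is enough and no size hint has to be configured.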
> I still can't let go of the idea to get as much data as possible with
> just one call to sliding window code. Calling use_pack for every byte
> just does not seem right.
True. But the only other idea I have is to copy the data into a
buffer for the caller, which we would use only for the header
section, since it's small. We already copy the delta base (20 bytes)
onto the stack during decompression, so we might as well copy the
header to decompress it. Then you can batch up the range checks to
at worst 2 range checks per header.
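
A rough sketch of that copy, under stated assumptions: the pack is modeled as an in-memory array cut into fixed windows, toy_window_at stands in for the window lookup (it is not use_pack), and WINDOW_SIZE and HEADER_MAX are made-up values. Because the copy is never longer than one window, it crosses at most one boundary and so needs at most two lookups.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WINDOW_SIZE 32 /* toy value; must be >= HEADER_MAX */
#define HEADER_MAX 20  /* largest copy a caller may request */

static unsigned range_checks; /* counts window lookups for illustration */

/*
 * Hypothetical window lookup: returns a pointer to offset and stores
 * in *avail how many bytes are readable before the window (or pack) ends.
 */
static const unsigned char *toy_window_at(const unsigned char *pack,
					  size_t pack_len, size_t offset,
					  size_t *avail)
{
	size_t window_end = (offset / WINDOW_SIZE + 1) * WINDOW_SIZE;

	range_checks++;
	if (offset >= pack_len) {
		*avail = 0;
		return NULL;
	}
	if (window_end > pack_len)
		window_end = pack_len;
	*avail = window_end - offset;
	return pack + offset;
}

/* Copy up to len bytes at offset into buf; returns the bytes copied. */
static size_t toy_copy_header(const unsigned char *pack, size_t pack_len,
			      size_t offset, unsigned char *buf, size_t len)
{
	size_t copied = 0;
	int pass;

	assert(len <= HEADER_MAX);
	/* len <= WINDOW_SIZE, so two passes always cover the request. */
	for (pass = 0; pass < 2 && copied < len; pass++) {
		size_t avail;
		const unsigned char *p =
			toy_window_at(pack, pack_len, offset + copied, &avail);
		if (!p)
			break; /* ran off the end of the pack */
		if (avail > len - copied)
			avail = len - copied;
		memcpy(buf + copied, p, avail);
		copied += avail;
	}
	return copied;
}
```

After the copy, the caller parses the header out of buf with no further window checks, instead of going back to the window code for every byte.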
--