From: "Alex Riesen" <raa.lkml@gmail.com>
To: "Shawn Pearce" <spearce@spearce.org>
Cc: "Jakub Narebski" <jnareb@gmail.com>,
    "Junio C Hamano" <junkio@cox.net>,
    git@vger.kernel.org
Subject: Re: win2k/cygwin cannot handle even moderately sized packs
Date: Wed, 8 Nov 2006 14:37:55 +0100
Message-ID: <81b0412b0611080537k1087be66x1a4a9686b43d7b46@mail.gmail.com>
In-Reply-To: <20061108051914.GB28498@spearce.org>
> > I couldn't help noticing that the interface to the packs data is
> > a bit complex:
> >
> > unsigned char *use_pack(struct packed_git *p,
> > struct pack_window **window,
> > unsigned long offset,
> > unsigned int *left);
> > void unuse_pack(struct pack_window **w);
> >
> > Or am I missing something very obvious, and something like this
> > is just not feasible for some reasons?
>
> The use counter. Every time someone asks for a pointer into the
> pack they need to lock that window into memory to prevent us from
> garbage collecting it by unmapping it to make room for another
> window that the application needs.
I think the counters can be kept in struct packed_git somewhere. Given mmap
granularity, and the fact that not all of the pack is used in the normal case
(and the granularity helps us in the worst case), the memory used up by the
page counters shouldn't be too much.
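
To put a rough number on that, here is a minimal sketch of per-page use
counters, written as a standalone structure rather than as part of the real
struct packed_git. The names page_counters, page_counters_init,
page_counters_bytes and page_counters_release are hypothetical and do not
exist in git; the page size is passed in as an assumption.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: one use counter per mmap-granularity page of a pack. */
struct page_counters {
	size_t page_size;     /* assumed mapping granularity, e.g. 4096 */
	size_t nr_pages;      /* pack size rounded up to whole pages */
	unsigned int *in_use; /* in_use[i] > 0 means page i must stay mapped */
};

/* Allocate zeroed counters for a pack of pack_size bytes; 0 on success. */
static int page_counters_init(struct page_counters *pc,
			      size_t pack_size, size_t page_size)
{
	assert(page_size > 0);
	pc->page_size = page_size;
	pc->nr_pages = (pack_size + page_size - 1) / page_size;
	pc->in_use = calloc(pc->nr_pages ? pc->nr_pages : 1,
			    sizeof(*pc->in_use));
	return pc->in_use ? 0 : -1;
}

/* Memory spent on the counters themselves. */
static size_t page_counters_bytes(const struct page_counters *pc)
{
	return pc->nr_pages * sizeof(*pc->in_use);
}

static void page_counters_release(struct page_counters *pc)
{
	free(pc->in_use);
	pc->in_use = NULL;
	pc->nr_pages = 0;
}
```

Under these assumptions a 256 MiB pack with 4 KiB pages needs 65536 counters,
which is 256 KiB if unsigned int is 4 bytes, or roughly 0.1% of the pack size.
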
> > I was almost about to move your code into unpack_object_header_gently,
> > but ... The header isn't that big, is it? It is variable in the pack,
> > but the implementation of the parser is at the moment restricted by
> > the type we use for object size (unsigned long for the particular
> > platform). For example:
>
> All true. However what happens when the header spans two windows?
> Let's say I have the first 4 MiB mapped and the next 4 MiB mapped in
> a different window; these are not necessarily at the same locations
> within memory. Now if an object header is split over these two
> then some bytes are at the end of the first window and the rest
> are at the start of the next window.
Assuming these are adjacent windows, we can just increment the counters on
all touched pages (at least the two together) and return a pointer into
the lowest page. Otherwise it is time for garbage collection (why produce the
garbage at all, btw?) and a remap.
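
The adjacent-window case could be sketched roughly as follows. This assumes
the touched pages are mapped contiguously starting at a single base address,
which is exactly the assumption made above and does not cover the remap path.
The names pinned_pack, pin_range, unpin_range and SKETCH_PAGE_SIZE are
hypothetical, not git's API.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size */

/* Hypothetical: a contiguously mapped pack with one counter per page. */
struct pinned_pack {
	unsigned char *base;  /* start of the contiguous mapping */
	size_t size;          /* number of mapped bytes */
	unsigned int *in_use; /* one use counter per page */
};

/*
 * Pin every page touched by [offset, offset + len) and return a pointer
 * to the first byte, which lies in the lowest touched page.
 * Returns NULL if the range is empty or not inside the mapping.
 */
static unsigned char *pin_range(struct pinned_pack *p,
				size_t offset, size_t len)
{
	size_t first, last, i;

	if (!len || offset >= p->size || len > p->size - offset)
		return NULL;
	first = offset / SKETCH_PAGE_SIZE;
	last = (offset + len - 1) / SKETCH_PAGE_SIZE;
	for (i = first; i <= last; i++)
		p->in_use[i]++;
	return p->base + offset;
}

/* Undo a successful pin_range() call made with the same arguments. */
static void unpin_range(struct pinned_pack *p, size_t offset, size_t len)
{
	size_t first = offset / SKETCH_PAGE_SIZE;
	size_t last = (offset + len - 1) / SKETCH_PAGE_SIZE;
	size_t i;

	for (i = first; i <= last; i++) {
		assert(p->in_use[i] > 0);
		p->in_use[i]--;
	}
}
```

A 10-byte header starting at offset 4090 would pin pages 0 and 1 and leave
page 2 alone; any page whose counter is zero is then a candidate for
unmapping.
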
> I can't just say "make sure we have at least X bytes available
> before starting to decode the header", as to do that in this case
> we'd have to unmap BOTH windows and remap a new one which keeps
> that very small header fully contiguous in memory. That's thrashing
> the VM page tables for really no benefit.
You can't mmap less than a page, can you? So it's actually never a small
portion, but at least 4k on x86.
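
To illustrate the granularity point: a request for a few bytes at an
arbitrary pack offset always turns into a mapping of whole pages. The sketch
below only does the arithmetic and does not call mmap itself. The names
map_request, page_align and system_page_size are hypothetical;
sysconf(_SC_PAGESIZE) is the POSIX way to query the page size, and the 4096
fallback is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical: what actually has to be mapped to expose a byte range. */
struct map_request {
	size_t map_offset; /* page-aligned file offset to map from */
	size_t map_len;    /* length to map, a multiple of the page size */
	size_t delta;      /* where the wanted bytes start inside the map */
};

/* Round [offset, offset + len) out to page boundaries. */
static struct map_request page_align(size_t offset, size_t len,
				     size_t page_size)
{
	struct map_request r;

	/* page_size must be a non-zero power of two */
	assert(page_size && !(page_size & (page_size - 1)));
	r.map_offset = offset & ~(page_size - 1);
	r.delta = offset - r.map_offset;
	r.map_len = (r.delta + len + page_size - 1) & ~(page_size - 1);
	return r;
}

/* Page size reported by the system, falling back to an assumed 4096. */
static size_t system_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	return sz > 0 ? (size_t)sz : 4096;
}
```

With 4 KiB pages, 10 bytes at offset 5000 still cost one full 4096-byte
mapping starting at offset 4096, and 10 bytes at offset 4090 cost two pages
because the range crosses a page boundary.
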
> > (BTW, current unpack_object_header_gently does not use its len
> > argument to check if there actually is enough data to hold at least
> > minimal header. Is the size of mapped data checked for correctness
> > somewhere before?)
>
> Yes. Somewhere. I think we make sure there's at least 20 bytes
> in the pack remaining before we start to decode a header. We must
> have at least 20 as that's the trailing SHA1 checksum of the entire
> pack. :-)