From: A Large Angry SCM <gitzilla@gmail.com>
To: Nicolas Pitre <nico@cam.org>
Cc: Junio C Hamano <gitster@pobox.com>, git@vger.kernel.org
Subject: Re: [PATCH 6/7] pack-objects: allow for early delta deflating
Date: Fri, 02 May 2008 18:44:51 -0400
Message-ID: <481B9963.6050605@gmail.com>
In-Reply-To: <1209755511-7840-7-git-send-email-nico@cam.org>

Nicolas Pitre wrote:
> When the delta data is cached in memory until it is written to a pack
> file on disk, it is best to compress it right away in find_deltas() for
> the following reasons:
>
> - we have to compress that data anyway;
>
> - this allows for caching more deltas with the same cache size limit;
>
> - compression is potentially threaded.
>
> This last point is especially relevant for SMP run time. For example,
> repacking the Linux repo on a quad core processor using 4 threads with
> all default settings produces the following results before this change:
>
> real 2m27.929s
> user 4m36.492s
> sys 0m3.091s
>
> And with this change applied:
>
> real 2m13.787s
> user 4m37.486s
> sys 0m3.159s
>
> So the actual execution time stayed more or less the same but the
> wall clock time is shorter.
>
> This is however not a good thing to do when generating a pack for
> network transmission. In that case, the network is most likely to
> throttle the data throughput, so it is best to make find_deltas()
> faster in order to start writing data ASAP since we can afford
> spending more time between writes to compress the data
> at that point.
[...]
>
> +		/*
> +		 * If we decided to cache the delta data, then it is best
> +		 * to compress it right away.  First because we have to do
> +		 * it anyway, and doing it here while we're threaded will
> +		 * save a lot of time in the non threaded write phase,
> +		 * as well as allow for caching more deltas within
> +		 * the same cache size limit.
> +		 * ...
> +		 * But only if not writing to stdout, since in that case
> +		 * the network is most likely throttling writes anyway,
> +		 * and therefore it is best to go to the write phase ASAP
> +		 * instead, as we can afford spending more time compressing
> +		 * between writes at that moment.
> +		 */
> +		if (entry->delta_data && !pack_to_stdout) {
> +			entry->z_delta_size = do_compress(&entry->delta_data,
> +							  entry->delta_size);
> +			cache_lock();
> +			delta_cache_size -= entry->delta_size;
> +			delta_cache_size += entry->z_delta_size;
> +			cache_unlock();
> +		}
> +
>  		/* if we made n a delta, and if n is already at max
>  		 * depth, leaving it in the window is pointless.  we
>  		 * should evict it first.
Although I like the idea of changing the behavior when the output is
likely to be throttled, I do not like using "is it going to stdout" as
the test for that condition. Writing to stdout does not necessarily
mean writing to a network, so this is better suited to a command-line
argument.
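
A minimal sketch of that alternative, to make the suggestion concrete.
Everything here is illustrative: the option names --early-deflate and
--no-early-deflate, the struct, and both helper functions are
hypothetical and do not exist in pack-objects. Only --stdout is an
existing option. The idea is that an explicit argument wins, and the
stdout test from the patch is kept only as the default when nothing
was requested.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical option state; not part of builtin-pack-objects.c. */
struct deflate_opts {
	int pack_to_stdout;	/* set by the existing --stdout option */
	int early_deflate;	/* hypothetical: -1 unset, 0 off, 1 on */
};
#define DEFLATE_OPTS_INIT { 0, -1 }

/* Returns 1 if arg was recognized and recorded, 0 otherwise. */
static int parse_pack_arg(struct deflate_opts *o, const char *arg)
{
	assert(o && arg);
	if (!strcmp(arg, "--stdout")) {
		o->pack_to_stdout = 1;
		return 1;
	}
	if (!strcmp(arg, "--early-deflate")) {		/* hypothetical */
		o->early_deflate = 1;
		return 1;
	}
	if (!strcmp(arg, "--no-early-deflate")) {	/* hypothetical */
		o->early_deflate = 0;
		return 1;
	}
	return 0;
}

/*
 * Decide whether find_deltas() should compress cached deltas right
 * away.  An explicit request always wins; otherwise fall back to the
 * heuristic from the patch (deflate early unless writing to stdout).
 */
static int want_early_deflate(const struct deflate_opts *o)
{
	if (o->early_deflate >= 0)
		return o->early_deflate;
	return !o->pack_to_stdout;
}
```

With something along these lines, the condition in the patch would
read "entry->delta_data && want_early_deflate(&opts)" rather than
testing pack_to_stdout directly, and a caller piping the pack to a
local file could still ask for early deflating.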