From: "J.C. Pizarro" <jcpiza@gmail.com>
To: "Linus Torvalds" <torvalds@linux-foundation.org>,
"Andreas Ericsson" <ae@op5.se>
Cc: "David Miller" <davem@davemloft.net>,
"Nicolas Pitre" <nico@cam.org>,
jonsmirl@gmail.com, "Junio C Hamano" <gitster@pobox.com>,
gcc@gcc.gnu.org, git@vger.kernel.org
Subject: Re: Something is broken in repack. Why not with fork and pipes?
Date: Wed, 12 Dec 2007 19:47:14 +0100
Message-ID: <998d0e4a0712121047m3cb09f37qc3157b96e5d171e7@mail.gmail.com>
At http://gcc.gnu.org/ml/gcc/2007-12/msg00360.html, Andreas Ericsson
<ae@op5.se> wrote:
> If it's still an issue next week, we'll have a 16 core (8 dual-core cpu's)
> machine with some 32gb of ram in that'll be free for about two days.
> You'll have to remind me about it though, as I've got a lot on my mind
> these days.
>
>
> --
> Andreas Ericsson andreas.ericsson@op5.se
> OP5 AB www.op5.se
> Tel: +46 8-230225 Fax: +46 8-230231
It would be a good idea if that machine ran around the clock,
automatically repacking git repositories in real time and recomputing,
again and again, those unexplored deltas. :D
Somebody doing a "git clone" could then get a smaller repository than
one hour earlier. :D
-----------------------------------------------------------------
To Linus: why don't you drop the threaded implementation of your repo-pack?
Imagine a "buggy, bloated threading implementation designed to work well
only on HyperThreading Intel CPUs and 8 cores x 8 threads/core
Niagara SPARCs".
IMHO, on a multicore machine, a multiprocess implementation of repo-pack
performs better than a multithreaded one, although I don't have results to
show it. It has no issues with memory allocation across threads,
so single-threaded memory allocation stays simple and fast!
You can see "Why not with fork and pipes like in linux?" at
http://gcc.gnu.org/ml/gcc/2007-12/msg00203.html
http://gcc.gnu.org/ml/gcc/2007-12/msg00209.html
For an easy implementation, don't use threads, because of the complicated
race conditions between the threads of a multithreaded process.
Instead, deal only with the races between single-threaded processes, using
select/epoll only in the parent process. The KISS principle works.
The child processes share memory that is almost entirely read-only thanks
to COW (Copy On Write), so, before forking, the parent must build the large
plain C data structures that the children need. The children use pipes for
the more complex communication, and the parent spends most of its time
updating the results computed by the children.
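A minimal sketch of the fork-and-pipes scheme described above, assuming a
fixed number of children and a trivial stand-in for the real work. The names
parallel_sum, work_on_slice and NWORKERS are hypothetical and are not taken
from git. The table is built before fork(), so the children read it through
COW, and only the parent calls select().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NWORKERS 4  /* hypothetical number of child processes */

/* Hypothetical stand-in for the real per-object work: sums table[begin..end). */
static long work_on_slice(const int *table, size_t begin, size_t end)
{
    long sum = 0;
    for (size_t i = begin; i < end; i++)
        sum += table[i];
    return sum;
}

/* Forks NWORKERS single-threaded children over a table that already exists
 * in the parent, collects one long per child through its pipe, and stores
 * the sum in *total_out. Returns 0 on success, -1 on any failure. */
int parallel_sum(const int *table, size_t n, long *total_out)
{
    assert(table != NULL && total_out != NULL);
    int fds[NWORKERS];
    pid_t pids[NWORKERS];
    int started = 0, failed = 0;
    long total = 0;

    for (int w = 0; w < NWORKERS; w++) {
        int p[2];
        if (pipe(p) < 0) { failed = 1; break; }
        pid_t pid = fork();
        if (pid < 0) { close(p[0]); close(p[1]); failed = 1; break; }
        if (pid == 0) {                       /* child: compute, report, exit */
            close(p[0]);
            for (int k = 0; k < w; k++)       /* drop read ends inherited from */
                close(fds[k]);                /* the earlier children's pipes  */
            long r = work_on_slice(table, n * w / NWORKERS,
                                   n * (w + 1) / NWORKERS);
            ssize_t wr = write(p[1], &r, sizeof r);
            _exit(wr == (ssize_t)sizeof r ? 0 : 1);
        }
        close(p[1]);                          /* parent keeps only the read end */
        fds[w] = p[0];
        pids[w] = pid;
        started++;
    }

    int pending = started;
    while (pending > 0) {                     /* parent: wait on all pipes at once */
        fd_set rset;
        int maxfd = -1;
        FD_ZERO(&rset);
        for (int w = 0; w < started; w++) {
            if (fds[w] < 0) continue;
            FD_SET(fds[w], &rset);
            if (fds[w] > maxfd) maxfd = fds[w];
        }
        if (select(maxfd + 1, &rset, NULL, NULL, NULL) < 0) { failed = 1; break; }
        for (int w = 0; w < started; w++) {
            if (fds[w] < 0 || !FD_ISSET(fds[w], &rset)) continue;
            long r;
            if (read(fds[w], &r, sizeof r) == (ssize_t)sizeof r)
                total += r;
            else
                failed = 1;                   /* child closed without a result */
            close(fds[w]);
            fds[w] = -1;
            pending--;
        }
    }

    for (int w = 0; w < started; w++) {       /* reap every child that started */
        if (fds[w] >= 0) close(fds[w]);
        waitpid(pids[w], NULL, 0);
    }
    if (failed) return -1;
    *total_out = total;
    return 0;
}
```

Each child writes a single long, which is far below PIPE_BUF, so the parent
reads it in one piece. A real repacker would stream many results per child
and would also have to retry select() on EINTR, which this sketch leaves out.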
Another implementation: the children can perform locked loads and stores
to and from a single database on the filesystem, if holding all the data
in memory is a big problem.
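As a rough illustration of such a locked load-and-store, the sketch below
keeps a single long in a file and updates it under a POSIX advisory write
lock taken with fcntl(F_SETLKW). The function name locked_add and the
one-value file layout are assumptions made for the example; a real object
database would be far more involved.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Adds `delta` to the long stored at offset 0 of `path` (created if missing)
 * while holding an exclusive advisory lock on the whole file, and stores the
 * new value in *out. Returns 0 on success, -1 on error. */
int locked_add(const char *path, long delta, long *out)
{
    assert(path != NULL && out != NULL);
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    struct flock lk = {0};
    lk.l_type = F_WRLCK;      /* exclusive lock */
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;             /* 0 means "to end of file", i.e. the whole file */
    if (fcntl(fd, F_SETLKW, &lk) < 0) {   /* blocks until the lock is granted */
        close(fd);
        return -1;
    }

    long value = 0;
    ssize_t got = pread(fd, &value, sizeof value, 0);   /* load */
    if (got == 0) {
        value = 0;                        /* new, empty file: start from zero */
    } else if (got != (ssize_t)sizeof value) {
        close(fd);                        /* read error or truncated record */
        return -1;
    }

    value += delta;
    int ok = pwrite(fd, &value, sizeof value, 0) == (ssize_t)sizeof value; /* store */

    close(fd);                /* closing the descriptor releases the lock */
    if (!ok)
        return -1;
    *out = value;
    return 0;
}
```

Because fcntl locks belong to the process, they serialize separate child
processes but not threads inside one process, which fits the
one-thread-per-process design argued for here.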
Another implementation is to treat the child processes as CPU-intensive
slaves and the parent process as the master that manipulates the big database.
If you want to compare the performance of a multiprocess vs a multithreaded
implementation of repo-pack, remember this: for the same input data size
and the same output data size, take the elapsed seconds of your wall clock
(or wristwatch) as the benchmark measure of this repo-pack.
The timings posted to the mailing list for different numbers of threads are
badly measured, because they don't say how small the resulting repo is,
and they don't say whether the results are the same regardless of the
# of threads.
For good measurements, we need "to plot the curves", e.g. based on
( # of threads, elapsed wall-clock time, data input size, data output size )
and then observe where those curves intersect.
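One possible way to collect those four values per run is sketched below.
It times a job with the monotonic wall clock and keeps the number of
workers and both sizes next to the elapsed time, so that each run becomes
one point on a curve. The struct bench_row, bench_once and fake_repack are
hypothetical names, and fake_repack only pretends to halve its input; a
real measurement would call the actual repack there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* One measured point: ( # of workers, wall-clock seconds, input, output ). */
struct bench_row {
    int nworkers;
    double seconds;
    long in_bytes;
    long out_bytes;
};

/* Wall-clock time in seconds from a clock that never jumps backwards. */
double wall_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for a repack run: returns the output size in bytes. */
long fake_repack(int nworkers, long in_bytes)
{
    (void)nworkers;           /* a real job would use this many workers */
    return in_bytes / 2;
}

/* Runs `job` once and records all four values for that run. */
struct bench_row bench_once(long (*job)(int, long), int nworkers, long in_bytes)
{
    assert(job != NULL);
    struct bench_row row = { nworkers, 0.0, in_bytes, 0 };
    double t0 = wall_seconds();
    row.out_bytes = job(nworkers, in_bytes);
    row.seconds = wall_seconds() - t0;
    return row;
}

/* Formats one row as tab-separated text, ready for a plotting tool. */
int format_row(char *buf, size_t len, const struct bench_row *r)
{
    return snprintf(buf, len, "%d\t%.6f\t%ld\t%ld",
                    r->nworkers, r->seconds, r->in_bytes, r->out_bytes);
}
```

Calling bench_once for 1, 2, 4, ... workers on the same input and printing
each row with format_row gives one line per run. Rows whose out_bytes differ
should not be compared on time alone, which is the point made above.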
J.C.Pizarro