From: Petr Baudis <pasky@suse.cz>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org,
John 'Warthog9' Hawley <warthog9@eaglescrag.net>,
John 'Warthog9' Hawley <warthog9@kernel.org>
Subject: Re: [RFC PATCHv2 04/10] gitweb: Use Cache::Cache compatibile (get, set) output caching
Date: Wed, 10 Feb 2010 13:02:57 +0100
Message-ID: <20100210120257.GP4159@machine.or.cz>
In-Reply-To: <201002101228.15732.jnareb@gmail.com>

On Wed, Feb 10, 2010 at 12:28:14PM +0100, Jakub Narebski wrote:
> On Wed, 10 Feb 2010, Petr Baudis wrote:
> > On Wed, Feb 10, 2010 at 02:12:24AM +0100, Jakub Narebski wrote:
> > > On Tue, 9 Feb 2010 at 11:30 +0100, Jakub Narebski wrote:
> > >
> > > > The cache_fetch subroutine captures output (from STDOUT only, as
> > > > STDERR is usually logged) using either ->push_layer()/->pop_layer()
> > > > from PerlIO::Util submodule (if it is available), or by setting and
> > > > restoring *STDOUT. Note that only the former could be tested reliably
> > > > to be reliable in t9503 test!
> > >
> > > Scratch that, I have just checked that (at least for Apache + mod_cgi,
> > > but I don't think that it matters) the latter solution, with setting
> > > and restoring *STDOUT doesn't work: I would get data in cache (so it
> > > can be restored later), but instead of output I would get Internal Server
> > > Error ("The server encountered an internal error or misconfiguration and
> > > was unable to complete your request.") without even a hint what the
> > > problem was. Sprinkling "die ...: $!" didn't help to catch this error:
> > > I suspect that the problem is with capturing.
> > >
> > > So we either would have to live with non-core PerlIO::Util or (pure Perl)
> > > Capture::Tiny, or do the 'print -> print $out' patch...
> >
> > All the magic methods seem to be troublesome, but in that case I'd
> > really prefer a level of indirection instead of filehandle - as is,
> > 'print (...) -> output (...)' ins. of 'print (...) -> print $out (...)'
> > (or whatever). That should be really flexible and completely
> > futureproof, and I don't think the level of indirection would incur any
> > measurable overhead, would it?
>
> First, it is not only 'print (...) -> print $out (...)'; you need to
> do all those:
>
> print <sth> -> print $out <sth>
> printf <sth> -> printf $out <sth>
> binmode STDOUT, <mode> -> binmode $out, <mode>
>
> Second, using "tie" on filehandle (on *STDOUT) can be used also for
> just capturing output, not only for "tee"-ing; what's more to print
> while capturing one has to do extra work. It is quite similar to
> replacing 'print (...)' with 'output (...)' etc., but using
> tie/untie doesn't require large patch to gitweb.
>
> Third, as you can see below tie-ing is about 1% slower than using
> 'output (...)', which in turn is less than 10% slower than explicit
> filehandle solution i.e. 'print $out (...)'... and is almost twice
> slower than solution using PerlIO::Util
>
> Benchmark: timing 50000 iterations of output, perlio, print \$out, tie *STDOUT...
> output: 1.81462 wallclock secs ( 1.77 usr + 0.00 sys = 1.77 CPU) @ 28248.59/s (n=50000)
> perlio: 1.05585 wallclock secs ( 1.03 usr + 0.00 sys = 1.03 CPU) @ 48543.69/s (n=50000)
> print \$out: 1.70027 wallclock secs ( 1.66 usr + 0.00 sys = 1.66 CPU) @ 30120.48/s (n=50000)
> tie *STDOUT: 1.82248 wallclock secs ( 1.79 usr + 0.00 sys = 1.79 CPU) @ 27932.96/s (n=50000)
>                Rate tie *STDOUT      output print \$out      perlio
> tie *STDOUT 27933/s          --         -1%         -7%        -42%
> output      28249/s          1%          --         -6%        -42%
> print \$out 30120/s          8%          7%          --        -38%
> perlio      48544/s         74%         72%         61%          --
>
> Benchmark: running output, perlio, print \$out, tie *STDOUT for at least 10 CPU seconds...
> output: 10.7199 wallclock secs (10.53 usr + 0.00 sys = 10.53 CPU) @ 28029.63/s (n=295152)
> perlio: 11.2884 wallclock secs (10.46 usr + 0.00 sys = 10.46 CPU) @ 49967.11/s (n=522656)
> print \$out: 10.5978 wallclock secs (10.43 usr + 0.00 sys = 10.43 CPU) @ 30318.79/s (n=316225)
> tie *STDOUT: 11.3525 wallclock secs (10.68 usr + 0.00 sys = 10.68 CPU) @ 27635.96/s (n=295152)
>                Rate tie *STDOUT      output print \$out      perlio
> tie *STDOUT 27636/s          --         -1%         -9%        -45%
> output      28030/s          1%          --         -8%        -44%
> print \$out 30319/s         10%          8%          --        -39%
> perlio      49967/s         81%         78%         65%          --
>
> Attached there is script that was used to produce those results.

Ok, on my machine it's similar:

                 Rate      output tie *STDOUT print \$out
output       150962/s          --         -1%         -7%
tie *STDOUT  152769/s          1%          --         -6%
print \$out  162604/s          8%          6%          --

So a roughly consistent picture comes out of it: the three variants are
within a few percent of each other.

I guess the time spent here is generally negligible in gitweb anyway...
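
For readers without the attachment, here is a minimal sketch of how such a
comparison could be set up with the core Benchmark module. It is not the
script that was attached to the quoted message; the class name
CaptureHandle, the helper names and the test payload are all made up for
illustration, and the PerlIO::Util variant is left out because it is a
non-core module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical tie class: appends everything printed to a scalar.
# For brevity PRINT ignores $, and $\ (a full version would honour them).
{
    package CaptureHandle;
    sub TIEHANDLE { my ($class, $buf) = @_; return bless { buf => $buf }, $class }
    sub PRINT     { my $self = shift; ${ $self->{buf} } .= join('', @_); return 1 }
}

my $chunk = "<tr><td>some gitweb output</td></tr>\n";   # arbitrary payload
my $lines = 100;                                        # prints per capture

our $out;                          # filehandle that output() writes to
sub output { print {$out} @_ }     # the 'print (...) -> output (...)' variant

# Capture through the output() indirection.
sub via_output {
    my $buf = '';
    local $out;
    open $out, '>', \$buf or die "open: $!";
    output($chunk) for 1 .. $lines;
    close $out;
    return $buf;
}

# Capture through an explicit filehandle ('print $out (...)').
sub via_print_fh {
    my $buf = '';
    open my $fh, '>', \$buf or die "open: $!";
    print {$fh} $chunk for 1 .. $lines;
    close $fh;
    return $buf;
}

# Capture by tying *STDOUT; the printing code itself is left unchanged.
sub via_tie {
    my $buf = '';
    tie *STDOUT, 'CaptureHandle', \$buf;
    print $chunk for 1 .. $lines;
    untie *STDOUT;
    return $buf;
}

my %variants = (
    'output'      => \&via_output,
    'print $out'  => \&via_print_fh,
    'tie *STDOUT' => \&via_tie,
);

# Make sure every variant captures the same data before timing it.
my $expected = $chunk x $lines;
for my $name (sort keys %variants) {
    die "$name captured wrong data\n" unless $variants{$name}->() eq $expected;
}

# Run each variant for at least 1 CPU second and print a comparison chart.
cmpthese(-1, \%variants);
```

The absolute rates depend on the machine and on the payload, so only the
relative differences in the chart are worth comparing.
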

I suggested output() because I think hacking it in would be _very_
_slightly_ easier than a tied filehandle, but you are right that the
tied filehandle is also really easy. Being able to use PerlIO::Util when
it is available would be a nice but non-essential extra; requiring it
for stock gitweb is not reasonable, especially seeing that it's not
packaged even for Debian. ;-)

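
To make the "use it if available" idea concrete, here is a rough sketch of
a capture helper that prefers PerlIO::Util and otherwise falls back to a
tied *STDOUT. This is not code from the patch series: capture_stdout and
GitwebSketch::Capture are hypothetical names, and the
push_layer(scalar => \$buf) argument form is an assumption about
PerlIO::Util that should be checked against its documentation. Only the
method names ->push_layer()/->pop_layer() come from the quoted message.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical fallback tie class: collects printed data in a scalar.
{
    package GitwebSketch::Capture;
    sub TIEHANDLE { my ($class, $buf) = @_; return bless { buf => $buf }, $class }
    sub PRINT     { my $self = shift; ${ $self->{buf} } .= join('', @_); return 1 }
    sub PRINTF    { my ($self, $fmt, @args) = @_;
                    ${ $self->{buf} } .= sprintf($fmt, @args); return 1 }
    sub BINMODE   { return 1 }   # accept binmode STDOUT, <mode>; the mode is ignored here
}

# True only if the optional, non-core module can be loaded.
my $have_perlio_util = eval { require PerlIO::Util; 1 } ? 1 : 0;

# Run $code and return everything it printed to STDOUT.
sub capture_stdout {
    my ($code) = @_;
    my $buf = '';
    my ($ok, $err);
    if ($have_perlio_util) {
        # ASSUMPTION: pushes a 'scalar' layer that writes into $buf.
        *STDOUT->push_layer(scalar => \$buf);
        $ok  = eval { $code->(); 1 };
        $err = $@;
        *STDOUT->pop_layer();
    } else {
        tie *STDOUT, 'GitwebSketch::Capture', \$buf;
        $ok  = eval { $code->(); 1 };
        $err = $@;
        untie *STDOUT;
    }
    die $err unless $ok;          # re-throw only after STDOUT is restored
    return $buf;
}

my $page = capture_stdout(sub {
    print "Content-Type: text/html\n\n";
    printf "<p>%d commits</p>\n", 42;
});
print $page;                      # STDOUT is back to normal here
```

With this shape the choice of mechanism stays in one place, so gitweb's
print and printf calls would not need to change either way.
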
--
Petr "Pasky" Baudis
If you can't see the value in jet powered ants you should turn in
your nerd card. -- Dunbal (464142)
Thread overview: 18+ messages
2010-02-09 10:30 [RFC PATCHv2 00/10] gitweb: Simple file based output caching Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 01/10] gitweb: href(..., -path_info => 0|1) Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 02/10] gitweb/cache.pm - Very simple file based caching Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 03/10] gitweb/cache.pm - Stat-based cache expiration Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 04/10] gitweb: Use Cache::Cache compatibile (get, set) output caching Jakub Narebski
2010-02-10 1:12 ` Jakub Narebski
2010-02-10 1:23 ` Petr Baudis
2010-02-10 11:28 ` Jakub Narebski
2010-02-10 12:02 ` Petr Baudis [this message]
2010-02-10 18:22 ` Jakub Narebski
2010-02-10 20:32 ` Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 05/10] gitweb/cache.pm - Adaptive cache expiration time Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 06/10] gitweb: Use CHI compatibile (compute method) caching Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 07/10] gitweb/cache.pm - Use locking to avoid 'cache miss stampede' problem Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 08/10] gitweb/cache.pm - Serve stale data when waiting for filling cache Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 09/10] gitweb/cache.pm - Regenerate (refresh) cache in background Jakub Narebski
2010-02-09 22:23 ` Jakub Narebski
2010-02-09 10:30 ` [RFC PATCHv2 10/10] gitweb: Show appropriate "Generating..." page when regenerating cache Jakub Narebski