From: Petr Baudis <pasky@suse.cz>
To: Jakub Narebski <jnareb@gmail.com>
Cc: git@vger.kernel.org,
John 'Warthog9' Hawley <warthog9@eaglescrag.net>,
John 'Warthog9' Hawley <warthog9@kernel.org>
Subject: Re: [RFC PATCHv2 04/10] gitweb: Use Cache::Cache compatibile (get, set) output caching
Date: Wed, 10 Feb 2010 13:02:57 +0100
Message-ID: <20100210120257.GP4159@machine.or.cz>
In-Reply-To: <201002101228.15732.jnareb@gmail.com>
On Wed, Feb 10, 2010 at 12:28:14PM +0100, Jakub Narebski wrote:
> On Wed, 10 Feb 2010, Petr Baudis wrote:
> > On Wed, Feb 10, 2010 at 02:12:24AM +0100, Jakub Narebski wrote:
> > > On Tue, 9 Feb 2010 at 11:30 +0100, Jakub Narebski wrote:
> > >
> > > > The cache_fetch subroutine captures output (from STDOUT only, as
> > > > STDERR is usually logged) using either ->push_layer()/->pop_layer()
> > > > from PerlIO::Util submodule (if it is available), or by setting and
> > > > restoring *STDOUT. Note that only the former could be tested reliably
> > > > to be reliable in t9503 test!
> > >
> > > Scratch that, I have just checked that (at least for Apache + mod_cgi,
> > > but I don't think that it matters) the latter solution, with setting
> > > and restoring *STDOUT doesn't work: I would get data in cache (so it
> > > can be restored later), but instead of output I would get Internal Server
> > > Error ("The server encountered an internal error or misconfiguration and
> > > was unable to complete your request.") without even a hint what the
> > > problem was. Sprinkling "die ...: $!" didn't help to catch this error:
> > > I suspect that the problem is with capturing.
> > >
> > > So we either would have to live with non-core PerlIO::Util or (pure Perl)
> > > Capture::Tiny, or do the 'print -> print $out' patch...
> >
> > All the magic methods seem to be troublesome, but in that case I'd
> > really prefer a level of indirection instead of filehandle - as is,
> > 'print (...) -> output (...)' ins. of 'print (...) -> print $out (...)'
> > (or whatever). That should be really flexible and completely
> > futureproof, and I don't think the level of indirection would incur any
> > measurable overhead, would it?
>
> First, it is not only 'print (...) -> print $out (...)'; you need to
> do all those:
>
> print <sth> -> print $out <sth>
> printf <sth> -> printf $out <sth>
> binmode STDOUT, <mode> -> binmode $out, <mode>
>
> Second, using "tie" on filehandle (on *STDOUT) can be used also for
> just capturing output, not only for "tee"-ing; what's more to print
> while capturing one has to do extra work. It is quite similar to
> replacing 'print (...)' with 'output (...)' etc., but using
> tie/untie doesn't require large patch to gitweb.
>
> Third, as you can see below tie-ing is about 1% slower than using
> 'output (...)', which in turn is less than 10% slower than explicit
> filehandle solution i.e. 'print $out (...)'... and is almost twice
> slower than solution using PerlIO::Util
>
> Benchmark: timing 50000 iterations of output, perlio, print \$out, tie *STDOUT...
> output: 1.81462 wallclock secs ( 1.77 usr + 0.00 sys = 1.77 CPU) @ 28248.59/s (n=50000)
> perlio: 1.05585 wallclock secs ( 1.03 usr + 0.00 sys = 1.03 CPU) @ 48543.69/s (n=50000)
> print \$out: 1.70027 wallclock secs ( 1.66 usr + 0.00 sys = 1.66 CPU) @ 30120.48/s (n=50000)
> tie *STDOUT: 1.82248 wallclock secs ( 1.79 usr + 0.00 sys = 1.79 CPU) @ 27932.96/s (n=50000)
>                 Rate tie *STDOUT      output print \$out      perlio
> tie *STDOUT  27933/s          --         -1%         -7%        -42%
> output       28249/s          1%          --         -6%        -42%
> print \$out  30120/s          8%          7%          --        -38%
> perlio       48544/s         74%         72%         61%          --
>
> Benchmark: running output, perlio, print \$out, tie *STDOUT for at least 10 CPU seconds...
> output: 10.7199 wallclock secs (10.53 usr + 0.00 sys = 10.53 CPU) @ 28029.63/s (n=295152)
> perlio: 11.2884 wallclock secs (10.46 usr + 0.00 sys = 10.46 CPU) @ 49967.11/s (n=522656)
> print \$out: 10.5978 wallclock secs (10.43 usr + 0.00 sys = 10.43 CPU) @ 30318.79/s (n=316225)
> tie *STDOUT: 11.3525 wallclock secs (10.68 usr + 0.00 sys = 10.68 CPU) @ 27635.96/s (n=295152)
>                 Rate tie *STDOUT      output print \$out      perlio
> tie *STDOUT  27636/s          --         -1%         -9%        -45%
> output       28030/s          1%          --         -8%        -44%
> print \$out  30319/s         10%          8%          --        -39%
> perlio       49967/s         81%         78%         65%          --
>
> Attached there is script that was used to produce those results.
Ok, on my machine it's similar:

                 Rate      output tie *STDOUT print \$out
output       150962/s          --         -1%         -7%
tie *STDOUT  152769/s          1%          --         -6%
print \$out  162604/s          8%          6%          --

So a roughly consistent picture is coming out of it. I guess the time
spent here is generally negligible in gitweb anyway...
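
For anyone who wants to repeat the comparison without the attachment,
a minimal sketch along these lines should do. It is not the script
Jakub attached: the payload, the CaptureHandle class and the via_*
helper names are made up for illustration, and the PerlIO::Util
variant is left out because that module is not in core.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical payload; real gitweb output is of course much larger.
my @lines = map { "line $_\n" } 1 .. 10;

# Variant 1: explicit filehandle, i.e. 'print $out (...)'.
sub via_filehandle {
    my $buf = '';
    open my $out, '>', \$buf or die "cannot open in-memory file: $!";
    print $out $_ for @lines;
    close $out;
    return $buf;
}

# Variant 2: level of indirection, i.e. 'output (...)'.
our $capture;
sub output { $capture .= join '', @_ }
sub via_output {
    local $capture = '';
    output($_) for @lines;
    return $capture;
}

# Variant 3: tie *STDOUT to a class that appends to a buffer.
{
    package CaptureHandle;    # hypothetical helper class
    sub TIEHANDLE { my ($class, $bufref) = @_; return bless { buf => $bufref }, $class }
    sub PRINT     { my $self = shift; ${ $self->{buf} } .= join('', @_); return 1 }
    sub PRINTF    { my ($self, $fmt, @args) = @_; ${ $self->{buf} } .= sprintf($fmt, @args); return 1 }
    sub BINMODE   { return 1 }
}
sub via_tie {
    my $buf = '';
    tie *STDOUT, 'CaptureHandle', \$buf;
    print $_ for @lines;      # plain print, now routed to PRINT above
    untie *STDOUT;
    return $buf;
}

# Fixed iteration count; pass a negative number for "at least N CPU seconds".
cmpthese(20_000, {
    'print $out'  => \&via_filehandle,
    'output'      => \&via_output,
    'tie *STDOUT' => \&via_tie,
});
```

The absolute rates depend heavily on the machine and on the payload, so
only the relative ordering is worth comparing.
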
I suggested using output() because I think hacking it in would be _very_
_slightly_ easier than a tied filehandle, but you are right that the
tied filehandle is also really easy. Having the possibility to use
PerlIO::Util if available would be nice, though not essential, but
requiring it for stock gitweb is not reasonable, especially seeing that
it's not even packaged for Debian. ;-)
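
To make the "use PerlIO::Util if available" idea concrete, here is a
rough sketch of a capture helper that probes for the module and falls
back to a tied filehandle otherwise. capture_stdout() and TieCapture
are hypothetical names, not anything from the patch series, and the
push_layer(scalar => ...) call is my assumption about how the
->push_layer()/->pop_layer() pair mentioned above would be used; I
have only exercised the fallback branch.

```perl
use strict;
use warnings;

# PerlIO::Util is not a core module, so only use it when it loads.
my $have_perlio_util = eval { require PerlIO::Util; 1 } ? 1 : 0;

{
    package TieCapture;    # hypothetical fallback class
    sub TIEHANDLE { my ($class, $bufref) = @_; return bless { buf => $bufref }, $class }
    sub PRINT     { my $self = shift; ${ $self->{buf} } .= join('', @_); return 1 }
    sub PRINTF    { my ($self, $fmt, @args) = @_; ${ $self->{buf} } .= sprintf($fmt, @args); return 1 }
    sub BINMODE   { return 1 }
}

# Run $code and return everything it printed to STDOUT.
sub capture_stdout {
    my ($code) = @_;
    my $buf = '';

    # Start capturing (assumed PerlIO::Util usage in the first branch).
    if ($have_perlio_util) { *STDOUT->push_layer(scalar => \$buf) }
    else                   { tie *STDOUT, 'TieCapture', \$buf }

    my $ok  = eval { $code->(); 1 };
    my $err = $@;

    # Always restore STDOUT, even if $code died.
    if ($have_perlio_util) { *STDOUT->pop_layer() }
    else                   { untie *STDOUT }

    die $err unless $ok;
    return $buf;
}

my $page = capture_stdout(sub {
    print "Content-Type: text/plain\n\n";
    printf "%d commits\n", 3;
});
print $page;    # the captured text can now go to the cache and to the client
```

The caller does not need to know which branch was taken, which is the
point: stock gitweb keeps working without the extra module.
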
--
Petr "Pasky" Baudis
If you can't see the value in jet powered ants you should turn in
your nerd card. -- Dunbal (464142)