From: Jakub Narebski <jnareb@gmail.com>
To: git@vger.kernel.org
Cc: John 'Warthog9' Hawley <warthog9@kernel.org>,
	John 'Warthog9' Hawley <warthog9@eaglescrag.net>,
	Junio C Hamano <gitster@pobox.com>, demerphq <demerphq@gmail.com>,
	Aevar Arnfjord Bjarmason <avarab@gmail.com>,
	Thomas Adam <thomas@xteddy.org>,
	Jakub Narebski <jnareb@gmail.com>
Subject: [PATCH 06/24] gitweb/lib - Simple output capture by redirecting STDOUT
Date: Tue,  7 Dec 2010 00:10:51 +0100
Message-ID: <1291677069-6559-7-git-send-email-jnareb@gmail.com>
In-Reply-To: <1291677069-6559-1-git-send-email-jnareb@gmail.com>

Add GitwebCache::Capture::Simple package, which captures output by
redirecting STDOUT to an in-memory file (saving what is printed to a
scalar), after first saving the original STDOUT so that it can be
restored when capturing is finished.
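
The mechanism described above can be sketched in a few lines.  This is
only an illustration, not the code from the patch: capture_stdout() is
a hypothetical helper, and unlike the module it does not preserve
PerlIO layers or restore STDOUT if the code dies.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): run $code and return
# whatever it printed to STDOUT.
sub capture_stdout {
	my ($code) = @_;

	# duplicate the real STDOUT so that it can be restored later
	open my $orig_stdout, '>&', \*STDOUT
		or die "Couldn't dup STDOUT: $!";

	# reopen STDOUT as in-memory file backed by $data
	my $data = '';
	close STDOUT;
	open STDOUT, '>', \$data
		or die "Couldn't reopen STDOUT as in-memory file: $!";

	$code->();

	# close in-memory file (flushing it), then restore real STDOUT
	close STDOUT;
	open STDOUT, '>&', $orig_stdout
		or die "Couldn't restore STDOUT: $!";

	return $data;
}

my $captured = capture_stdout(sub { print "Capture this" });
print "captured: '$captured'\n";
```

Nothing is written to the real STDOUT while the code reference runs;
the text only shows up in the returned scalar.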

GitwebCache::Capture::Simple preserves PerlIO layers, both those set
before capturing started and those set during capture.  The exception
is the 'scalar' layer, which needs an additional parameter, and which
needs the non-core module PerlIO::Util for proper handling.

No care was taken to handle the following special cases (prior to
starting capture): closed STDOUT, STDOUT reopened to scalar reference,
tied STDOUT.  You shouldn't modify STDOUT during capture.

Includes separate tests for capturing output in
t9504/test_capture_interface.pl which is run as external test from
t9504-gitweb-capture-interface.sh.  It tests capturing of utf8 data
printed in :utf8 mode, and of binary data (containing invalid utf8) in
:raw mode.

Note that nested capturing doesn't work (and probably couldn't be made
to work when capturing to in-memory file), but this feature wouldn't
be needed for capturing gitweb output (to cache it).


This patch was based on "gitweb: add output buffering and associated
functions" patch by John 'Warthog9' Hawley (J.H.) in "Gitweb caching v7"
series, and on code of Capture::Tiny by David Golden (Apache License 2.0).

Based-on-work-by: John 'Warthog9' Hawley <warthog9@kernel.org>
Signed-off-by: Jakub Narebski <jnareb@gmail.com>
---
In the previous version of this series the basic (default) capturing
engine was GitwebCache::Capture::SelectFH, which made use of
select(FH) to change the default filehandle used for print and printf
without a filehandle argument (print LIST).  This required changing
  binmode STDOUT, <mode>
to
  binmode select(), <mode>
in gitweb.perl, which is now not necessary.
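
For readers unfamiliar with select(FH), here is a small sketch of why
the select()-based engine needed 'binmode select()'.  It is an
illustration only; the in-memory handle and the strings are made up
for this example and do not come from GitwebCache::Capture::SelectFH.

```perl
use strict;
use warnings;

my $buffer = '';
open my $fh, '>', \$buffer
	or die "Couldn't open in-memory file: $!";

# select(FH) returns the previously selected handle
my $old_fh = select($fh);
print "goes to buffer\n";         # print LIST uses the selected handle
print STDOUT "goes to STDOUT\n";  # explicit filehandle bypasses select()

# select() without arguments returns the currently selected handle,
# so this sets the mode on $fh; 'binmode STDOUT' would miss it
binmode select(), ':utf8';
print "\x{17c}\n";                # written to the buffer as UTF-8

select($old_fh);
close $fh;

printf "buffer holds %d bytes\n", length $buffer;
```

With STDOUT itself redirected, as in this patch, plain
'binmode STDOUT, <mode>' already acts on the handle being captured.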

To simplify code this time we don't use GitwebCache::Capture as base
class providing common features, but we reserve space to add it if we
feel it necessary (e.g. when adding more capturing engines).

Even re-layering is probably not necessary in the case of gitweb, as
we set ':utf8' mode at beginning, and if we change it to ':raw' it is
always after we started capture.
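
As an aside, the re-layering mentioned above amounts to reading the
layers with PerlIO::get_layers() and reapplying the non-base ones with
binmode().  The sketch below shows that idea on a throwaway in-memory
handle; it is a simplified variant of _relayer() (no de-duplication),
and the variable names are made up for the example.

```perl
use strict;
use warnings;
use PerlIO;

my $data = '';
open my $fh, '>', \$data
	or die "Couldn't open in-memory file: $!";
binmode $fh, ':utf8';

# for an in-memory file this typically lists 'scalar' and 'utf8'
my @layers = PerlIO::get_layers($fh, output => 1);
print "layers: @layers\n";

# same filtering idea as in _relayer(): drop base layers, keep the rest
my %skip = map { $_ => 1 } qw(unix perlio scalar);
my @keep = grep { !$skip{$_} } @layers;

# this is the layer string that would be passed to binmode()
my $spec = join(':', ':raw', @keep);
print "to reapply: $spec\n";
```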

The original "Gitweb caching v7" captured to in-memory files when
caching was disabled; in my minimal fixup (i.e. in the "Gitweb caching
v7.x" threads) capturing to an in-memory file is not done at all, and
output is redirected straight to the cache file (or equivalent).
The same will be done later in this series.

 gitweb/lib/GitwebCache/Capture/Simple.pm           |   96 ++++++++++++++++++++
 ...erface.sh => t9504-gitweb-capture-interface.sh} |   10 +-
 t/t9504/test_capture_interface.pl                  |   91 +++++++++++++++++++
 3 files changed, 192 insertions(+), 5 deletions(-)
 create mode 100644 gitweb/lib/GitwebCache/Capture/Simple.pm
 copy t/{t9503-gitweb-caching-interface.sh => t9504-gitweb-capture-interface.sh} (66%)
 create mode 100755 t/t9504/test_capture_interface.pl

diff --git a/gitweb/lib/GitwebCache/Capture/Simple.pm b/gitweb/lib/GitwebCache/Capture/Simple.pm
new file mode 100644
index 0000000..3585e58
--- /dev/null
+++ b/gitweb/lib/GitwebCache/Capture/Simple.pm
@@ -0,0 +1,96 @@
+# gitweb - simple web interface to track changes in git repositories
+#
+# (C) 2010, Jakub Narebski <jnareb@gmail.com>
+#
+# This program is licensed under the GPLv2
+
+#
+# Simple output capturing via redirecting STDOUT to in-memory file.
+#
+
+# This is the same mechanism that Capture::Tiny uses, only simpler;
+# we don't capture STDERR at all, we don't tee, we don't support
+# capturing output of external commands.
+
+package GitwebCache::Capture::Simple;
+
+use strict;
+use warnings;
+
+use PerlIO;
+
+# Constructor
+sub new {
+	my $class = shift;
+
+	my $self = {};
+	$self = bless($self, $class);
+
+	return $self;
+}
+
+sub capture {
+	my ($self, $code) = @_;
+
+	$self->capture_start();
+	$code->();
+	return $self->capture_stop();
+}
+
+# ----------------------------------------------------------------------
+
+# Start capturing data (STDOUT)
+sub capture_start {
+	my $self = shift;
+
+	# save copy of real STDOUT via duplicating it
+	my @layers = PerlIO::get_layers(\*STDOUT);
+	open $self->{'orig_stdout'}, ">&", \*STDOUT
+		or die "Couldn't dup STDOUT for capture: $!";
+
+	# close STDOUT, so that it isn't used anymore (this frees fd 1)
+	close STDOUT;
+
+	# reopen STDOUT as in-memory file
+	$self->{'data'} = '';
+	unless (open STDOUT, '>', \$self->{'data'}) {
+		open STDOUT, '>&', fileno($self->{'orig_stdout'});
+		die "Couldn't reopen STDOUT as in-memory file for capture: $!";
+	}
+	_relayer(\*STDOUT, \@layers);
+
+	# started capturing
+	$self->{'capturing'} = 1;
+}
+
+# Stop capturing data (required for die_error)
+sub capture_stop {
+	my $self = shift;
+
+	# return if we didn't start capturing
+	return unless delete $self->{'capturing'};
+
+	# close in-memory file, and restore original STDOUT
+	my @layers = PerlIO::get_layers(\*STDOUT);
+	close STDOUT;
+	open STDOUT, '>&', fileno($self->{'orig_stdout'});
+	_relayer(\*STDOUT, \@layers);
+
+	return $self->{'data'};
+}
+
+# taken from Capture::Tiny by David Golden, Apache License 2.0
+# with debugging stripped out, and added filtering out 'scalar' layer
+sub _relayer {
+	my ($fh, $layers) = @_;
+
+	my %seen = ( unix => 1, perlio => 1, scalar => 1 ); # filter these out
+	my @unique = grep { !$seen{$_}++ } @$layers;
+
+	binmode($fh, join(":", ":raw", @unique));
+}
+
+
+1;
+__END__
+# end of package GitwebCache::Capture::Simple
diff --git a/t/t9503-gitweb-caching-interface.sh b/t/t9504-gitweb-capture-interface.sh
similarity index 66%
copy from t/t9503-gitweb-caching-interface.sh
copy to t/t9504-gitweb-capture-interface.sh
index 819da1d..82623f1 100755
--- a/t/t9503-gitweb-caching-interface.sh
+++ b/t/t9504-gitweb-capture-interface.sh
@@ -3,10 +3,10 @@
 # Copyright (c) 2010 Jakub Narebski
 #
 
-test_description='gitweb caching interface
+test_description='gitweb capturing interface
 
-This test checks caching interface used in gitweb caching, and caching
-infrastructure (GitwebCache::* modules).'
+This test checks capturing interface used for capturing gitweb output
+in gitweb caching (GitwebCache::Capture* modules).'
 
 # for now we are running only cache interface tests
 . ./test-lib.sh
@@ -28,7 +28,7 @@ fi
 test_external_has_tap=1
 
 test_external \
-	'GitwebCache::* Perl API (in gitweb/lib/)' \
-	"$PERL_PATH" "$TEST_DIRECTORY"/t9503/test_cache_interface.pl
+	'GitwebCache::Capture Perl API (in gitweb/lib/)' \
+	"$PERL_PATH" "$TEST_DIRECTORY"/t9504/test_capture_interface.pl
 
 test_done
diff --git a/t/t9504/test_capture_interface.pl b/t/t9504/test_capture_interface.pl
new file mode 100755
index 0000000..47ab804
--- /dev/null
+++ b/t/t9504/test_capture_interface.pl
@@ -0,0 +1,91 @@
+#!/usr/bin/perl
+use lib (split(/:/, $ENV{GITPERLLIB}));
+
+use warnings;
+use strict;
+use utf8;
+
+use Test::More;
+
+# test source version
+use lib $ENV{GITWEBLIBDIR} || "$ENV{GIT_BUILD_DIR}/gitweb/lib";
+
+# ....................................................................
+
+use_ok('GitwebCache::Capture::Simple');
+diag("Using lib '$INC[0]'");
+diag("Testing '$INC{'GitwebCache/Capture/Simple.pm'}'");
+
+# Test setting up capture
+#
+my $capture = new_ok('GitwebCache::Capture::Simple' => [], 'The $capture');
+
+# Test capturing
+#
+sub capture_block (&) {
+	return $capture->capture(shift);
+}
+
+diag('Should not print anything except test results and diagnostic');
+my $test_data = 'Capture this';
+my $captured = capture_block {
+	print $test_data;
+};
+is($captured, $test_data, 'capture simple data');
+
+binmode STDOUT, ':utf8';
+$test_data = <<'EOF';
+Zażółć gęsią jaźń
+EOF
+utf8::decode($test_data);
+$captured = capture_block {
+	binmode STDOUT, ':utf8';
+
+	print $test_data;
+};
+utf8::decode($captured);
+is($captured, $test_data, 'capture utf8 data');
+
+$test_data = "|\x{fe}\x{ff}|\x{9F}|\000|"; # invalid utf-8
+$captured = capture_block {
+	binmode STDOUT, ':raw';
+
+	print $test_data;
+};
+is($captured, $test_data, 'capture raw data');
+
+# Test nested capturing
+#
+TODO: {
+	local $TODO = "not required for capturing gitweb output";
+	no warnings;
+
+	my $outer_capture = GitwebCache::Capture::Simple->new();
+	$captured = $outer_capture->capture(sub {
+		print "pre|";
+		my $captured = $capture->capture(sub {
+			print "INNER";
+		});
+		print lc($captured);
+		print "|post";
+	});
+	is($captured, "pre|inner|post", 'nested capture');
+}
+
+SKIP: {
+	skip "Capture::Tiny not available", 1
+		unless eval { require Capture::Tiny; };
+
+	$captured = Capture::Tiny::capture(sub {
+		my $inner = $capture->capture(sub {
+			print "INNER";
+		});
+	});
+	is($captured, '', "doesn't print while capturing");
+}
+
+done_testing();
+
+# Local Variables:
+# coding: utf-8
+# End:
-- 
1.7.3