git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions
@ 2013-09-23 17:26 Benoit Person
  2013-09-23 17:58 ` Matthieu Moy
  2013-09-23 20:17 ` Eric Sunshine
  0 siblings, 2 replies; 5+ messages in thread
From: Benoit Person @ 2013-09-23 17:26 UTC (permalink / raw)
  To: git; +Cc: Matthieu Moy, Benoit Person

Mediawiki introduces a new API for queries w/ more than 500 results in
version 1.21. That change triggered an infinite loop while cloning a
mediawiki with such a page.

The latest API renamed and moved the "continuing" information in the
response, necessary to build the next query. The code failed to retrieve
that information but still detected that it was in a "continuing
query". As a result, it launched the same query over and over again.

If a "continuing" information is detected in the response (old or new),
the next query is updated accordingly. If not, we quit assuming it's not
a continuing query.

Signed-off-by: Benoit Person <benoit.person@gmail.fr>
Reported-by: Benjamin Cathey
---
 contrib/mw-to-git/git-remote-mediawiki.perl     | 14 ++++++++++++--
 contrib/mw-to-git/t/t9365-continuing-queries.sh | 24 ++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100755 contrib/mw-to-git/t/t9365-continuing-queries.sh

diff --git a/contrib/mw-to-git/git-remote-mediawiki.perl b/contrib/mw-to-git/git-remote-mediawiki.perl
index c9a4805..f1db5d2 100755
--- a/contrib/mw-to-git/git-remote-mediawiki.perl
+++ b/contrib/mw-to-git/git-remote-mediawiki.perl
@@ -625,6 +625,9 @@ sub fetch_mw_revisions_for_page {
 		rvstartid => $fetch_from,
 		rvlimit => 500,
 		pageids => $id,
+
+		# let the mediawiki know that we support the latest API
+		continue => '',
 	};
 
 	my $revnum = 0;
@@ -640,8 +643,15 @@ sub fetch_mw_revisions_for_page {
 			push(@page_revs, $page_rev_ids);
 			$revnum++;
 		}
-		last if (!$result->{'query-continue'});
-		$query->{rvstartid} = $result->{'query-continue'}->{revisions}->{rvstartid};
+
+		if ($result->{'query-continue'}) { # For legacy APIs
+			$query->{rvstartid} = $result->{'query-continue'}->{revisions}->{rvstartid};
+		} elsif ($result->{continue}) { # For newer APIs
+			$query->{rvstartid} = $result->{continue}->{rvcontinue};
+			$query->{continue} = $result->{continue}->{continue};
+		} else {
+			last;
+		}
 	}
 	if ($shallow_import && @page_revs) {
 		print {*STDERR} "  Found 1 revision (shallow import).\n";
diff --git a/contrib/mw-to-git/t/t9365-continuing-queries.sh b/contrib/mw-to-git/t/t9365-continuing-queries.sh
new file mode 100755
index 0000000..6fb5df4
--- /dev/null
+++ b/contrib/mw-to-git/t/t9365-continuing-queries.sh
@@ -0,0 +1,24 @@
+#!/bin/sh
+
+test_description='Test the Git Mediawiki remote helper: queries w/ more than 500 results'
+
+. ./test-gitmw-lib.sh
+. ./push-pull-tests.sh
+. $TEST_DIRECTORY/test-lib.sh
+
+test_check_precond
+
+test_expect_success 'creating page w/ >500 revisions' '
+	wiki_reset &&
+	for i in $(seq 1 501)
+	do
+		echo "creating revision $i"
+		wiki_editpage foo "revision $i<br/>" true
+	done
+'
+
+test_expect_success 'cloning page w/ >500 revisions' '
+	git clone mediawiki::'"$WIKI_URL"' mw_dir
+'
+
+test_done
-- 
1.8.4.GIT

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-23 17:26 [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions Benoit Person
@ 2013-09-23 17:58 ` Matthieu Moy
  2013-09-23 20:17 ` Eric Sunshine
  1 sibling, 0 replies; 5+ messages in thread
From: Matthieu Moy @ 2013-09-23 17:58 UTC (permalink / raw)
  To: Benoit Person; +Cc: git, Benoit Person

Benoit Person <benoit.person@gmail.com> writes:

> The latest API renamed and moved the "continuing" information in the
> response, necessary to build the next query. The code failed to retrieve
> that information but still detected that it was in a "continuing
> query". As a result, it launched the same query over and over again.

Good, that's much cleaner.

> +		if ($result->{'query-continue'}) { # For legacy APIs
> +			$query->{rvstartid} = $result->{'query-continue'}->{revisions}->{rvstartid};
> +		} elsif ($result->{continue}) { # For newer APIs

I'd rather have the comments say "# API version < X" and "# API version
>= X". Next time the API change, "new" Vs "old" will become meaningless.

(But that should not block the patch from inclusion)

Thanks,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-23 17:26 [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions Benoit Person
  2013-09-23 17:58 ` Matthieu Moy
@ 2013-09-23 20:17 ` Eric Sunshine
  2013-09-24  8:05   ` Benoit Person
  1 sibling, 1 reply; 5+ messages in thread
From: Eric Sunshine @ 2013-09-23 20:17 UTC (permalink / raw)
  To: Benoit Person; +Cc: Git List, Matthieu Moy, Benoit Person

On Mon, Sep 23, 2013 at 1:26 PM, Benoit Person <benoit.person@gmail.com> wrote:
> diff --git a/contrib/mw-to-git/t/t9365-continuing-queries.sh b/contrib/mw-to-git/t/t9365-continuing-queries.sh
> new file mode 100755
> index 0000000..6fb5df4
> --- /dev/null
> +++ b/contrib/mw-to-git/t/t9365-continuing-queries.sh
> @@ -0,0 +1,24 @@
> +#!/bin/sh
> +
> +test_description='Test the Git Mediawiki remote helper: queries w/ more than 500 results'
> +
> +. ./test-gitmw-lib.sh
> +. ./push-pull-tests.sh
> +. $TEST_DIRECTORY/test-lib.sh
> +
> +test_check_precond
> +
> +test_expect_success 'creating page w/ >500 revisions' '
> +       wiki_reset &&
> +       for i in $(seq 1 501)

s/seq/test_seq/

d17cf5f3a32f07bf (tests: Introduce test_seq;  2012-08-03)

> +       do
> +               echo "creating revision $i"

Do you want to end this line with '&&'?

> +               wiki_editpage foo "revision $i<br/>" true
> +       done
> +'
> +
> +test_expect_success 'cloning page w/ >500 revisions' '
> +       git clone mediawiki::'"$WIKI_URL"' mw_dir
> +'
> +
> +test_done
> --

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-23 20:17 ` Eric Sunshine
@ 2013-09-24  8:05   ` Benoit Person
  2013-09-24  8:19     ` Matthieu Moy
  0 siblings, 1 reply; 5+ messages in thread
From: Benoit Person @ 2013-09-24  8:05 UTC (permalink / raw)
  To: Eric Sunshine, Jonathan Nieder, Matthieu Moy; +Cc: Git List

On 23 September 2013 19:58, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
> I'd rather have the comments say "# API version < X" and "# API version
>>= X". Next time the API change, "new" Vs "old" will become meaningless.
done, thanks


On 23 September 2013 20:26, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Some distros (e.g., Debian) occasionally do run the testsuite
> automatically, but it is still fine since they have a timeout that
> varies by platform to detect if the test has stalled.  I suppose
> ideally git's test harness could learn to do the same thing some day,
> but for now it's easier one level above since an appropriate timeout
> depends on the speed on the platform, what else is creating load on
> the test machine, and other factors that are probably not easy for us
> to guess.
great explanation, thanks


On 23 September 2013 22:17, Eric Sunshine <sunshine@sunshineco.com> wrote:
> s/seq/test_seq/
done, thanks

> d17cf5f3a32f07bf (tests: Introduce test_seq;  2012-08-03)
>
>> +       do
>> +               echo "creating revision $i"
>
> Do you want to end this line with '&&'?
The way it's intended is that it's more a debug information to see how
it's going on (creating >500 revs is *quite* long). If I understand it
correctly, using '&&' would mean that the return value of the echo
statement will be tested for success ? Anyway, I am not sure it makes
sense to fail on a "debug echo" ?

-- 
Benoit Person

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions
  2013-09-24  8:05   ` Benoit Person
@ 2013-09-24  8:19     ` Matthieu Moy
  0 siblings, 0 replies; 5+ messages in thread
From: Matthieu Moy @ 2013-09-24  8:19 UTC (permalink / raw)
  To: Benoit Person; +Cc: Eric Sunshine, Jonathan Nieder, Git List

Benoit Person <benoit.person@gmail.com> writes:

>> d17cf5f3a32f07bf (tests: Introduce test_seq;  2012-08-03)
>>
>>> +       do
>>> +               echo "creating revision $i"
>>
>> Do you want to end this line with '&&'?
> The way it's intended is that it's more a debug information to see how
> it's going on (creating >500 revs is *quite* long). If I understand it
> correctly, using '&&' would mean that the return value of the echo
> statement will be tested for success ? Anyway, I am not sure it makes
> sense to fail on a "debug echo" ?

I don't think you should bother with such reasoning: just put the &&
everywhere.

Without the &&, it's OK in the current version, but what if someone adds
something before the echo? Then it becomes

git something-important &&
echo debug
git something-else-important

and a failure on the first call is unnoticed because of the missing &&
afterwards.

The echo is not supposed to fail, so either the && does not change
anything, or something went terribly wrong and you'd rather notice it.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2013-09-24  8:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-23 17:26 [PATCH v2] git-remote-mediawiki: bugfix for pages w/ >500 revisions Benoit Person
2013-09-23 17:58 ` Matthieu Moy
2013-09-23 20:17 ` Eric Sunshine
2013-09-24  8:05   ` Benoit Person
2013-09-24  8:19     ` Matthieu Moy

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).