* tabled (CLD) failing on i386
@ 2010-02-15  9:12 Jeff Garzik
  2010-02-15 14:46 ` Pete Zaitcev
  2010-02-15 18:04 ` Pete Zaitcev
  0 siblings, 2 replies; 3+ messages in thread
From: Jeff Garzik @ 2010-02-15  9:12 UTC
  To: Project Hail


The tabled rawhide build is currently failing intermittently.  Three 
build attempts yielded:

x86-64 ok, i686 fail:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1986987

x86-64 fail, i686 ok:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1987025

x86-64 ok, i686 ok:
http://koji.fedoraproject.org/koji/taskinfo?taskID=1987047



* Re: tabled (CLD) failing on i386
From: Pete Zaitcev @ 2010-02-15 14:46 UTC
  To: Jeff Garzik; +Cc: Project Hail

On Mon, 15 Feb 2010 04:12:19 -0500
Jeff Garzik <jeff@garzik.org> wrote:

> 
> The tabled rawhide build is currently failing intermittently.  Three 
> build attempts yielded:
> 
> x86-64 ok, i686 fail:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1986987

chunkd[30489]: Waiting for CLD PortFile cld.port
cld[30488]: databases up
cld[30488]: Listening on port 42680
cld[30488]: initialized: verbose 0
chunkd[30489]: Using CLD port 42680
tabled[30491]: Verbose debug output enabled
tabled[30491]: TDB data/tdb PID tabled.pid port auto
tabled[30491]: Forcing local hostname to localhost.localdomain
tabled[30492]: Listening on port 50392
tabled[30492]: Selected CLD host localhost port 42680
chunkd[30490]: Listening on auto port 55087
chunkd[30490]: initialized
tabled[30492]: New CLD session created, sid 0774CF181D0A5A13
tabled[30492]: CLD directory "/tabled-default" created

> x86-64 fail, i686 ok:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1987025

chunkd[11137]: Using CLD port 60596
tabled[11139]: Verbose debug output enabled
tabled[11139]: TDB data/tdb PID tabled.pid port auto
tabled[11139]: Forcing local hostname to localhost.localdomain
tabled[11140]: Listening on port 42990
tabled[11140]: Selected CLD host localhost port 60596
chunkd[11138]: Listening on auto port 53858
chunkd[11138]: initialized
chunkd[11138]: New CLD session created, sid 59A6788119E11AE0

> x86-64 ok, i686 ok:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1987047

chunkd[18355]: Using CLD port 53870
tabled[18358]: Verbose debug output enabled
tabled[18358]: TDB data/tdb PID tabled.pid port auto
tabled[18358]: Forcing local hostname to localhost.localdomain
chunkd[18357]: Listening on auto port 44455
chunkd[18357]: initialized
tabled[18359]: Listening on port 41095
tabled[18359]: Selected CLD host localhost port 53870
chunkd[18357]: New CLD session created, sid 54008CA43FD9B477
tabled[18359]: New CLD session created, sid 7308676749B5C803
tabled[18359]: CLD directory "/tabled-default" created
tabled[18359]: CLD file "/tabled-default/localhost.localdomain" created
tabled[18359]: Known tabled nodes
tabled[18359]:  localhost.localdomain (ourselves)
tabled[18359]: Known Chunk nodes
tabled[18359]:  19690720

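So, in both bad cases, a CLD session fails to start: in the first log
above chunkd never reports a new session, and in the second tabled never
does. In both cases the session would be retried, but not within the 25 s
that the first test allows for the tabled service to start.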

Perhaps we lost some kind of low-level retry that was there before?
Curious...
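
For illustration only, a low-level retry of the kind wondered about above
might look roughly like the C sketch below. It is not taken from the CLD
client code: retry_session(), session_try_fn and fail_n_times() are
hypothetical names, and the budget and pause values are assumptions. The
idea is that short pauses between attempts could let a failed session
setup recover well inside the 25 s window, rather than waiting for the
next full retry.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns true once a session is established. */
typedef bool (*session_try_fn)(void *ctx);

/*
 * Call `attempt` until it succeeds or `budget_s` seconds have elapsed,
 * pausing `pause_ms` milliseconds between attempts.
 * Returns the number of attempts made on success, -1 if the budget ran out.
 */
int retry_session(session_try_fn attempt, void *ctx,
		  int budget_s, long pause_ms)
{
	time_t deadline;
	int tries = 0;

	assert(attempt != NULL);
	deadline = time(NULL) + budget_s;

	for (;;) {
		struct timespec ts = {
			.tv_sec = pause_ms / 1000,
			.tv_nsec = (pause_ms % 1000) * 1000000L,
		};

		tries++;
		if (attempt(ctx))
			return tries;
		if (time(NULL) >= deadline)
			return -1;
		nanosleep(&ts, NULL);	/* short pause, then try again */
	}
}

/* Demo callback (hypothetical): fails while the counter in ctx is > 0. */
bool fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

With a counter of 2, retry_session(fail_n_times, &n, 5, 1) fails twice and
succeeds on the third attempt. Whether the real daemons ever had such a
retry is exactly the open question here.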

-- Pete


* Re: tabled (CLD) failing on i386
From: Pete Zaitcev @ 2010-02-15 18:04 UTC
  To: Jeff Garzik; +Cc: Project Hail

On Mon, 15 Feb 2010 04:12:19 -0500
Jeff Garzik <jeff@garzik.org> wrote:

> x86-64 ok, i686 fail:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1986987
> 
> x86-64 fail, i686 ok:
> http://koji.fedoraproject.org/koji/taskinfo?taskID=1987025

I would try something like this for a start:

diff --git a/test/wait-for-listen.c b/test/wait-for-listen.c
index c1946c4..a5f1a52 100644
--- a/test/wait-for-listen.c
+++ b/test/wait-for-listen.c
@@ -133,9 +133,12 @@ int main()
 		 * Vote in DB4 replication takes about 12-13s.
 		 * In addition we may have retries when tabled polls for
 		 * Chunk daemons to come up. On busy boxes we may miss 20s.
+		 * So, 25s should be plenty, and we used that for a while,
+		 * but sometimes a daemon can fail establishing a session
+		 * with CLD and a retry takes a minute.
 		 */
-		if (time(NULL) >= start_time + 25) {
-			fprintf(stderr, "server is not up after 25 s\n");
+		if (time(NULL) >= start_time + 100) {
+			fprintf(stderr, "server is not up after 100 s\n");
 			exit(1);
 		}
 
@@ -159,6 +162,8 @@ int main()
 
 		sleep(2);
 	}
+
+	printf("tabled went up after %ld s\n", (long)(time(NULL) - start_time));
 	return 0;
 }
 

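The same wait-with-deadline pattern, pulled out of the test for
illustration, could be sketched as below. This is a standalone
approximation, not the actual wait-for-listen.c (only the two hunks above
are shown here): wait_until_up(), probe_fn and up_after_n_polls() are
hypothetical names, and the order of the probe and the deadline check is
an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical probe: returns true once the server is accepting requests. */
typedef bool (*probe_fn)(void *ctx);

/*
 * Poll `probe` every `interval_s` seconds until it succeeds or `limit_s`
 * seconds have passed since the first poll.
 * Returns the elapsed seconds on success, -1 on timeout.
 */
long wait_until_up(probe_fn probe, void *ctx, long limit_s,
		   unsigned int interval_s)
{
	time_t start_time;

	assert(probe != NULL);
	start_time = time(NULL);

	while (!probe(ctx)) {
		if (time(NULL) >= start_time + limit_s) {
			fprintf(stderr, "server is not up after %ld s\n",
				limit_s);
			return -1;
		}
		sleep(interval_s);
	}

	/* Cast the whole difference so the result matches a %ld format. */
	return (long)(time(NULL) - start_time);
}

/* Demo probe (hypothetical): reports "down" while the counter is > 0. */
bool up_after_n_polls(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

A caller would pass 100 and 2 for limit_s and interval_s to match the
values in the patch, and could print the returned elapsed time to see how
close real runs come to the limit.
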
-- Pete
