* [tabled patch 1/1] Stagger the start-daemon
@ 2010-06-30 14:49 Pete Zaitcev
  2010-07-01 20:23 ` Jeff Garzik
  0 siblings, 1 reply; 2+ messages in thread
From: Pete Zaitcev @ 2010-06-30 14:49 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Project Hail List

My rule of thumb is that magic delays are evil or stupid, so I worked on
eliminating them from our scripts. However, in this case it's just not
worth it, because the result is that we have to wait way more than 100s
for several cycles of CLD timeouts to complete, not just one, before we
declare a failure. With this patch, all the builds I submitted to the
Fedora build system completed.

Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>

---
 test/start-daemon      |    4 ++++
 test/wait-for-listen.c |    7 ++-----
 2 files changed, 6 insertions(+), 5 deletions(-)

--- tabled-0.5git/test/start-daemon	2010-03-07 05:54:37.000000000 -0700
+++ tabled-tip/test/start-daemon	2010-05-25 15:02:26.001651210 -0600
@@ -18,6 +18,10 @@ fi
 
 # May be different on Solaris... like /usr/libexec or such.
 cld -d data/cld -P cld.pid -p auto --port-file=cld.port -E
+
+# With great sadness we have to use a delay, or else "100 s" happens.
+sleep 3
+
 chunkd -C $top_srcdir/test/chunkd-test.conf -E
 ../server/tabled -C $top_srcdir/test/tabled-test.conf -E
 
--- tabled-0.5git/test/wait-for-listen.c	2010-04-14 13:49:33.000000000 -0600
+++ tabled-tip/test/wait-for-listen.c	2010-06-17 21:19:18.245883298 -0600
@@ -133,12 +133,9 @@ int main(int argc, char **argv)
  		 * Vote in DB4 replication takes about 12-13s.
 		 * In addition we may have retries when tabled polls for
 		 * Chunk daemons to come up. On busy boxes we may miss 20s.
-		 * So, 25s should be plenty, and we used that for a while,
-		 * but sometimes a daemon can fail establishing a session
-		 * with CLD and a retry takes a minute.
 		 */
-		if (time(NULL) >= start_time + 100) {
-			fprintf(stderr, "server is not up after 100 s\n");
+		if (time(NULL) >= start_time + 25) {
+			fprintf(stderr, "server is not up after 25 s\n");
 			exit(1);
 		}
 

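For readers without the tree at hand, the deadline check that the second
hunk adjusts can be sketched roughly as below. This is a minimal
illustration, not the actual contents of test/wait-for-listen.c: the
names wait_for_ready, ready_fn and ready_after_n_polls are hypothetical,
and the one-second poll interval is an assumption. Only the
"time(NULL) >= start_time + N" comparison and the error message are
taken from the patch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical predicate type: returns true once the server is ready. */
typedef bool (*ready_fn)(void *arg);

/*
 * Poll `ready` until it returns true or `timeout_s` seconds have
 * elapsed since the first check. Returns 0 on success, -1 on timeout.
 * The one-second poll interval is an assumption of this sketch.
 */
static int wait_for_ready(ready_fn ready, void *arg, int timeout_s)
{
	time_t start_time = time(NULL);

	while (!ready(arg)) {
		/* Same shape as the check in the patch, with N = timeout_s. */
		if (time(NULL) >= start_time + timeout_s) {
			fprintf(stderr, "server is not up after %d s\n",
				timeout_s);
			return -1;
		}
		sleep(1);
	}
	return 0;
}

/* Stand-in predicate for illustration: true after *arg failed polls. */
static bool ready_after_n_polls(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

With a 25 s limit, a predicate that succeeds after a couple of polls
returns 0 well before the deadline, while one that never succeeds makes
the caller give up after one timeout rather than sitting through several
CLD retry cycles.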

* Re: [tabled patch 1/1] Stagger the start-daemon
  2010-06-30 14:49 [tabled patch 1/1] Stagger the start-daemon Pete Zaitcev
@ 2010-07-01 20:23 ` Jeff Garzik
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff Garzik @ 2010-07-01 20:23 UTC (permalink / raw)
  To: Pete Zaitcev; +Cc: Project Hail List

On 06/30/2010 10:49 AM, Pete Zaitcev wrote:
> My rule of thumb is that magic delays are evil or stupid, so I worked on
> eliminating them from our scripts. However, in this case it's just not
> worth it, because the result is that we have to wait way more than 100s
> for several cycles of CLD timeouts to complete, not just one, before we
> declare a failure. With this patch, all the builds I submitted to the
> Fedora build system completed.
>
> Signed-off-by: Pete Zaitcev <zaitcev@redhat.com>
>
> ---
>   test/start-daemon      |    4 ++++
>   test/wait-for-listen.c |    7 ++-----
>   2 files changed, 6 insertions(+), 5 deletions(-)

applied

